diff --git a/supporting-blog-content/why-rag-still-matters/dataset.json b/supporting-blog-content/why-rag-still-matters/dataset.json new file mode 100644 index 00000000..f04cd42a --- /dev/null +++ b/supporting-blog-content/why-rag-still-matters/dataset.json @@ -0,0 +1,1820 @@ +[ + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it Lucene BT AL By: Benjamin Trent and Ao Li On February 7, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Yep, another bug fixing blog. But this one has a twist, an open-source hero swoops in and saves the day. Debugging concurrency bugs is no picnic, but we're going to get into it. Enter Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, that turns flaky failures into reliably reproducible ones. Thanks to Fray’s clever shadow lock design and precise thread control, we tracked down a tricky Lucene bug and finally squashed it. This post explores how open-source heroes and tools are making concurrency debugging less painful—and the software world a whole lot better. Software Engineer's Bane Concurrency bugs are the worst. Not only are they difficult to fix, simply getting them to fail reliably is the hardest part. Take this test failure, TestIDVersionPostingsFormat#testGlobalVersions , as an example. It spawns multiple document writing and updating threads, challenging Lucene’s optimistic concurrency model. This test exposed a race condition in the optimistic concurrency control. Meaning, a document operation may falsely claim to be the latest in a sequence of operations 😱. Meaning, in certain conditions, an update or delete operation might actually succeed when it should have failed given optimistic concurrency constraints. Apologies for those who hate Java stack traces. Note, delete doesn’t necessarily mean “delete”. It can also indicate a document “update”, as Lucene’s segments are read-only. Apache Lucene manages each thread that is writing documents through the DocumentsWriter class. This class will create or reuse threads for document writing and each write action controls its information within the DocumentsWriterPerThread (DWPT) class. Additionally, the writer keeps track of what documents are deleted in the DocumentsWriterDeleteQueue (DWDQ). These structures keep all document mutation actions in memory and will periodically flush, freeing up in-memory resources and persisting structures to disk. In an effort to prevent blocking threads and ensuring high throughput in concurrent systems, Apache Lucene tries to only synchronize in very critical sections. While this can be good in practice, like in any concurrent systems, there are dragons. A False Hope My initial investigation pointed me to a couple of critical sections that were not appropriately synchronized. All interactions to a given DocumentsWriterDeleteQueue are controlled by its enclosing DocumentsWriter . So while individual methods may not be appropriately synchronized in the DocumentsWriterDeleteQueue , their access to the world is (or should be). (Let’s not delve into how this muddles ownership and access—it’s a long-lived project written by many contributors. 
Cut it some slack.) However, I found one place during a flush that was not synchronized. These actions aren’t synchronized into a single atomic operation. Meaning, between newQueue being created, and calling getMaxSeqNo , other code could have executed incrementing the sequence number in the documentsWriter class. I found the bug! If only it were that easy. But, as with most complex bugs, finding the root cause wasn't simple. That's when a hero stepped in. A hero in the fray Enter our hero: Ao Li and his colleagues at the PASTA Lab. I will let him explain how they saved the day with Fray. Fray is a deterministic concurrency testing framework developed by researchers at the PASTA Lab , Carnegie Mellon University. The motivation behind building Fray stems from a noticeable gap between academia and industry: while deterministic concurrency testing has been extensively studied in academic research for over 20 years, practitioners continue to rely on stress testing—a method widely acknowledged as unreliable and flaky—to test their concurrent programs. Thus, we wanted to design and implement a deterministic concurrency testing framework with generality and practical applicability as the primary goal. The Core Idea At its heart, Fray leverages a straightforward yet powerful principle: sequential execution. Java’s concurrency model provides a key property —if a program is free of data races, all executions will appear sequentially consistent. This means the program’s behavior can be represented as a sequence of program statements. Fray operates by running the target program in a sequential manner: at each step, it pauses all threads except one, allowing Fray to precisely control thread scheduling. Threads are selected randomly to simulate concurrency, but the choices are recorded for subsequent deterministic replay. To optimize execution, Fray only performs context-switches when a thread is about to execute a synchronizing instruction such as locking or atomic/volatile access. A nice property about data-race freedom is that this limited context switching is sufficient to explore all observable behaviors due to any thread interleaving ( our paper has a proof sketch). The Challenge: Controlling Thread Scheduling While the core idea seems simple, implementing Fray presented significant challenges. To control thread scheduling, Fray must manage the execution of each application thread. At first glance, this might seem straightforward—replacing concurrency primitives with customized implementations. However, concurrency control in the JVM is intricate, involving a mix of bytecode instructions , high-level libraries , and native methods . This turned out to be a rabbit hole: For example, every MONITORENTER instruction must have a corresponding MONITOREXIT in the same method. If Fray replaces MONITORENTER with a method call to a stub/mock, it also needs to replace MONITOREXIT . In code that makes use of object.wait/notify , If MONITORENTER is replaced, the corresponding object.wait must also be replaced. This replacement chain extends to object.notify and beyond. JVM invokes certain concurrency-related methods (e.g., object.notify when a thread ends) within native code. Replacing these operations would require modifying the JVM itself. JVM functions, such as class loaders and garbage collection (GC) threads, also use concurrency primitives. Modifying these primitives can create mismatches with those JVM functions. 
Replacing concurrency primitives in the JDK often results in JVM crashes during its initialization phase. These challenges made it clear that a comprehensive replacement of concurrency primitives was not feasible. Our Solution: Shadow Lock Design To address these challenges, Fray uses a novel shadow lock mechanism to orchestrate thread execution without replacing concurrency primitives. Shadow locks act as intermediaries that guide thread execution. For example, before acquiring a lock, an application thread must interact with its corresponding shadow lock. The shadow lock determines whether the thread can acquire the lock. If the thread cannot proceed, the shadow lock blocks it and allows other threads to execute, avoiding deadlocks and allowing controlled concurrency. This design enables Fray to control thread interleaving transparently while preserving the correctness of concurrency semantics. Each concurrency primitive is carefully modeled within the shadow lock framework to ensure soundness and completeness. More technical details can be found in our paper. Moreover, this design is intended to be future-proof. By requiring only the instrumentation of shadow locks around concurrency primitives, it ensures compatibility with newer versions of JVM. This is feasible because the interfaces of concurrency primitives in the JVM are relatively stable and have remained unchanged for years. Testing Fray After building Fray, the next step was evaluation. Fortunately, many applications, such as Apache Lucene, already include concurrency tests. Such concurrency tests are regular JUnit tests that spawn multiple threads, do some work, then (usually) wait for those threads to finish, and then assert some property. Most of the time, these tests pass because they exercise only one interleaving. Worse yet, some tests only fail occasionally in the CI/CD environment, as described earlier, making these failures extremely difficult to debug. When we executed the same tests with Fray, we uncovered numerous bugs. Notably, Fray rediscovered previously reported bugs that had remained unfixed due to the lack of a reliable reproduction, including this blog’s focus: TestIDVersionPostingsFormat.testGlobalVersions . Luckily, with Fray, we can deterministically replay them and provide developers with detailed information, enabling them to reliably reproduce and fix the issue. Next Steps for Fray We are thrilled to hear from developers at Elastic that Fray has been helpful in debugging concurrency bugs. We will continue to work on Fray to make it available to more developers. Our short-term goals include enhancing Fray’s ability to deterministically replay the schedule, even in the presence of other non-deterministic operations such as a random-value generator or the use of object.hashcode . We also aim to improve the usability of Fray, enabling developers to analyze and debug existing concurrency tests without any manual intervention. Most importantly, if you are facing challenges debugging or testing concurrency issues in your program, we’d love to hear from you. Please don’t hesitate to create an issue in the Fray Github repository . Time to fix the danged thing Thanks to Ao Li and the PASTA lab, we now have a reliably failing instance of this test! We can finally fix this thing. The key issue resided in how DocumentsWriterPerThreadPool allowed for thread and resource reuse. Here we can see each thread being created, referencing the initial delete queue at generation 0. 
Then the queue advance will occur on flush, correctly seeing the previous 7 actions in the queue. But, before all the threads can finish flushing, two are reused for an additional document: These will then increment the seqNo above the assumed maximum, which was calculated during the flush as 7. Note the additional numDocsInRAM for segments _3 and _0 Thus causing Lucene to incorrectly account for the sequence of document actions during a flush and tripping this test failure. Like all good bug fixes, the actual fix is about 10 lines of code . But took two engineers multiple days to actually figure out: Some lines of code take longer to write than others. And even require the help of some new friends. Not all heroes wear capes Yes, it's cliche – but it's true. Concurrent program debugging is incredibly important. These tricky concurrency bugs take an inordinate amount of time to debug and work through. While new languages like Rust have built in mechanisms to help prevent race conditions like this, the majority of software in the world is already written, and written in something other than Rust . Java, even after all these years, is still one of the most used languages. Improving debugging on JVM based languages makes the software engineering world better. And given how some folks think that code will be written by Large Language Models, maybe our jobs as engineers will eventually just be debugging bad LLM code instead of just our own bad code. But, no matter the future of software engineering, concurrent program debugging will remain critical for maintaining and building software. Thank you Ao Li and his colleagues from the PASTA Lab for making it that much better. Report an issue Related content Lucene December 27, 2024 Lucene bug adventures: Fixing a corrupted index exception Sometimes, a single line of code takes days to write. Here, we get a glimpse of an engineer's pain and debugging over multiple days to fix a potential Apache Lucene index corruption. BT By: Benjamin Trent Lucene January 3, 2025 Lucene Wrapped 2024 2024 has been another major year for Apache Lucene. In this blog, we’ll explore the key highlights. CH By: Chris Hegarty Vector Database Lucene June 26, 2024 Elasticsearch vs. OpenSearch: Vector Search Performance Comparison Elasticsearch is out-of-the-box 2x–12x faster than OpenSearch for vector search US By: Ugo Sangiorgi Lucene ML Research November 11, 2023 Understanding scalar quantization in Lucene Explore how Elastic introduced scalar quantization into Lucene, including automatic byte quantization, quantization per segment & performance insights. BT By: Benjamin Trent Vector Database February 6, 2025 A quick introduction to vector search This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. VC By: Valentin Crettaz Jump to Software Engineer's Bane A False Hope A hero in the fray The Core Idea The Challenge: Controlling Thread Scheduling Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Concurrency bugs in Lucene: How to fix optimistic concurrency failures - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/optimistic-concurrency-lucene-debugging", + "meta_description": "Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Elastic Cloud Serverless Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. 
VR MR By: Vishal Raj and Marc Lopez Rubio Elastic Cloud Serverless September 6, 2024 Stateless: Data safety in a stateless world We discuss the data durability guarantees in stateless including how we fence new writes and deletes with a safety check which prevents stale nodes from acknowledging new writes or deletes HA By: Henning Andersen Elastic Cloud Serverless September 3, 2024 Leveraging Kubernetes controller patterns to orchestrate Elastic workloads globally Understand how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless. SG By: Sebastien Guilloux Elastic Cloud Serverless August 8, 2024 Search tier autoscaling in Elasticsearch Serverless Explore search tier autoscaling in Elasticsearch Serverless. Learn how autoscaling works, how the search load is calculated and more. MP JV By: Matteo Piergiovanni and John Verwolf 1 2 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Cloud Serverless - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/elastic-cloud-serverless", + "meta_description": "Elastic Cloud Serverless articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch This article will provide a detailed tutorial on implementing a local, reliable agent using LangGraph, combining concepts from Adaptive RAG, Corrective RAG, and Self-RAG papers, and integrating Langchain, Elasticsearch Vector Store, Tavily AI for web search, and LLaMA3 via Ollama. Generative AI Vector Database Agent How To PR By: Pratik Rana On September 2, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this tutorial we are going to see how we can create a reliable agent using LangGraph, LLaMA3 and Elasticsearch Vector Store from scratch. We will be combining ideas from 3 Advanced RAG papers: Adaptive RAG for Routing : Which directs questions to a vector store or web search based on the content Corrective RAG for Fallback : Using this we will introduce a Fallback retrieval where if a question isn't relevant to the vector store, we will use a web-search instead. Self RAG for Self Correction : Additionally, we will add self-correction to check generations for hallucinations and relevance, and if they're not suitable, we'll fall back to web search again. 
Hence what we are aiming to build is a complex RAG flow and demonstrate its reliability and local execution on our system. Background information What is an LLM agent? An LLM-powered agent can be described as a system that leverages a Large Language Model (LLM) to reason through problems, devise plans to solve them, and execute these plans using a set of tools. In essence, these agents possess complex reasoning abilities, memory, and the means to carry out tasks. Building agents with an LLM as the core controller is an exciting concept. Several proof-of-concept demonstrations, such as AutoGPT, GPT-Engineer, and BabyAGI, serve as inspiring examples. The potential of LLMs extends beyond generating well-written text, stories, essays, and programs; they can be framed as powerful general problem solvers. Agent system overview In an LLM-powered autonomous agent system, the LLM functions as the agent’s brain, complemented by several key components: Planning Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks. Reflection and refinement: The agent engages in self-criticism and self-reflection over past actions, learns from mistakes, and refines future steps, thereby improving the quality of final results. Memory Short-term memory: Serves as a dynamic repository of the agent's current actions and thoughts, akin to its \"train of thought,\" as it endeavors to respond to a user's query in real-time. It allows the agent to maintain a contextual understanding of the ongoing interaction, enabling seamless and coherent communication. Long-term memory: Acts as a comprehensive logbook, chronicling the agent's interactions with users over an extended period, spanning weeks or even months. It captures the history of conversations, preserving valuable context and insights gleaned from past exchanges. This repository of accumulated knowledge enhances the agent's ability to provide personalized and informed responses, drawing upon past experiences to enrich its interactions with users. Hybrid memory: It combines the advantages of both STM and LTM to enhance the agent's cognitive abilities. STM ensures that the agent can quickly access and manipulate recent data, maintaining context within a conversation or task. LTM expands the agent's knowledge base by storing past interactions, learned patterns, and domain-specific information, enabling it to provide more informed responses and make better decisions over time. Tool use In the context of LLM (Large Language Model) agents, tools refer to external resources, services, or APIs (Application Programming Interfaces) that the agent can utilize to perform specific tasks or enhance its capabilities. These tools serve as supplementary components that extend the functionality of the LLM agent beyond its inherent language generation capabilities. Tools could also include databases, knowledge bases, and external models. As an illustration, agents can employ a RAG pipeline for producing contextually relevant responses, a code interpreter for addressing programming challenges, an API for conducting internet searches, or even utilize straightforward API services such as those for weather updates or instant messaging applications. Types of LLM agents and use cases Conversational agents : Engage users in natural language dialogues to provide information, answer questions, and assist with tasks. They utilize LLMs to generate human-like responses. 
Task-oriented agents : Focus on completing specific tasks or objectives by understanding user needs and executing relevant actions. Examples include virtual assistants and automation tools. Creative agents : Generate original content such as artwork, music, or writing. They use LLMs to understand human preferences and artistic styles, producing content that appeals to audiences. Collaborative agents : Work with humans to achieve shared goals by facilitating communication and cooperation. LLMs help these agents assist in decision-making, report generation, and providing insights. Approach: ReAct/Langchain agent vs LangGraph ? Now, let's consider using an agent to build a corrective RAG (Retrieval-Augmented Generation) system, represented by that middle blue component that can bee seen in the diagram above. When people think about agents, they often mention \"ReAct\"—a popular framework for building agents (not to be confused with the React.js framework). The typical flow in a ReAct agent looks like this: The LLM (Language Learning Model) plans by selecting an action, observing the result, reflecting on it, and then choosing the next action. ReAct agents usually leverage memories, such as chat history or a vector store, and can utilize various tools. If we were to implement this flow as a ReAct agent, it would look something like this: The agent would receive a question and perform an action, such as using its vector store to retrieve relevant documents. It would then observe the retrieved documents and decide to grade them. The agent would go back to its action phase and select the grading tool. This process would repeat in a loop, following a defined trajectory until the task is complete. This is how ReAct-based agents typically function. However, this approach can be quite complex and involve a lot of decision-making. Instead, we’ll use a different method to implement this system. Rather than having the agent make decisions at every step in the loop, we’ll define a \"control flow\" in advance. As engineers, we can lay out the exact sequence of steps we want our agent to follow each time it runs, effectively taking the planning responsibility away from the LLM. This predefined control flow allows the LLM to focus on specific tasks within each step. In terms of memory, we can use what’s called a \"graph state\" to persist information across the control flow, making it relevant to the RAG process (e.g., documents and questions). For tool usage, each graph node can utilize a different tool: the Vectorstore Retrieval node (depicted in grey) will use a retriever tool, the Grade Documents node (depicted in blue) will use a grading tool, and the Web Search node (depicted in red) will use a web search tool: This method simplifies decision-making for the LLM, making the system more reliable, especially when using smaller LLMs. Prerequisites Before diving into the code, we need to set up the necessary tools: Elasticsearch : In this tutorial, we’ll use Elasticsearch as our data store because it offers more than just a vector database for a superior search experience. Elasticsearch provides a complete vector database, multiple retrieval methods (text, sparse and dense vector, hybrid), and the flexibility to choose your machine learning model architectures. There’s a reason it’s the world’s most downloaded database! To follow along, you’ll need to deploy an Elasticsearch Cluster, which can be done in under 3 minutes as part of our 14-day free trial (no credit card required). 
Get started by clicking here . Ollama : Ollama is a platform that simplifies local development with open-source large language models (LLMs). It packages everything you need to run an LLM—model weights and configurations—into a single Modelfile, similar to how Docker works for containers. You can download Ollama for your machine by clicking here . Just one small thing here to note is that llama3 comes with a particular prompt format , which one needs to pay attention to. After installation, verify it by running the following command: Next, install the llama3 model, which will serve as our local LLM for this tutorial: Tavily search : Tavily's Search API is a specialized search engine designed for AI agents (LLMs), providing real-time, accurate, and factual results with impressive speed. To use this API in your tutorial, you'll need to sign up on the Tavily platform and obtain an API key. The good news is that this powerful tool is free to use. You can get started by clicking here . Great!! So now that your environment is ready, we can move on to the fun part—writing our Python code! Python code 1. Install required packages : To begin, install all the necessary packages by running the following command: 2. Set up the local LLM and the Tavily search API : After the installation is complete, set the variable local_llm to \"llama3\" . This will define the local LLM you’ll be using in this tutorial. Feel free to change this parameter later if you want to experiment with other local LLMs on your system, and also define the Tavily Search API key obtained in the Prerequisites in your environment variable like below: 1. Indexing First we will need to load, process, and index our targeted data into our Vector Store. In this tutorial we will be indexing documents from these respective Blog posts: \" https://lilianweng.github.io/posts/2023-06-23-agent/ \", \" https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/ \", \" https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/ \", into our vector store, which will then act as a data source for our RAG implementation, as the index is the key component of our RAG flow without which we won't be able to retrieve the documents. Code description: A list of URLs is defined, pointing to three different blog posts on Lilian Weng's website. The content from each URL is loaded using WebBaseLoader , and the result is stored in the docs list. The loaded documents are stored as a list of lists (each containing one or more documents). These lists are flattened into a single list using a list comprehension. The RecursiveCharacterTextSplitter is initialized with a specific chunk size (250 characters) and no overlap. This is used to split the documents into smaller chunks. The split chunks are stored in the documents variable. An instance of NomicEmbeddings is created to generate embeddings for the document chunks. The model used is specified as \"nomic-embed-text-v1.5\" , and inference is done locally. The documents, along with their embeddings, are stored in an Elasticsearch database. The connection details (URL, username, password) and the index name are provided. Finally, a retriever object is created from the Elasticsearch database, which can be used to query and retrieve documents based on their embeddings. 2. Retrieval grader Once we index our respective documents into the data store, we will need to create a grader that evaluates the relevance of our retrieved document to a given user question. 
Now this is where llama3 comes in, I set my local_llm to llama3 and llama has \"json\" mode which confirm the output from LLM is also json, so my prompt basically says grade a document and return a json with score yes / no Code description: LLM initialization: The ChatOllama model is instantiated with a specific configuration. The model is set to output responses in JSON format with a temperature of 0, meaning the output is deterministic (no randomness). Prompt template: A PromptTemplate is defined, which sets up the instructions that will be sent to the LLM. This prompt instructs the LLM to act as a grader that assesses whether a retrieved document is relevant to a user’s question. The grader’s task is simple: if the document contains keywords related to the user question, it should return a binary score ( yes or no ) indicating relevance. The response is expected in a JSON format with a single key score . Retrieval grader pipeline: The retrieval_grader is created by chaining the prompt , llm , and JsonOutputParser together. This forms a pipeline where the user’s question and the document are first formatted by the PromptTemplate , then processed by the LLM, and finally, the output is parsed by JsonOutputParser . Example usage: A sample question (\"agent memory\") is defined. The retriever.invoke(question) method is used to fetch documents related to the question. The content of the second retrieved document ( docs[1] ) is extracted. The retrieval_grader pipeline is then invoked with the question and document as inputs. The output is the JSON-formatted binary score indicating whether the document is relevant. 3. Generator Moving on we need to script a code that can generate a concise answer to user's question using context from retrieved documents. Code description: prompt : This is a PromptTemplate object that defines the structure of the prompt sent to the language model (LLM). The prompt instructs the LLM to act as an assistant for answering questions. The LLM is provided with a question and context (retrieved documents) and is instructed to generate a concise answer in three sentences or fewer. If the LLM doesn't know the answer, it is told to simply say that it doesn't know. llm : This initializes the LLM using the ChatOllama model with a temperature of 0, which ensures that the output is more deterministic and less random. format_docs(docs) : This function takes a list of document objects and concatenates their content ( page_content ) into a single string, with each document's content separated by a double newline ( \\n\\n ). This formatted string is then used as the context in the prompt. rag_chain : This creates a processing chain that combines the prompt , the LLM ( llm ), and the StrOutputParser . The prompt is filled with the question and context , sent to the LLM for processing, and the output is parsed into a string using StrOutputParser . Running the chain : question : The user's question, in this case, \"agent memory.\" docs : A list of documents retrieved using the retriever.invoke(question) function, which retrieves documents relevant to the question . format_docs(docs) : Formats the retrieved documents into a single string of context, separated by double newlines. rag_chain.invoke({\"context\": format_docs(docs), \"question\": question}) : This line executes the chain. It passes the formatted context and question into the rag_chain , which processes the input through the LLM and returns the generated answer. 
print(generation) : Outputs the generated answer to the console. 4. Hallucination grader and answer grader This code snippet defines two separate graders—one for assessing hallucination in a generated answer and another for evaluating the usefulness of the answer in resolving a question. Both graders use a language model (LLM) to provide binary scores (\"yes\" or \"no\") based on specific criteria Hallucination grader Code description: LLM Initialization: llm : Initializes the ChatOllama language model with a JSON output format and a temperature of 0, making the model's output deterministic. Prompt Creation: prompt : A PromptTemplate is created to define the structure of the prompt sent to the LLM. The prompt instructs the LLM to assess whether a given answer ( generation ) is grounded in or supported by a set of facts ( documents ). The model is instructed to output a binary score ( \"yes\" or \"no\" ) in JSON format, indicating whether the answer is factual according to the provided documents. Hallucination grader setup: hallucination_grader : This is a pipeline combining the prompt , the LLM, and the JsonOutputParser . The prompt is filled with the input variables ( generation and documents ), processed by the LLM, and the output is parsed into a JSON format by JsonOutputParser . Running the hallucination grader: hallucination_grader.invoke(...) : Executes the hallucination grader by passing in the documents (facts) and the generation (the answer being assessed). The LLM then evaluates whether the answer is grounded in the provided facts and returns a binary score in JSON format. Answer grader Code description: LLM initialization: llm : Similar to the hallucination grader, this initializes the ChatOllama model with the same settings for deterministic output. Prompt creation: prompt : A PromptTemplate is created for evaluating the usefulness of an answer. This prompt instructs the LLM to assess whether a given answer ( generation ) is useful in resolving a specific question ( question ). Again, the LLM outputs a binary score ( \"yes\" or \"no\" ) in JSON format, indicating whether the answer is useful. Answer grader setup: answer_grader : This pipeline combines the prompt , the LLM, and the JsonOutputParser , similar to the hallucination grader. Running the answer grader: answer_grader.invoke(...) : Executes the answer grader by passing in the question and generation (the answer being evaluated). The LLM assesses whether the answer is useful in resolving the question and returns a binary score in JSON format. 5. Router This code snippet defines a \"Router\" system designed to determine whether a user’s question should be directed to a vectorstore or a web search for further information retrieval. Here’s a detailed explanation of each par: Code description: LLM initialization: llm : Initializes the ChatOllama language model with a JSON output format and a temperature of 0, ensuring deterministic (non-random) results from the model. Prompt creation: prompt : A PromptTemplate is created to define the structure of the input prompt sent to the LLM. This prompt instructs the LLM to act as an expert in routing user questions to the appropriate datasource: either a vectorstore or a web search. The decision is based on the content of the question: If the question relates to topics like \"LLM agents,\" \"prompt engineering,\" or \"adversarial attacks,\" it should be routed to a vectorstore. Otherwise, the question should be routed to a web search. 
The LLM is instructed to return a binary choice: either \"vectorstore\" or \"web_search\" . The response should be in JSON format with a single key \"datasource\" . Router setup: question_router : This is a processing chain that combines the prompt , the LLM, and the JsonOutputParser . The prompt is populated with the question, processed by the LLM to make the routing decision, and the output is parsed into JSON format by the JsonOutputParser . Running the Router: question : The user's query, in this case, \"llm agent memory.\" docs : A list of documents retrieved using the retriever.get_relevant_documents(question) function, which fetches documents relevant to the question. This part of the code appears to retrieve documents but is not directly involved in the routing decision. question_router.invoke({\"question\": question}) : This line executes the router. The question is passed to the question_router , which processes it through the LLM and returns a JSON object with a key \"datasource\" indicating whether the question should be routed to a \"vectorstore\" or \"web_search\" . print(question_router.invoke(...)) : Outputs the routing decision (either \"vectorstore\" or \"web_search\" ) to the console. 6. Web search The code sets up a web search tool that can be used to query the web and retrieve a limited number of search results (in this case, 3). This is useful in scenarios where you want to integrate external web search capabilities into a system, enabling it to fetch information from the internet and use that information for further processing or decision-making. Code description: Imports: TavilySearchResults : This is a class imported from the langchain_community.tools.tavily_search module. It is used to perform web searches and retrieve search results. Web Search Tool Initialization: web_search_tool : This variable is an instance of the TavilySearchResults class. It represents a tool configured to perform web searches. k=3 : This parameter specifies that the tool should return the top 3 search results for any given query. The k value determines how many results are fetched and processed by the search tool. 7. Control flow This code defines a stateful, graph-based workflow for processing user queries. It retrieves documents, generates answers, grades relevance, and routes the process based on the current state. This system is highly modular, allowing each step in the process to be independently defined and controlled, making it flexible and scalable for various use cases involving document retrieval, question answering, and ensuring the quality and relevance of generated content. State definition GraphState : A TypedDict that defines the structure of the state that the graph will manage. It includes: question : The user's query. generation : The answer generated by the LLM. web_search : A flag indicating whether a web search should be added. documents : A list of documents retrieved during the process. Node functions Each of the following functions represents a node in the graph, performing a specific task within the workflow. retrieve(state) Purpose : Retrieves documents from a vectorstore based on the user's question. Returns : Updates the state with the retrieved documents. generate(state) Purpose : Generates an answer using a Retrieval-Augmented Generation (RAG) model on the retrieved documents. Returns : Updates the state with the generated answer. 
grade_documents(state) Purpose : Grades the relevance of each retrieved document to the question and filters out irrelevant documents. If any document is irrelevant, it sets a flag to indicate that a web search is needed. Returns : Updates the state with the filtered documents and the web search flag. web_search(state) Purpose : Conducts a web search based on the user's question and appends the results to the list of documents. Returns : Updates the state with the web search results. Conditional edges These functions determine the next step in the workflow based on the current state. route_question(state) Purpose : Routes the question to either a web search or vectorstore retrieval based on its content. Returns : The next node to execute, either \"websearch\" or \"vectorstore\" . decide_to_generate(state) Purpose : Decides whether to generate an answer or perform a web search based on the relevance of the graded documents. Returns : The next node to execute, either \"websearch\" or \"generate\" . grade_generation_v_documents_and_question(state) Purpose : Grades the generated answer for hallucinations (whether it is grounded in the provided documents) and checks if the answer addresses the user's question. Returns : The next node to execute, based on whether the answer is grounded and useful. Workflow definition StateGraph : Initializes a graph that will manage the state transitions. add_node : Adds the nodes (functions) to the graph, associating each node with a name that can be used to call it in the workflow. 8. Build graph This code builds the logic and flow of the stateful workflow using a state graph. It determines how the process should move from one node (operation) to the next based on the conditions and results at each step. The workflow starts by deciding whether to retrieve documents from a vectorstore or perform a web search based on the user's question. It then assesses the relevance of the retrieved documents, deciding whether to generate an answer or conduct further web searches if the documents aren't relevant. Finally, it generates an answer and checks whether it is well-supported and useful, repeating steps or ending the workflow based on the outcome. This structure ensures that the workflow is dynamic, able to adjust based on the results at each stage, and ultimately aims to produce a well-supported and relevant answer to the user's question. Code description: Set the Conditional Entry Point set_conditional_entry_point : This method sets the starting point of the workflow based on a conditional decision. route_question : The function that determines whether the question should be routed to a web search or a vectorstore retrieval. \"websearch\": \"websearch\" : If route_question decides that the question should be routed to a web search, the workflow starts with the websearch node. \"vectorstore\": \"retrieve\" : If route_question decides that the question should be routed to the vectorstore, the workflow starts with the retrieve node. Add an Edge Between Nodes add_edge : This method creates a direct transition from one node to another in the workflow. \"retrieve\" -> \"grade_documents\" : After the documents are retrieved in the retrieve node, the workflow moves to the grade_documents node, where the retrieved documents are assessed for relevance. Add conditional edges add_conditional_edges : This method creates conditional transitions between nodes based on the result of a decision function. 
\"grade_documents\" : The node where the relevance of retrieved documents is assessed. decide_to_generate : The function that decides the next step based on the relevance of the documents. \"websearch\": \"websearch\" : If decide_to_generate determines that a web search is necessary (because the documents are not relevant), the workflow transitions to the websearch node. \"generate\": \"generate\" : If the documents are relevant, the workflow transitions to the generate node, where an answer is generated using the documents. Add an edge between nodes \"websearch\" -> \"generate\" : After performing a web search, the workflow moves to the generate node to generate an answer using the results from the web search. Add conditional edges for final decision \"generate\" : The node where an answer is generated using the documents (retrieved or from the web search). grade_generation_v_documents_and_question : The function that checks whether the generated answer is grounded in the documents and relevant to the question. \"not supported\": \"generate\" : If the generated answer is not well-supported by the documents, the workflow loops back to the generate node to attempt generating a better answer. \"useful\": END : If the generated answer is both grounded in the documents and addresses the question, the workflow ends ( END ). \"not useful\": \"websearch\" : If the generated answer is grounded in the documents but does not address the question adequately, the workflow transitions back to the websearch node to gather more information and try again. All done !! Now that our implementation is complete, let's test the graph by compiling and executing it as a whole, the good thins is this will also print out the steps as we go: Test 1 : Lets write a question which is relevant to the Blog Posts with respect to which we created our index in the data store? Test 2 : Lets write another question related to current affairs i.e completely out of context to the data that we indexed from the blog posts ? What do you see in the output of both these tests? For Test 1 The output shows the step-by-step execution of the workflow and the decisions made at each stage: Routing the question : Output : ---ROUTE QUESTION--- Question : \"What is agent memory?\" Decision : The workflow determines that the question should be routed to the vectorstore based on the question's content. Result : {'datasource': 'vectorstore'} and ---ROUTE QUESTION TO RAG--- . Retrieving documents : Output : ---RETRIEVE--- The workflow retrieves documents related to the question from the vectorstore. Grading document relevance : Output : ---CHECK DOCUMENT RELEVANCE TO QUESTION--- The workflow grades each retrieved document to determine if it is relevant to the question. Results : All retrieved documents are graded as relevant ( ---GRADE: DOCUMENT RELEVANT--- repeated four times). Deciding to generate an answer : Output : ---ASSESS GRADED DOCUMENTS--- Since the documents are relevant, the workflow decides to proceed with generating an answer ( ---DECISION: GENERATE--- ). Generating the answer : Output : ---GENERATE--- The workflow generates an answer using the relevant documents. Checking for hallucinations : Output : ---CHECK HALLUCINATIONS--- The workflow checks if the generated answer is grounded in the documents. Result : The answer is grounded ( ---DECISION: GENERATION IS GROUNDED IN DOCUMENTS--- ). 
Grading the answer against the question : Output : ---GRADE GENERATION vs QUESTION--- The workflow evaluates whether the generated answer addresses the question. Result : The answer is useful ( {'score': 'yes'} and ---DECISION: GENERATION ADDRESSES QUESTION--- ). Final output : Output : 'Finished running: generate:' Generated answer : For Test 2 This output follows the same workflow as the previous example but with a different question related to the NBA draft and the LA Lakers. Here's a breakdown of what happened during this run: Routing the question : Output : ---ROUTE QUESTION--- Question : \"Who are the LA Lakers expected to draft first in the NBA draft?\" Decision : The workflow determines that the question should be routed to a web search ( 'datasource': 'web_search' ), as it likely requires up-to-date information that isn't stored in the vectorstore. Result : web_search and ---ROUTE QUESTION TO WEB SEARCH--- . Web search : Output : ---WEB SEARCH--- The workflow performs a web search to gather the most current and relevant information regarding the Lakers' draft picks. Result : 'Finished running: websearch:' indicates that the web search step is complete. Generating the answer : Output : ---GENERATE--- Using the information retrieved from the web search, the workflow generates an answer to the question. Checking for hallucinations : Output : ---CHECK HALLUCINATIONS--- The workflow checks if the generated answer is grounded in the retrieved web search documents. Result : The answer is well-supported ( ---DECISION: GENERATION IS GROUNDED IN DOCUMENTS--- ). Grading the answer against the question : Output : ---GRADE GENERATION vs QUESTION--- The workflow evaluates whether the generated answer directly addresses the question. Result : The answer is useful and relevant ( {'score': 'yes'} and ---DECISION: GENERATION ADDRESSES QUESTION--- ). Final output : Output : 'Finished running: generate:' Generated Answer : Key points of the workflow for test 1 vs test 2 Routing to web search : The workflow correctly identified that the question needed current information, so it directed the query to a web search rather than a vectorstore. Answer generation : The workflow successfully used the latest information from the web to generate a coherent and relevant response about the Lakers' expected draft pick. Grounded and useful answer : The workflow validated that the generated answer was both grounded in the search results and directly addressed the question. Conclusion In a relatively short amount of time, we've managed to build a sophisticated Retrieval-Augmented Generation (RAG) workflow that includes routing, retrieval, grading, and various decision points such as fallback to web search and dual-criteria grading of generated content. What’s particularly impressive is that this complex RAG flow, incorporating concepts from multiple research papers, can run reliably on a local machine. The key to achieving this lies in the well-defined control flow, which ensures that the local agent operates smoothly and effectively. We encourage you to experiment with different queries and implementations, as this approach provides a powerful foundation for creating more advanced RAG agents. Hopefully, this serves as a useful guide for developing your own RAG workflows. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to Background information What is an LLM agent? Agent system overview Planning Memory Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/local-rag-agent-elasticsearch-langgraph-llama3", + "meta_description": "Create a reliable agent using LangGraph, LLaMA3 & Elasticsearch. Follow this LangGraph tutorial to implement agents combining concepts from RAG." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. Vector Database Generative AI Elastic Cloud Serverless Python How To QP By: Quentin Pradet On October 4, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. This blog will show you how to use Eland to import machine learning models to Elasticsearch Serverless, and then how to explore Elasticsearch using a Pandas-like API. NLP in Elasticsearch Serverless & Eland Since Elasticsearch 8.0 , it is possible to use NLP machine learning models directly from Elasticsearch. While some models such as ELSER (for English data) or E5 (for multilingual data) can be deployed directly from Kibana, all other compatible PyTorch models need to be uploaded using Eland. Since Eland 8.14.0, eland_import_hub_model fully supports Serverless. 
To get the connection details, open your Serverless project in Kibana, select the \"cURL\" client, create an API key, and export the environment variables: You can then use those variables when running eland_import_hub_model : Next, search for \"Trained Models\" in Kibana, which will offer to synchronize your trained models. Once done, you will get the option to deploy your model: Less than a minute later, your model should be deployed and you'll be able to test it directly from Kibana. In this test sentence, the model successfully identified Joe as \"Person\" and \"Reunion Island\" as a location, with high probability. For more details on using Eland for machine learning models (including scikit-learn, XGBoost and LightGBM, not covered here), consider reading the detailed Accessing machine learning models in Elastic blog post and referring to the Eland documentation . Data frames in Eland The other main functionality of Eland is exploring Elasticsearch data using a Pandas-like API. Ingesting test data Let's first index some test data to Elasticsearch. We'll use a fake flights dataset. While uploading using the Python Elasticsearch client is possible, in this post we'll use Kibana's file upload functionality instead, which is enough for quick tests. First, download the dataset https://github.com/elastic/eland/blob/main/tests/flights.json.gz and decompress it ( gunzip flights.json.gz ). Next, type \"File Upload\" in Kibana's search bar and import the flights.json file. Kibana will show you the resulting fields, with \"Cancelled\" detected as a boolean, for example. Click on \"Import\". On the next screen, choose \"flights\" for the index name and click \"Import\" again. As in the screenshot below, you should see that the 13059 documents were successfully ingested in the \"flights\" index. Connecting to Elasticsearch Now that we have data to search, let's setup the Elasticsearch Serverless Python client. (While we could use the main client, the Serverless Elasticsearch Python client is usually easier to use, as it only supports Elasticsearch Serverless features and APIs.) From the Kibana home page, you can select Python which will explain how to install the Elasticsearch Serverless Python client, create an API key, and use it in your code. You should end up with this code: Searching data with Eland Finally, assuming that the above code worked, we can start using Eland. After having installed it with python -m pip install eland>=8.14 , we can start exploring our flights dataset. If you run this code in a notebook, the result will be the following table: AvgTicketPrice Cancelled Carrier Dest DestAirportID DestCityName DestCountry DestLocation.lat DestLocation.lon DestRegion ... Origin OriginAirportID OriginCityName OriginCountry OriginLocation.lat OriginLocation.lon OriginRegion OriginWeather dayOfWeek timestamp 882.982662 False Logstash Airways Venice Marco Polo Airport VE05 Venice IT 45.505299 12.3519 IT-34 ... Cape Town International Airport CPT Cape Town ZA -33.96480179 18.60169983 SE-BD Clear 0 2018-01-01T18:27:00 730.041778 False Kibana Airlines Xi'an Xianyang International Airport XIY Xi'an CN 34.447102 108.751999 SE-BD ... Licenciado Benito Juarez International Airport AICM Mexico City MX 19.4363 -99.072098 MX-DIF Damaging Wind 0 2018-01-01T05:13:00 841.265642 False Kibana Airlines Sydney Kingsford Smith International Airport SYD Sydney AU -33.94609833 151.177002 SE-BD ... 
Frankfurt am Main Airport FRA Frankfurt am Main DE 50.033333 8.570556 DE-HE Sunny 0 2018-01-01T00:00:00 181.694216 True Kibana Airlines Treviso-Sant'Angelo Airport TV01 Treviso IT 45.648399 12.1944 IT-34 ... Naples International Airport NA01 Naples IT 40.886002 14.2908 IT-72 Thunder & Lightning 0 2018-01-01T10:33:28 552.917371 False Logstash Airways Luis Munoz Marin International Airport SJU San Juan PR 18.43939972 -66.00180054 PR-U-A ... Ciampino___G. B. Pastine International Airport RM12 Rome IT 41.7994 12.5949 IT-62 Cloudy 0 2018-01-01T17:42:53 You can also run more complex queries such as aggregations: which outputs the following: DistanceKilometers AvgTicketPrice sum 9.261629e+07 8.204365e+06 min 0.000000e+00 1.000205e+02 std 4.578614e+03 2.664071e+02 The demo notebook in the documentation has many more examples that use the same dataset and the reference documentation lists all supported operations. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to NLP in Elasticsearch Serverless & Eland Data frames in Eland Ingesting test data Connecting to Elasticsearch Searching data with Eland Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using Eland on Elasticsearch Serverless - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/eland-elasticsearch-serverless", + "meta_description": "Learn how to use Eland on Elasticsearch Serverless: import machine learning models using eland_import_hub_model and easily search data with Eland." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog LangChain and Elasticsearch: Building LangGraph retrieval agent template Elasticsearch and LangChain collaborate on a new retrieval agent template for LangGraph for agentic apps Generative AI Agent Integrations JM AT SC By: Joe McElroy , Aditya Tripathi and Serena Chou On September 20, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The new LangGraph retrieval agent template is designed to simplify the development of Generative AI (GenAI) agentic applications that require agents to use Elasticsearch for agentic retrieval. This template comes pre-configured to use Elasticsearch, allowing developers to build agents with LangChain and Elasticsearch quickly. To get started right away, access the project on Github: https://github.com/langchain-ai/retrieval-agent-template What is LangGraph? LangGraph helps developers build stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. There are a few new concepts to learn, like cycles, branching, and persistence – these allow developers to implement loops, conditions, and error handling mechanisms in applications. This makes LangGraph a great choice for creating complex workflows, where agents can pause for user input or correction. For more details you can check the Intro to LangGraph course on LangChain Academy. The new Retrieval Agent Template focuses on question-answering tasks by leveraging knowledge retrieval with Elasticsearch. Users can set up agents capable of retrieving relevant information based on natural language queries. The template provides an easy, configurable interface to Elasticsearch, making it a great starting point for developers looking to build search retrieval-based agents​. About LangGraph’s default Elasticsearch template Elasticsearch Vector Database Capabilities: The template leverages Elasticsearch’s Vector Storage and Search capabilities to enable more precise and relevant knowledge retrieval. Retrieval Agent Capability: This enables an agent to use Retrieval-Augmented Generation (RAG), helping Large Language Models (LLMs) provide more accurate and context-rich answers by retrieving the most relevant information from data stored within Elasticsearch. Integration with LangGraph Studio : With LangGraph Studio, developers can better understand and build complex agentic applications. It provides intuitive visualization and debugging tools in a user-friendly interface, making it easier to develop, optimize, and troubleshoot AI applications. Start building with LangGraph retrieval agent template Elastic and LangChain are excited to give developers a headstart building the next generation of intelligent, knowledge-driven AI agents using this template. Access the retrieval agent template on GitHub , or visit Search Labs for cookbooks using Elasticsearch and LangChain. Happy searching agenting! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. 
EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Jump to What is LangGraph? About LangGraph’s default Elasticsearch template Start building with LangGraph retrieval agent template Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "LangChain and Elasticsearch: Building LangGraph retrieval agent template - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/langchain-langgraph-retrieval-agent-template", + "meta_description": "Explore LangGraph retrieval agent template, which simplifies the development of GenAI agentic apps that require agents to use Elasticsearch for agentic retrieval." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. Elastic Cloud Serverless Ingestion VR MR By: Vishal Raj and Marc Lopez Rubio On September 20, 2024 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Sending Observability data to Elastic Cloud has become increasingly easier with the introduction of Serverless. 
You can create an Observability serverless project in Elastic Cloud Serverless , and send observability signals directly to Elasticsearch or through what we’ve been calling the Managed Intake Service, a fully-managed solution akin to the Elastic APM Server. Gone are the days where you needed to worry about sizing and configuring different aspects of the APM Server. Instead, you can send data directly to the Managed Intake Service and let our platform do the hard scaling work for you. In this post, we’ll share how we iterated over the existing APM Server architecture to create the Managed Intake Service. APM Server primer APM Server was designed as a single-tenant process that receives and enriches raw observability signals (traces, logs and metrics) and indexes them to Elasticsearch. It also produces multi-interval (1m, 10m and 60m intervals) aggregated metrics (roll ups) based on the received signals to efficiently power the Observability APM UI. More on why aggregations are needed can be perused here . While the current APM Server offering in Elastic Cloud works well for use cases with predictable or known load, it has a few limitations that we aimed to improve in the first iteration of the next-generation ingest service: APM Server doesn’t automatically scale with the incoming load, requiring users to make upfront decisions based on estimated or actual maximum ingestion load. Each APM Server instance should ideally generate 1 aggregated metric document for each unique set of aggregated dimensions. However, with the current model the number of metrics are directly correlated with the number of replicas for APM Server, which impacts the effectiveness for aggregation. Aggregations are performed in-memory, requiring increasing amounts of memory as the number of unique combinations increases. Ingestion of observability data in APM Server and indexing of the data to Elasticsearch are tightly coupled. Any buffer between these 2 processes are memory based, thus, if Elasticsearch is in the process of scaling up or under-provisioned, the push-back could eventually cause ingestion to grind to a halt, returning errors to clients. Keep these points in mind as we iterate through the different implementations we tried. APM Server as a managed product When we started, we asked ourselves, what is the simplest approach we could take? How well would that work? Does it satisfy the previous requirements? That simplest approach would be to provision a single APM Server for each project and use horizontal autoscaling on each project's APM Server. However, that results in an enormous amount of wasted compute resources for observability projects that do not ingest observability signals through the APM Server. Additionally, it didn’t address any of the initial limitations, or improve the overall APM Server experience in Elastic Cloud. It became clear we wanted (and needed) to experiment with a multi-tenant approach. Again, we looked at the simplest approach we could take to shorten the feedback loop as much as possible. Multi Tenant APM Server Our logical next step was to extend the current APM Server codebase. One of the requirements for the multi-tenant APM Server was to be able to distinguish between different tenants and route to the appropriate Elasticsearch instance for that tenant. We came up with a consistent hash ring load balancer to route to the same APM Server for the same tenant. 
This would satisfy our bounded aggregation requirement (2) to avoid generating multiple aggregation documents for the same unique set of dimensions for an event. However, as we continued designing the remainder of the multi-tenant APM Server, it looked a lot like multiple services rolled up in one box (or a distributed monolith, something we’d want to avoid). Additionally, the memory requirements looked quite large to avoid running out of memory, and perform reasonably well. After the initial prototype for feasibility, it became clear that iterating on the existing APM Server would not meet our expectations. We went back to the drawing board with the goal to design a multi-tenant, distributed system from the ground up in order to achieve the expected level of reliability, scalability, and availability. Break down the monolith Back to the drawing board! This time, we decided to break the APM Server into smaller services according to the bounded context they belonged to. Using this classification, the 3 main APM Server responsibilities are pretty obvious: Ingest signals and enrich them. Generate aggregated metric representations for known time periods. Index the raw signals and the aggregated metrics once their time period has ended. Another advantage of breaking apart the APM Server into smaller services is that it allows independent scaling of each service, based on the specific demand(s) of the service. This translates into better resource utilization, simplified reasoning and maintenance of the service. Ingestion Service As the name indicates, the purpose of the ingestion service is to ingest the signals from various agents, like Elastic or OTel clients. The ingestion service also performs simple data enrichment to get the most out of the telemetry signals. The scalability requirement of the ingestion service is directly dependent on the number of clients sending data and the data volume. In addition to ingestion, the service performs distributed rate limiting for each customer sending data to the service. Rate limiting prevents sudden bursts from overwhelming the data processing pipeline. Aggregation Service Aggregations, or data roll-ups, are an essential component of the Elastic observability stack. Rolling-up metrics allows Kibana UIs to display telemetry signals for services more efficiently, allowing you to change that time range from minutes to days or years without incurring significant query performance degradation. In essence, it reduces the total number of documents that Elasticsearch has to aggregate, not dissimilar to materialized views in SQL/Database-land. Traditional APM Server performed in-memory aggregations, however, a memory based approach would be insufficient for a multi-project service with auto-scaling capabilities. Also, in-memory aggregation limits didn’t behave optimally since each aggregation type had individual limits to avoid out of memory issues. Since we wanted to solve both these problems at the same time (and after some initial experimenting with persistence implementations in the aggregation flow), we settled on a Log-Structured Merge(LSM)-tree approach using a key-value database, Pebble . This effort eventually materialized in apm-aggregation , a library to perform aggregations that are mostly constrained by disk size, with much smaller memory requirements. LSM-based aggregations were also released in APM Server from 8.10 onwards. We intentionally kept the library open to share the same improvements for self-managed and hosted APM Server.
Index Service The indexing process buffers tens, hundreds, or thousands of events, and sends these in batches to Elasticsearch using the _bulk API. While inherently simple, there are some complexities to the process and it required a lot of engineering effort to get right: Data must be reliably indexed into Elasticsearch. There are two major scenarios where retries are necessary to avoid data loss: a. Temporary _bulk request rejections (the entire _bulk request was rejected by Elasticsearch because it can’t service it). b. Temporary individual document rejections (the _bulk requests succeeded, but one or more documents have not been indexed). Indexing must be fair but also proportional to the data volume for different tenants. The first sub-point (1a) was correctly addressed by the go-elasticsearch client’s built-in request retries on specific HTTP status codes . Retrying individual document rejections required a bit more engineering effort and led us to implement document-level retries in the shared indexing component ( go-docappender ) for both APM Server and the index process in the Managed Intake Service. The second point is addressed by the fundamental sharded architecture and the transport that is used between the Managed Intake Service components. In short, each project has a dedicated number of indexing clients to ensure a baseline quality of service. Glue it back together to create the Managed Intake Service While breaking down the service got us closer to our goal, we still had to decide how services were going to communicate and how data was going to flow from one service to another. Traditionally, most microservices communicate using simple HTTP/RPC-based request/response schemes. However, that requires services to always be up, or assume temporary unavailability, where unhealthy status codes are retried by the application, or using something like a service mesh to route requests to the appropriate application instance and potentially rely on status codes to allow the service mesh to transparently handle retries. While we considered this approach, it seemed unnecessarily complicated and brittle once you start considering different failure modes. We researched event processing systems and, unsurprisingly, we started considering the idea of using an event bus or queue as means of communication. Using an event bus instead of a synchronous RPC-based communication system has a lot of advantages for our use case. The main advantage is that it decouples producers and consumers (producers generate data, while consumers receive it and process it). This decoupling is incredibly advantageous for reliability and resilience, and allows asymmetric processing for a time until auto scaling comes into effect. We spent a significant amount of time vetting different event bus technologies and unsurprisingly decided that the strongest contender in many areas was… Kafka ! Using Kafka Since the tried and tested Kafka would be the glue between services, it gave us a high degree of confidence in being able to offer high availability and reliability. The data persistence offered by the event bus allows us to absorb consuming (and producing traffic spikes) delays and push-back from the persistence layer while keeping external clients happy on the ingest path. The next step was making the data pipeline come together. Our initial attempt resulted in Kafka topics for each signal type. 
Each topic received specific signal types for multiple projects – undoubtedly the most cost efficient approach with the given stack. Initial testing and closed-beta launches saw good performance; however, pagers started to ring, literally, once the number of projects (and data volume) grew. We were seeing alerts for delayed time-to-glass as well as poor indexing performance. While investigating the issue, we quickly discovered that our multi-tenant topics were creating hotspots and noisy neighbor issues. In addition, the indexing service was struggling to meet our SLOs consistently due to Head-Of-Line blocking issues. Taking a step back, we realized that a single tenant model of Elasticsearch requires a higher level of data isolation to guarantee performance, prevent noisy neighbors and eliminate Head-Of-Line blocking issues. We changed topics from multi-project per-event (per-signal type) topic, to per-project multi-event (multi-signal) i.e. each project would get its own topic. The per-project topics provided improved isolation while Elasticsearch autoscaled without affecting other projects. Additionally, given how Kafka partitioning works, it also allows partition scaling to meet increasing data volumes when single consumers are unable to cope with the load. Observing the system A simpler system is always easy to observe and while splitting our services was driven by necessity it also introduced observability challenges. More services may also translate into more (potential) points of failure. To alleviate the issue, we decided to observe our system based on customer impact. To this end, we decided to monitor our services using Service Level Objectives (SLOs) . SLOs gave us the required framework to objectively reason about the performance of our service, but we didn’t stop here. Since our goal was measuring customer impact, we drew out the critical user journeys and designed our SLOs to cover these. The next challenge was implementing SLOs. Fortunately for us, the broader Observability team was working on launching Service Level Objectives (SLOs) and we became one of the first users. To power the Service Level Indicators (SLIs) that power the SLOs, we carefully instrumented our services using a combination of metrics, logs and traces (surprise!). The majority of our SLOs are powered by metrics, since our services are fairly high-throughput services but lower-throughput SLOs are also powered by log sources. Since we focused on the customer’s impact and the different areas where things could go wrong, we had a very broad (and deep) initial instrumentation from the beginning. It greatly facilitated investigating pages and recurring issues in our new ingestion platform. Today, all our user journeys are monitored using SLOs end-to-end allowing us to proactively detect and resolve any issues before it bothers you, our customers. Level up your observability stack with the Managed Intake Service Managed Intake Service aims to level-up Elastic's observability offering by providing a seamless interface for our users to ingest their telemetry signals without thinking about the scale of data or spending business hours computing the infrastructure requirements to reliably host their current and near-future data. The service is live on Elastic Cloud and available to you when you create an Observability project in our serverless offering. Redesigning ingest for our observability platform has been a lot of fun for us and we hope it will help you level up your observability stack. 
While this blog post covered high-level architecture of Managed Intake Services, there is still much more to talk about. Keep an eye out for future posts where we will delve deeper into individual components. Report an issue Related content Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Jump to APM Server primer APM Server as a managed product Multi Tenant APM Server Break down the monolith Ingestion Service Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Architecting the next-generation of Managed Intake Service - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/architecting-managed-intake-service", + "meta_description": "Learn how Elastic iterated over the existing APM Server architecture to create the Managed Intake Service. Explore improvements made over the APM Server." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Leveraging Kubernetes controller patterns to orchestrate Elastic workloads globally Understand how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless. Elastic Cloud Serverless SG By: Sebastien Guilloux On September 3, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Let's dive into how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless. As part of engineering our Elastic Cloud Serverless offering, we have been through a major redesign of our Cloud Architecture . 
The new architecture allows us to leverage the Kubernetes ecosystem for a resilient and scalable offering across 3 Cloud Services Providers, in many regions across the globe. Early on we decided to embrace Kubernetes, as the orchestration backend to run millions of containers for Serverless projects, but also as a composable platform that can be easily extended by our engineers to support various use cases. For example, provisioning new projects on demand, configuring Elasticsearch with object storage accounts, auto-scaling Elasticsearch and Kibana in the most performant and cost-effective way, provisioning fast local SSD volumes on-demand, metering resource usage, authenticating requests, scaling the fleet with more regions and clusters, and many more. Many of the Kubernetes design principles have guided us in that process. We naturally ended up building a set of APIs and controllers that follow Kubernetes conventions, but we also adapted the controller paradigm at a higher level, as the way we orchestrate resources globally. What does it look like in practice? Let's dive deep into some important aspects. Global control plane, regional data plane At a high-level, we can distinguish two high-level components: The global control plane services, responsible for global orchestration of projects across regions and clusters. It also acts as a scheduler and decides which regional Kubernetes cluster will host which Elastic Cloud Serverless project. The regional data plane services, responsible for orchestrating and running workloads at the scope of a single Kubernetes cluster. Each region can be made of several Kubernetes clusters, which we treat as disposable: they can be reconstructed from scratch at any time. This includes resources stored in the Kubernetes API Server, derived from the global control plane state. Both components include a variety of services, though many of them are in practice implemented as APIs and controllers: At the regional data plane level, they materialize as Custom Resource Definitions (CRDs) to declaratively specify entities manipulated by the system. They are stored as Custom Resources in the Kubernetes API Server, continuously reconciled by our custom controllers. At the global control plane level, they materialize as HTTP REST APIs that persist their data in a globally-available, scalable and resilient datastore, alongside controllers that continuously reconcile those resources. Note that, while they look and feel like “normal” Kubernetes controllers, global controllers take their inputs from Global APIs and a global datastore, rather than from the Kubernetes API Server and etcd ! Kubernetes controllers Kubernetes controllers behave in a simple, repeatable and predictable way. They are programmed to watch a source of events. On any event (creation, update, deletion), they always trigger the same reconciliation function. That function is responsible for comparing the desired state of a resource (for example, a 3-nodes Elasticsearch cluster) with the actual state of that resource (the Elasticsearch cluster currently has 2 nodes), and take action to reconcile the actual state towards the desired state (increase the number of replicas of that Elasticsearch deployment). The controller pattern is convenient to work with for various reasons: The level-triggered paradigm is simple to reason about. Asynchronous flows in the system are always encoded in terms of moving towards a declarative desired state, no matter what was the event that led to that desired state. 
This contrasts with an edge-triggered flow that considers every single variation of state (a configuration change, a version upgrade, a scale-up, a resource creation, etc.), and sometimes leads to modeling complex state machines. It is resilient to failures and naturally leads to a design where resources get self-repaired and self-healed automatically. Missing an intermediate event does not matter, as long as we can guarantee the latest desired state will get processed. Part of the Kubernetes ecosystem, the controller-runtime library comes with batteries included to easily build new controllers, as it abstracts away a number of important technical considerations: interacting with custom resources, watching through the API Server efficiently, caching objects for cheap reads, and enqueueing reconciliations automatically through a workqueue. The workqueue itself holds some interesting properties: Items are de-duplicated. If an item needs to be reconciled twice, due to two consecutive watch events, we just need to ensure the latest version of the object has been processed at least once. Failed reconciliations can easily be retried by appending the item again to the same workqueue. Those retries are automatically rate-limited, with an exponential backoff. The workqueue is populated automatically with all existing resources at controller startup. This ensures all items have been reconciled at least once, and covers for any missed event while the controller was unavailable. Global controller and global datastore The controller pattern and its internal workqueue fit very nicely with the needs of our regional data plane controllers. Conceptually, it would also apply quite well to our global control plane controllers! However, things get a bit more complicated there: As an important design principle for our backend, Kubernetes clusters and their state stored in etcd should be disposable. We want the operational ability to easily recreate and repopulate a Kubernetes cluster from scratch, with no important data or metadata loss. This led us to a strong requirement for our global state datastore: in case of regional failure, we want a multi-region failover with Recovery Point Objective (RPO) of 0, to ensure no data loss for our customers. The apiserver and etcd do not guarantee this by default. Additionally, while the regional data plane Kubernetes clusters have a strict scalability upper bound (they can’t grow larger than we want them to !), data in the global control plane datastore is conceptually unbounded. We want to support running millions of Serverless projects on the platform, and therefore require the datastore to scale along with the persisted data and query load, for a total amount of data that can be much larger than the amount of RAM on a machine. Finally, it is convenient for Global API services to work with the exact same data that global controllers are watching, and be able to serve arbitrarily complex SQL-like queries to fetch that data. The Kubernetes apiserver and etcd, as a key-value store, are not primarily designed for this use case. With these requirements in mind, we decided to not persist the global control plane data in the Kubernetes API, but rather in an external strongly consistent, highly-available and scalable general-purpose database. Fortunately, controller-runtime allows us to easily customize the source stream of events that trigger reconciliations. 
In just a few lines of code we were able to pipe our own logic of watching events from the global datastore into the controller. With this, our global controllers code largely look like regular Kubernetes controllers, while interacting with a completely different datastore. Workqueue optimizations at large scale What happens once we have 200,000 items in the global datastore that need to be reconciled at startup of a global controller? We can control the concurrency of reconciliations ( MaxConcurrentReconciles=N) , to consume the workqueue in parallel, with uniqueness guarantees that avoid concurrent processing of the same item. The degree of parallelism needs to be carefully thought through. If set too low, processing all items will take a very long time. For example, 200,000 items with an average of 1 second reconciliation duration and MaxConcurrentReconciles=1 means all items will only be processed after 2.3 days. Worse, if a new item gets created during that time, it may only be processed for the first time 2.3 days after creation! On the other hand, if MaxConcurrentReconciles is set too high, processing a very large number of items concurrently will dramatically increase CPU and IO usage, which generally also means increasing infrastructure costs. Can the global datastore handle 200,000 concurrent requests? How much would it then cost? To better address the trade-off, we decided to categorize reconciliations into 3 buckets: Items that need to be reconciled as soon as possible. Because they have been recently created/updated/deleted, and never successfully reconciled since then. These fit in a “high-priority reconciliations” bucket. Items that have already been reconciled successfully at least once with their current specification. The main reason why those need to be reconciled again is to ensure any code change in the controller will eventually be reflected through a reconciliation on existing resources. These can be processed reasonably slowly over time, since there should be no customer impact from delaying their reconciliation. These fit in a “low-priority reconciliations” bucket. Items that we know need to be reconciled at a particular point in time in the future. For example, to respect a soft-deletion period of 30 days, or to ensure their credentials are rotated every 24h. Those fit in a “scheduled reconciliations” bucket. This can be implemented through a similar mechanism as Generation and ObservedGeneration in some Kubernetes resources. On any change in the specification of a project, we persist the revision of that specification (a monotonically increasing integer). At the end of a successful reconciliation, we persist the revision that was successfully reconciled. To know whether an item deserves to be reconciled immediately, hence be put in the high-priority bucket, we can compare both revisions values. In case the reconciliation failed, it is enqueued again for being retried. We can then plug the controller reconciliation event handling logic to append the item in the appropriate workqueue. A separate low-priority workqueue is consumed asynchronously at a fixed rate (for example, 300 items per minute). Those consumed items then get appended to the main high-priority workqueue for immediate processing by the available controller workers. The controller then in practice works with two different workqueues. Since both rely on regular controller-runtime workqueues implementations, we benefit from built-in Prometheus metrics. 
Those allow monitoring the depth of the low-priority workqueue, for example, as a good signal of how many controller startup reconciliations are still pending. And the additions rate to the high priority workqueue, which indicates how much “urgent” work we're asking from the controller. Conclusion Kubernetes is a fascinating project. We have taken a lot of inspiration from its design principles and extension mechanisms. For some of them, extending their scope beyond the ability to manage resources in a single Kubernetes cluster, towards the ability to work with a highly-scalable datastore, with controllers able to reconcile resources in thousands of Kubernetes clusters across Cloud Service Providers regions. It has proven to be a fundamental part of our internal platform at Elastic, and allows us to develop and deliver new services and features to Elastic users every day. Stay tuned for more technical details in future posts. You can also check out talks by Elastic engineers at the last Kubecon + CloudNativeCon 2024: Building a Large Scale Multi-Cloud Multi-Region Saas Platform with Kubernetes Controllers Platform Engineering with the Argo Ecosystem: the Elastic story . Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Global control plane, regional data plane Kubernetes controllers Global controller and global datastore Workqueue optimizations at large scale Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Leveraging Kubernetes controller patterns to orchestrate Elastic workloads globally - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/kubernetes-controller-power-elastic-serverless-workloads", + "meta_description": "Understand how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. Search Relevance DW By: Daniel Wrigley On May 26, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The creation of judgement lists is a crucial step in optimizing search result quality, but it can be a complicated and difficult task. A judgement list is a curated set of search queries paired with relevance ratings for their corresponding results, also known as a test collection. Metrics computed using this list act as a benchmark for measuring how well a search engine performs. To help streamline the process of creating judgement lists, the OpenSource Connections team developed Quepid . Judgement can either be explicit or based on implicit feedback from users. This blog will guide you through setting up a collaborative environment in Quepid to effectively enable human raters to do explicit judgements, which is the foundation of every judgement list. Quepid supports search teams in the search quality evaluation process: Build query sets Create judgement lists Calculate search quality metrics Compare different search algorithms/rankers based on calculated search quality metrics For our blog, let's assume that we are running a movie rental store and have the goal of improving our search result quality. Prerequisites This blog uses the data and the mappings from the es-tmdb repository . The data is from The Movie Database . To follow along, set up an index called tmdb with the mappings and index the data. It doesn’t matter if you set up a local instance or use an Elastic Cloud deployment for this - either works fine. We assume an Elastic Cloud deployment for this blog. You can find information about how to index the data in the README of the es-tmdb repository . Do a simple match query on the title field for rocky to confirm you have data to search in: You should see 8 results. Log into Quepid Quepid is a tool that enables users to measure search result quality and run offline experiments to improve it. You can use Quepid in two ways: either use the free, publicly available hosted version at https://app.quepid.com , or set up Quepid on a machine you have access to. This post assumes you are using the free hosted version. If you want to set up a Quepid instance in your environment, follow the Installation Guide . Whichever setup you choose, you’ll need to create an account if you don’t already have one. Set up a Quepid Case Quepid is organized around \"Cases.\" A Case stores queries together with relevance tuning settings and how to establish a connection to your search engine. For first-time users, select Create Your First Relevancy Case . 
Returning users can select Relevancy Cases from the top-level menu and click + Create a case . Name your case descriptively, e.g., \"Movie Search Baseline,\" as we want to start measuring and improving our baseline search. Confirm the name by selecting Continue . Next, we establish a connection from Quepid to the search engine. Quepid can connect to a variety of search engines, including Elasticsearch. The configuration will differ depending on your Elasticsearch and Quepid setup. To connect Quepid to an Elastic Cloud deployment, we need to enable and configure CORS for our Elastic Cloud deployment and have an API key ready. Detailed instructions are in the corresponding how-to on the Quepid docs . Enter your Elasticsearch endpoint information ( https://YOUR_ES_HOST:PORT/tmdb/_search ) and any additional information necessary to connect (the API key in case of an Elastic Cloud deployment in the Advanced configuration options), test the connection by clicking on ping it and select Continue to move to the next step. Now we define which fields we want to see displayed in the case. Select all that help our human raters later assess the relevance of a document for a given query. Set title as the Title Field , leave _id as the ID Field , and add overview, tagline, cast, vote_average, thumb:poster_path as Additional Display Fields . The last entry displays small thumbnail images for the movies in our results to visually guide us and the human raters. Confirm the display settings by selecting the Continue button. The last step is adding search queries to the case. Add the three queries star wars , harrison ford , and best action movie one by one via the input field and Continue . Ideally, a case contains queries that represent real user queries and illustrate different types of queries. For now, we can imagine star wars being a query representing all queries for movie titles, harrison ford a query representing all queries for cast members, and best action movie a query representing all queries that search for movies in a specific genre. This is typically called a query set. In a production scenario, we would sample queries from event tracking data by applying statistical techniques like Probability-Proportional-to-Size sampling and import these sampled queries into Quepid to include queries from the head (frequent queries) and tail (infrequent queries) relative to their frequency, which means we bias towards more frequent queries without excluding rare ones. Finally, select Finish and you will be forwarded to the case interface where you see the three defined queries. Queries and Information Needs To arrive at our overarching goal of a judgement list, human raters will need to judge a search result (typically a document) for a given query. This is called a query/document pair. Sometimes, it seems easy to know what a user wanted when looking at the query. The intention behind the query harrison ford is to find movies starring Harrison Ford, the actor. What about the query action ? I know I’d be tempted to say the user’s intention is to find movies belonging to the action genre. But which ones? The most recent ones, the most popular ones, the best ones according to user ratings? Or does the user maybe want to find all movies that are called “Action”? There are at least 12 (!) movies called “Action” in The Movie Database and their names mainly differ in the number of exclamation marks in the title. Two human raters may differ in interpreting a query where the intention is unclear. 
Enter the Information Need: An Information Need is a conscious or unconscious desire for information. Defining an information need helps human raters judge documents for a query, so they play an important role in the process of building judgement lists. Expert users or subject matter experts are good candidates for specifying information needs. It is good practice to define information needs from the perspective of the user, as it's their need the search results should fulfill. Information needs for the queries of our “Movies Search Baseline” case: star wars : The user wants to find movies or shows from the Star Wars franchise. Potentially relevant are documentaries about Star Wars. harrison ford : The user wants to find movies starring the actor Harrison Ford. Potentially relevant are movies where Harrison Ford has a different role, like narrator. best action movie : The user wants to find action movies, preferably the ones with high average user votes. Define Information Needs in Quepid To define an information need in Quepid, access the case interface: 1. Open a query (for example s tar wars ) and select Toggle Notes. 2. Enter the Information Need in the first field and any additional notes in the second field: 3. Click Save . For a handful of queries, this process is fine. However, when you expand your case from three to 100 queries (Quepid cases are often in the range of 50 to 100 queries) you may want to define information needs outside of Quepid (for example, in a spreadsheet) and then upload them via Import and select Information Needs . Create a Team in Quepid and Share your Case Collaborative judgements enhance the quality of relevance assessments. To set up a team: 1. Navigate to Teams in the top-level menu. 2. Click + Add New , enter a team name (for example, \"Search Relevance Raters\"), and click Create . 3. Add members by typing their email addresses and clicking Add User . 4. In the case interface, select Share Case . 5. Choose the appropriate team and confirm. Create a Book of Judgements A Book in Quepid allows multiple raters to evaluate query/document pairs systematically. To create one: 1. Go to Judgements in the case interface and click + Create a Book . 2. Configure the book with a descriptive name, assign it to your team, select a scoring method (for example, DCG@10), and set the selection strategy (single or multiple raters). Use the following settings for the Book: Name : “Movies Search 0-3 Scale” Teams to Share this Book With : Check the box with the Team you created Scorer : DCG@10 3. Click Create Book. The name is descriptive and contains information about what is searched in (“Movies”) and also the scale of the judgements (“0-3”). The selected Scorer DCG@10 defines the way the search metric will be calculated. “DCG” is short for Discounted Cumulative Gain and“@10” is the number of results from the top taken into consideration when the metric is calculated. In this case, we are using a metric that measures the information gain and combines it with positional weighting. There may be other search metrics that are more suitable for your use case and choosing the right one is a challenge in itself . Populate the Book with Query/Document Pairs In order to add query/document pairs for relevance assessment, follow these steps: 1. In the case interface, navigate to \"Judgements.\" 2. Select your created book. 3. 
Click \"Populate Book\" and confirm by selecting \"Refresh Query/Doc Pairs for Book.\" This action generates pairs based on the top search results for each query, ready for evaluation by your team. Let your Team of Human Raters Judge So far, the completed steps were fairly technical and administrative. Now that this necessary preparation is done, we can let our team of judges do their work. In essence, the judge’s job is to rate the relevance of one particular document for a given query. The result of this process is the judgement list that contains all relevance labels for the judged query document pairs. Next, this process and the interface for it are explained in further detail. Overview of the Human Rating Interface Quepid's Human Rating Interface is designed for efficient assessments: Query: Displays the search term. Information Need: Shows the user's intent. Scoring Guidelines: Provides instructions for consistent evaluations. Document Metadata: Presents relevant details about the document. Rating Buttons: Allows raters to assign judgements with corresponding keyboard shortcuts. Using the Human Rating Interface As a human rater, I access the interface via the book overview: 1. Navigate to the case interface and click Judgements . 2. Click on More Judgements are Needed! . The system will present a query/document pair that has not been rated yet and that requires additional judgements. This is determined by the Book’s selection strategy: Single Rater : A single judgement per query/doc pair. Multiple Raters : Up to three judgements per query/doc pair. Rating Query/Doc Pairs Let’s walk through a couple of examples. When you are following this guide, you will most likely be presented with different movies. However, the rating principles stay the same. Our first example is the movie “Heroes” for the query harrison ford : We first look at the query, followed by the information need and then judge the movie based on the metadata given. This movie is a relevant result for our query, since Harridson Ford is in its cast. We may regard more recent movies as more relevant subjectively but this is not part of our information need. So we rate this document with “Perfect” which is a 3 in our graded scale. Our next example is the movie “Ford v Ferrari” for the query harrison ford : Following the same practice, we judge this query/doc by looking at the query, the information need and then how well the document’s metadata matches the information need. This is a poor result. We probably see this result as one of our query terms, “ford”, matches in the title. But Harrison Ford plays no role in this movie, nor any other role. So we rate this document “Poor” which is a 0 in our graded scale. Our third example is the movie “Action Jackson” for the query best action movie : This looks like an action movie, so the information need is at least partially met. However, the vote average is 5.4 out of 10. And that makes this movie probably not the best action movie in our collection. This would lead me as a judge to rate this document “Fair,” which is a 1 in our graded scale. These examples illustrate the process of rating query/doc pairs with Quepid in particular, on a high level and also in general. Best Practices Human Raters The shown examples might make it seem straightforward to get to explicit judgements. But setting up a reliable human rating program is no easy feat. 
It’s a process filled with challenges that can easily compromise the quality of your data: Human raters can become fatigued from repetitive tasks. Personal preferences may skew judgements. Levels of domain expertise vary from judge to judge. Raters often juggle multiple responsibilities. The perceived relevance of a document may not match its true relevance to a query. These factors can result in inconsistent, low-quality judgements. But don’t worry - there are proven best practices that can help you minimize these issues and build a more robust and reliable evaluation process: Consistent Evaluation: Review the query, information need, and document metadata in order. Refer to Guidelines: Use scoring guidelines to maintain consistency. Scoring guidelines can contain examples of when to apply which grade, which illustrates the process of judging. Having a check-in with human raters after the first batch of judgements proved to be a good practice to learn about challenging edge cases and where additional support is needed. Utilize Options: If uncertain, use \"I Will Judge Later\" or \"I Can’t Tell,\" providing explanations when necessary. Take Breaks: Regular breaks help maintain judgement quality. Quepid helps with regular breaks by popping confetti whenever a human rater finishes a batch of judgements. By following these steps, you establish a structured and collaborative approach to creating judgement lists in Quepid, enhancing the effectiveness of your search relevance optimization efforts. Next Steps Where to go from here? Judgement lists are but one foundational step towards improving search result quality. Here are the next steps: Calculate Metrics and Start Experimenting Once judgement lists are available, leveraging the judgements and calculating search quality metrics is a natural progression. Quepid automatically calculates the configured metric for the current case when judgements are available. Metrics are implemented as “Scorers” and you can provide your own when the supported ones do not include your favorite! Go to the case interface, navigate to Select Scorer , choose DCG@10 and confirm by clicking on Select Scorer . Quepid will now calculate DCG@10 per query and also average it over all queries to quantify the search result quality for your case. Now that your search result quality is quantified, you can run first experiments. Experimentation starts with generating hypotheses. Looking at the three queries in the screenshot after doing some rating makes it obvious that the three queries perform very differently in terms of their search quality metric: star wars performs pretty well, harrison ford looks alright, but the greatest potential lies in best action movie . Expanding this query, we see its results and can dive into the nitty-gritty details and explore why documents matched and what influences their scores: By clicking on “Explain Query” and entering the “Parsing” tab we see that the query is a DisjunctionMaxQuery searching across three fields: cast , overview and title : Typically, as search engineers we know some domain-specifics about our search platform. In this case, we may know that we have a genres field. Let’s add that to the query and see if search quality is improved. We use the Query Sandbox that opens when selecting Tune Relevance in the case interface. Go ahead and explore this by adding the genres field you search in: Click Rerun My Searches! And view the results. Have they changed? Unfortunately not.
We now have a lot of options to explore, basically all query options Elasticsearch offers: We could increase the field weight on the genres field. We could add a function that boosts documents by their vote average. We could create a more complex query that only boosts documents by their vote average if there is a strong genres match. … The best thing about having all these options and exploring them in Quepid is that we have a way of quantifying the effects not only on the one query we try to improve but all queries we have in our case. That prevents us from improving one underperforming query by sacrificing search result quality for others. We can iterate fast and cheap and validate the value of our hypothesis without any risk, making offline experimentation a fundamental capability of all search teams. Measure Inter-Rater Reliability Even with task descriptions, information needs, and a human rater interface like the one Quepid provides, human raters can disagree. Disagreement per se is no bad thing, quite the contrary: measuring disagreement can surface issues that you may want to tackle. Relevance can be subjective, queries can be ambiguous, and data can be incomplete or incorrect. Fleiss’ Kappa is a statistical measure for the agreement among raters and there is an example notebook in Quepid you can use. To find it, select Notebooks in the top-level navigation and select the notebook Fleiss Kappa.ipynb in the examples folder. Conclusion Quepid empowers you to tackle even the most complex search relevance challenges and continues to evolve: as of version 8 Quepid supports AI-generated judgements , which is particularly useful for teams who want to scale their judgement generation process. Quepid workflows enable you to efficiently create judgement lists that are scalable–which ultimately results in search results that truly meet user needs. With judgement lists established, you have a robust foundation for measuring search relevance, iterating on improvements, and driving better user experiences. As you move forward, remember that relevancy tuning is an ongoing process. Judgement lists allow you to systematically evaluate your progress, but they are most powerful when paired with experimentation, metric analysis, and iterative improvements. Further Reading Quepid docs: Relevancy is a Team Sport Quepid for Human Raters How to Connect Quepid to Elastic Cloud Quepid Github repository Meet Pete, a blog series on improving e-commerce search Relevance Slack : join the #quepid channel Partner with Open Source Connections to transform your search and AI capabilities and empower your team to continuously evolve them. Our proven track record spans the globe, with clients consistently achieving dramatic improvements in search quality, team capability, and business performance. Contact us today to learn more. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. 
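As a rough illustration of the inter-rater agreement measure mentioned above (this is a generic implementation of the standard Fleiss' Kappa formula, not the Quepid example notebook, and the rating counts are made up):

```python
import numpy as np

def fleiss_kappa(counts):
    # counts: one row per query/doc pair, one column per grade (e.g. 0-3),
    # each cell holding how many raters assigned that grade to that pair.
    counts = np.asarray(counts, dtype=float)
    n_items, _ = counts.shape
    n_raters = counts[0].sum()
    p_j = counts.sum(axis=0) / (n_items * n_raters)            # grade proportions
    p_i = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar, p_e = p_i.mean(), (p_j ** 2).sum()
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical counts: 3 raters judging 4 query/doc pairs on a 0-3 scale.
ratings = [[0, 0, 1, 2],
           [3, 0, 0, 0],
           [0, 2, 1, 0],
           [0, 0, 0, 3]]
print(fleiss_kappa(ratings))  # values near 1 mean strong agreement
```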
DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Prerequisites Log into Quepid Set up a Quepid Case Queries and Information Needs Define Information Needs in Quepid Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Creating Judgement Lists with Quepid - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/quepid-judgement-lists", + "meta_description": "Creating judgement lists in Quepid with a collaborative human rater process." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Excluding Elasticsearch fields from indexing Explaining how to configure Elasticsearch to exclude fields, why you might want to do this, and best practices to follow. How To KB By: Kofi Bartlett On May 12, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Elasticsearch, indexing refers to the process of storing and organizing data in a way that makes it easily searchable. While indexing all fields in a document can be useful in some cases, there are situations where you might want to exclude certain fields from being indexed. This can help improve performance, reduce storage costs, and minimize the overall size of your Elasticsearch index. In this article, we will discuss the reasons for excluding fields from indexing, how to configure Elasticsearch to exclude specific fields, and some best practices to follow when doing so. Reasons for excluding fields from indexing Performance: Indexing all fields in a document can lead to increased indexing time and slower search performance. By excluding fields that are not required for search or aggregation, you can improve the overall performance of your Elasticsearch cluster. Storage: Indexing fields consumes storage space. 
Excluding fields that are not needed for search or aggregation can help reduce the storage requirements of your Elasticsearch cluster. Index size: The size of an Elasticsearch index is directly related to the number of fields indexed. By excluding unnecessary fields, you can minimize the size of your index, which can lead to faster search and indexing performance. Configuring Elasticsearch to exclude fields To exclude a field from being indexed in Elasticsearch, you can use the “index” property in the field’s mapping. By setting the “index” property to “false”, Elasticsearch will not index the field, and it will not be searchable or available for aggregations. Here’s an example of how to exclude a field from indexing using the Elasticsearch mapping: In this example, we’re creating a new index called “my_index” with a single field called “field_to_exclude”. By setting the “index” property to “false”, we’re telling Elasticsearch not to index this field. The field will still be available in the source document, though. Best practices for excluding fields from indexing Analyze your data: Before excluding fields from indexing, it’s essential to analyze your data and understand which fields are necessary for search and aggregation. This will help you make informed decisions about which fields to exclude. Test your changes: When excluding fields from indexing, it’s crucial to test your changes to ensure that your search and aggregation functionality still work as expected. This can help you avoid any unexpected issues or performance problems. Monitor performance: After excluding fields from indexing, monitor the performance of your Elasticsearch cluster to ensure that your changes have had the desired effect. This can help you identify any additional optimizations that may be required. Use source filtering: If you need to store a field in Elasticsearch but don’t want it to be searchable or available for aggregations, consider using source filtering. This allows you to store the field in the _source field but exclude it from the index. Conclusion Excluding fields from indexing in Elasticsearch can help improve performance, reduce storage costs, and minimize the overall size of your index. By carefully analyzing your data and understanding which fields are necessary for search and aggregation, you can make informed decisions about which fields to exclude. Always test your changes and monitor the performance of your Elasticsearch cluster to ensure that your optimizations have the desired effect. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
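A minimal sketch of the mapping described above, using the index and field names from the article and, as an assumption, the Python client rather than the raw REST call:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')  # adjust to your cluster

# 'index': False keeps the field in _source but leaves it out of the inverted
# index, so queries cannot match on it.
es.indices.create(
    index='my_index',
    mappings={
        'properties': {
            'field_to_exclude': {'type': 'text', 'index': False}
        }
    },
)
```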
KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Reasons for excluding fields from indexing Configuring Elasticsearch to exclude fields Best practices for excluding fields from indexing Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Excluding Elasticsearch fields from indexing - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/excluding-elasticsearch-fields-from-indexing", + "meta_description": "Explaining how to configure Elasticsearch to exclude fields, why you might want to do this, and best practices to follow." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. Elastic Cloud Serverless DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more On December 2, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. The advent of Elastic Cloud Serverless has reshaped how businesses can harness the power of Elasticsearch without the need to manage clusters, nodes, or resource scaling. A key innovation within Elastic Cloud Serverless is its autoscaling feature, which adapts to changes in workload and traffic in real-time. This post explores the technicalities behind autoscaling, the performance of Elastic Cloud Serverless under load, and the results from extensive stress testing. What is Elastic Cloud Serverless? Elastic Cloud Serverless offers an automated, managed version of Elasticsearch that scales based on demand. Unlike traditional Elasticsearch deployments, where users must provision and manage hardware or cloud instances, Elastic Cloud Serverless manages infrastructure scaling and resource allocation. This is particularly beneficial for organizations with variable workloads, where scaling infrastructure up and down manually can be cumbersome and error-prone. The system’s built-in autoscaling feature accommodates heavy ingestion tasks, search queries, and other operations without manual intervention. Elastic Cloud Serverless operates with two distinct tiers, the search and indexing tiers, each optimized for specific workloads. The search tier is dedicated to handling query execution, ensuring fast and efficient responses for search requests. 
Meanwhile, the indexing tier is responsible for ingesting and processing data, managing write operations, and ensuring data is properly stored and searchable. By decoupling these concerns, Elastic Cloud Serverless allows each tier to scale independently based on workload demands. This separation improves resource efficiency, as compute and storage needs for indexing (e.g., handling high-throughput ingestion) do not interfere with query performance during search operations. Similarly, search tier resources can be scaled up to handle complex queries or spikes in traffic without impacting the ingestion process. This architecture ensures optimal performance, cost-efficiency, and resilience, allowing Elastic Cloud Serverless to adapt dynamically to fluctuating workloads while maintaining consistent user experiences. You can read more about the architecture of Elastic Cloud Serverless in the following blog post . Stress testing Elastic Cloud Serverless Comprehensive stress tests assessed Elastic Cloud Serverless’s capability to handle large, fluctuating workloads. These tests were designed to measure the system’s ability to ingest data, handle search queries, and maintain performance under extreme conditions. It should be noted that the system can perform beyond what we present here, depending on factors such as client count and bulk index sizes. Here, we’ll walk through the approach and findings of these tests. Testing scope and approach The primary objective of our stress testing was to answer key questions: How well does Elastic Cloud Serverless handle large-scale ingestion and search queries with a high number of concurrent clients? Can it scale dynamically to accommodate sudden spikes in workload? Does the system maintain stability over extended periods? Stress testing a search use case In Elastic Cloud Serverless, you can choose from three project types: Elasticsearch, Observability, and Security. We began our stress test journey on search use cases for Elasticsearch, using a Github Archive dataset and simulating likely ingest and search behaviors. Before testing, we prepared the system by ingesting a base corpus of 186GB / 43 million documents. We then gradually added clients over ten minutes to allow Elasticsearch the time to scale appropriately. The data was ingested using Datastreams via the Bulk APIs. Stress testing the indexing tier. Firstly, let's talk about indexing data (ingest). Ingest autoscaling in Elastic Cloud Serverless dynamically adjusts resources to match data ingestion demands, ensuring optimal performance and cost-efficiency. The system continuously monitors metrics such as ingestion throughput, resource utilization (CPU, memory, and network), and response latencies. When these metrics exceed predefined thresholds, the autoscaler provisions additional capacity proportionally to handle current and anticipated demand while maintaining a buffer for unexpected spikes. The complexity of data pipelines and system-imposed resource limits also influences scaling decisions. By dynamically adding or removing capacity, ingest autoscaling ensures seamless scaling without manual intervention. In autoscaled systems like Elastic Cloud Serverless, where resource efficiency is optimized, there may be situations where a sudden, massive increase in workload exceeds the capacity of the system to scale immediately. In such cases, clients may receive HTTP 429 status codes, indicating that the system is overwhelmed. 
To handle these situations, clients should implement an exponential backoff strategy, retrying requests at progressively longer intervals. During stress testing, we actively track 429 responses to assess how the system reacts under high demand, providing valuable insights into autoscaling effectiveness.You can read a more in-depth blog post on how we autoscale indexing here . Now, let’s look at some of the results we encountered in our stress testing of the indexing tier. Indexing while scaling up: Corpus Bulk Size Actual Volume Indexing Period (minutes) Volume / hr Median Throughput (docs/s) 90th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 1TB 2500 1117.43 GB 63 1064.22 GB 70,256.96 7.095 0.05% 2TB 2500 2162.02 GB 122 1063.29 GB 68,365.23 8.148 0.05% 5TB 2500 5254.84 GB 272 1159.16 GB 74,770.27 7.46 0 For initial tests with 1TB and 2TB corpus , we achieved a throughput of 1064 GB/hr and 1063 GB/hr , respectively. For 5TB we achieved higher at 1160 GB / hr ingest , as we observed the ingest tier continued to scale up, providing a better throughput. Indexing while fully scaled: Clients Bulk Size Actual Volume Duration Volume / hr Median Throughput (docs/s) 99th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 3,000 2,000 1 TB 8 minutes 7.5 TB 499,000 33.5 0.0% When working with a maximally scaled indexing tier, ECS ingested 1TB of data in 8 minutes , at a rate of ~499K docs/s indexed per second. This equates to an extrapolated capacity of 180TB daily . Indexing from minimal scale to maximum scale: Clients Bulk Size Actual Volume Duration Volume / hr Median Throughput (docs/s) 99th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 2,048 1,000 13 TB 6 hours 2.1 TB 146,478 55.5 1.55% During tests with 2TB of data , we gradually scaled up to 2048 clients and managed to ingest data at a rate of 146K docs/s , completing 2TB of data in 1 hour . Extrapolated, this would result in 48TB per day . 72-Hour Stability Test: Clients Bulk Size Actual Volume Indexing Period (hours) Volume / hr Median Throughput (docs/s) 99th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 128 500 61 TB 72 ~868.6 GB 51,700 7.7 <0.05% In a 72-hour stability test , we ingested 60TB of data with 128 clients . Elasticsearch maintained an impressive 870GB/hr throughput with minimal error rates while scaling the indexing and search tiers. This demonstrated Elasticsearch’s ability to sustain high throughput over extended periods with low failure rates. Stress testing the search tier. Search tier autoscaling in Elastic Cloud Serverless dynamically adjusts resources based on dataset size and search load to maintain optimal performance. The system classifies data into two categories: boosted and non-boosted. Boosted data includes time-based documents (documents with an @timestamp field) within a user-defined boost window and all non-time-based documents, while non-boosted data falls outside this window. Users can set a boost window to define the time range for boosted data and select a search power level—On-demand, Performant, or High-throughput—to control resource allocation. You can read more about configuring Search Power & Search Boost Windows here . The autoscaler monitors metrics such as query latency, resource utilization (CPU and memory), and query queue lengths. When these metrics indicate increased demand, the system scales resources accordingly. This scaling is performed on a per-project basis and is transparent to the end user. 
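As a minimal sketch of the retry-with-backoff behaviour recommended above for 429 responses, here is one way a client might bulk-index with the Python helpers; the index name and document generator are illustrative, while the 2,500-document bulk size mirrors the tests described earlier:

```python
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch('http://localhost:9200')  # adjust to your project endpoint

docs = ({'_index': 'logs-stress-test', 'message': f'event {i}'} for i in range(100_000))

# helpers.bulk retries documents rejected with HTTP 429, doubling the wait
# between attempts (exponential backoff) up to max_backoff seconds.
success, errors = bulk(
    es,
    docs,
    chunk_size=2_500,
    max_retries=8,
    initial_backoff=1,
    max_backoff=60,
    raise_on_error=False,
)
print(success, len(errors))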
Search stability under load: Corpus Actual Volume (from corpus tab) Duration Average Search Rate (req/s) Max Search Rate (req/s) Response Time (P50) Response Time (P99) 5TB 5254.84 GB 120 minutes 891 3,158 36 ms 316 ms With 5TB of data , we tested a set of 8 searches running over 2 hours, including complex queries, aggregations & ES|QL. Clients were ramped up from 4 to 64 clients per search. In total there were between 32 and 512 clients performing searches. Performance remained stable as the number of clients increased from 32 to 512. When running with 512 clients, we observed a search request rate of 3,158 queries per second with a P50 response time of 36ms . Throughout the test we observed the search tier scaling as expected to meet demand. 24-hour search stability test: Corpus Actual Volume Duration Average Search Rate (req/s) Max Search Rate (req/s) Response Time (P50) Response Time (P99) 40TB 60 TB 24 hours 183 250 192 ms 520 ms A set of 7 searches, aggregations, and an ES|QL query were used to query 40TB of (mainly) boosted data. The number of clients was ramped up from 1 to 12 per search, totaling 7 to 84 search clients. With Search Power set to balanced, we observed 192ms (P50) response time. You can read more about configuring Search Power & Search Boost Windows here . Concurrent index and search In tests that ran simultaneous indexing and searching , we aimed to ingest 5TB in 6 “chunks.” We ramped up from 24 to 480 clients ingesting data with a bulk size of 2500 documents. For search, clients were ramped up from 2 to 40 per search. In total, between 16 and 320 clients performed searches. We observed both tiers autoscaling and saw search latencies consistently around 24ms (p50) and 1359ms (p99). The system’s ability to index and search concurrently while maintaining performance is critical for many use cases. Conclusion The stress tests discussed above focused on a search use case in an Elasticsearch project designed with a specific configuration of field types, number of fields, clients, and bulk sizes. These parameters were tailored to evaluate Elastic Cloud Serverless under well-defined conditions relevant to the use case, providing valuable insights into its performance. However, it's important to note that the results may not directly reflect your workload, as performance depends on various factors such as query complexity, data structure, and indexing strategies. These benchmarks serve as a baseline, but real-world outcomes will vary depending on your unique use case and requirements. It should also be noted that these results do not represent an upper performance bound. The key takeaway from our stress testing is that Elastic Cloud Serverless demonstrates remarkable robustness. It can ingest hundreds of terabytes of data daily while maintaining strong search performance. This makes it a powerful solution for large-scale search workloads, ensuring reliability and efficiency at high data volumes. In upcoming posts, we will expand our exploration into stress testing Elastic Cloud Serverless for observability and security use cases, highlighting its versatility across different application domains and providing deeper insights into its capabilities. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. 
FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Jump to What is Elastic Cloud Serverless? Stress testing Elastic Cloud Serverless Testing scope and approach Stress testing a search use case Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-serverless-performance-stress-testing", + "meta_description": "Dive into Elasticsearch Cloud Serverless, explore its performance under real-world conditions and see the results from extensive stress testing." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Inside Elastic Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. 
CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Inside Elastic August 9, 2024 GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans This series gives you an inside look at how we’re using generative AI in customer support. Join us as we share our journey in designing a GenAI chatbot interface. IM By: Ian Moersen Inside Elastic July 22, 2024 GenAI for Customer Support — Part 2: Building a Knowledge Library This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! CM By: Cory Mangini Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Inside Elastic - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/inside-elastic", + "meta_description": "Inside Elastic articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Search Relevance Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. 
PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance How To March 27, 2025 How to automate synonyms and upload using our Synonyms API Discover how LLMs can be used to identify and generate synonyms automatically, allowing terms to be programmatically loaded into the Elasticsearch synonym API. AL By: Andre Luiz Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Search Relevance Vector Database +1 March 18, 2025 Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. PS BT By: Peter Straßer and Benjamin Trent 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Search Relevance - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/search-relevance", + "meta_description": "Search Relevance articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. How To KB By: Kofi Bartlett On May 16, 2025 Want to get Elastic certified? 
Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Disk management is important in any database, and Elasticsearch is no exception. If you don’t have enough disk space available, Elasticsearch will stop allocating shards to the node. This will eventually prevent you from being able to write data to the cluster, with the potential risk of data loss in your application. On the other hand, if you have too much disk space, then you are paying for more resources than you need. Background on watermarks There are various “watermark” thresholds on your Elasticsearch cluster which help you track the available disk space. As the disk fills up on a node, the first threshold to be crossed will be the “low disk watermark”. The second threshold will then be the “high disk watermark threshold”. Finally, the “disk flood stage” will be reached. Once this threshold is passed, the cluster will then block writing to ALL indices that have one shard (primary or replica) on the node which has passed the watermark. Reads (searches) will still be possible. How to prevent and handle cases when disk is too full (over utilization) There are various methods for handling cases when your Elasticsearch disk is too full: Delete old data: Usually, data should not be kept indefinitely. One way to prevent and solve disk being too full is by ensuring that when data reaches a certain age, it gets reliably archived and deleted. One way to do this is to use ILM . Add storage capacity: If you cannot delete the data, you might want to add more data nodes or increase the disk sizes in order to retain all the data without negatively affecting performance. If you need to add storage capacity to the cluster, you should consider whether you need to add just storage capacity alone, or both storage capacity and also RAM and CPU resources in proportion (see section on ratio of disk size, RAM and CPU below). How to add storage capacity to your Elasticsearch cluster Increase the number of data nodes: Remember that the new nodes should be of the same size as existing nodes, and of the same Elasticsearch version. Increase the size of existing nodes: In cloud-based environments, it is usually easy to increase disk size and RAM/CPU on existing nodes. Increase only the disk size: In cloud-based environments, it is often relatively easy to increase disk size. Snapshot and restore : If you are willing to allow old data to be retrieved upon request in an automated process from backups, you can snapshot old indices, delete them and restore data temporarily upon request from the snapshots. Reduce replicas per shard: Another option to reduce data is to reduce the number of replicas of each shard. For high availability, you would like to have one replica per shard, but when data grows older, you might be able to work without replicas. This could usually work if the data is persistent, or you have a backup to restore if needed. Create alerts: In order to prevent disks from filling up in the future and act proactively, you should create alerts based on disk usage that will notify you when the disk starts filling up. How to prevent and handle cases when the disk capacity is underutilized If your disk capacity is underutilized, there are various options to reduce the storage volume on your cluster. 
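The 'delete old data' option above is usually implemented with an ILM policy. A minimal sketch follows, assuming the Python client; the policy name and the 30-day retention are illustrative and should match your own retention requirements:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

# Delete indices once they are 30 days old; attach the policy to an index
# template so new time-based indices pick it up automatically.
es.ilm.put_lifecycle(
    name='delete-after-30d',
    policy={
        'phases': {
            'delete': {
                'min_age': '30d',
                'actions': {'delete': {}},
            }
        }
    },
)
```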
How to reduce the storage volume on an Elasticsearch cluster There are various methods for how to reduce the storage volume of a cluster. 1. Reduce the number of data nodes If you want to reduce data storage and also reduce RAM and CPU resources in the same proportion, then this is the easiest strategy. Decommissioning unnecessary nodes is likely to provide the greatest cost savings. Before decommissioning the node, you should: Ensure that the node to be decommissioned is not necessary as a MASTER node. You should always have at least three nodes with the MASTER node role. Migrate the data shards away from the node to be decommissioned. 2. Replace existing nodes with smaller nodes If you cannot further reduce the number of nodes (usually 3 would be a minimum configuration), then you may want to downsize existing nodes. Remember that it is advisable to ensure that all data nodes are of the same RAM memory and disk size, since the shards balance on the basis of number of shards per node. The process would be: Add new, smaller nodes to the cluster Migrate the shards away from the nodes to be decommissioned Shut down the old nodes 3. Reduce disk size on nodes If you ONLY want to reduce disk size on the nodes without changing the cluster’s overall RAM or CPU, then you can reduce the disk size for each node. Reducing disk size on an Elasticsearch node is not a trivial process. The easiest way to do so would usually be to: Migrate shards from the node Stop the node Mount a new data volume to the node with appropriate size Copy all data from old disk volume to new volume Detach old volume A Start node and migrate shards back to node This requires that you have sufficient capacity on the other nodes to temporarily store the extra shards from the node during this process. In many cases, the cost of managing this process may exceed the potential savings in disk usage. For this reason, it may be simpler to replace the node altogether with a new node with the desired disk size (see “Replace existing nodes with smaller nodes” above). When paying for unnecessary resources, cost can obviously be reduced by optimizing your resource utilization. The relationship between disk size, RAM and CPU The ideal ratio of disk capacity to RAM in your cluster will depend on your particular use case. For this reason, when considering changes to your storage capacity, you also should consider whether your current Disk/RAM/CPU ratios are suitably balanced and whether as a consequence you also need to add/reduce RAM/CPU in the same proportion. RAM and CPU requirements depend on the volume of indexing activity, the number and type of queries, and also the amount of data that is being searched and aggregated. This is often in proportion to the amount of data being stored on the cluster, and therefore should also be related to disk size. The ratio between the disk capacity and the RAM can change based on the use case. See a few examples here: Index activity Retention Search activity Disk capacity RAM Enterprise search app Moderate log ingestion Long Light 2TB 32GB App monitoring Intensive log ingestion Short Light 1TB 32GB E-commerce Light data indexing Indefinite Heavy 500GB 32GB Remember that modifying the configuration of node machines must be done with care, since it may involve node downtime and you need to ensure that shards do not start to migrate to your other already over-stretched nodes. 
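For reference on the watermark thresholds described at the start of this article, here is a sketch of how they can be adjusted through the cluster settings API. The percentages shown are the usual defaults; raising them only postpones the problem, so freeing or adding disk is generally the better fix:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

# Low, high and flood-stage disk watermarks (defaults are roughly 85/90/95%).
es.cluster.put_settings(
    persistent={
        'cluster.routing.allocation.disk.watermark.low': '85%',
        'cluster.routing.allocation.disk.watermark.high': '90%',
        'cluster.routing.allocation.disk.watermark.flood_stage': '95%',
    }
)
```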
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 14, 2025 Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. KB By: Kofi Bartlett Jump to Background on watermarks How to prevent and handle cases when disk is too full (over utilization) How to add storage capacity to your Elasticsearch cluster How to prevent and handle cases when the disk capacity is underutilized How to reduce the storage volume on an Elasticsearch cluster Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to optimize Elasticsearch disk space and usage - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/optimize-elasticsearch-disk-space-and-usage", + "meta_description": "Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. Integrations EZ FB By: Enrico Zimuel and Florian Bernd On May 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In collaboration with the Microsoft Semantic Kernel team, we are announcing the availability of hybrid search capabilities in the .NET Elasticsearch Semantic Kernel connector – the first vector database to implement this capability. 
Microsoft Semantic Kernel recently announced support of hybrid search use cases, which opened the door for customers to use Elasticsearch for a broader set of applications. Elasticsearch has supported hybrid search since version 8.8.0, and in this article we will walk through how to use hybrid search with Elasticsearch and Semantic Kernel. You can find the latest version of the Elasticsearch Semantic Kernel connector with Hybrid Search support here . If you are not familiar with the Elasticsearch integration in Semantic Kernel for .NET, we suggest reading this article that we previously published. What is Hybrid Search? Hybrid Search is a powerful information retrieval strategy that combines two or more search techniques into a search algorithm. A typical use case is the combination of lexical search (i.e. BM25 ) combined with semantic search (i.e. kNN ). By running these two strategies in parallel customers can get the most significant results, which feeds into better answer quality overall (Figure 1). Figure 1: Hybrid search as the intersection between lexical and semantic search In order to combine the results we can use different strategies. Each result in Elasticsearch produces a list of relevant documents, ordered by a score value. A score is a floating point number that represents the relevance of a document. Higher numbers mean better relevance. If we have two lists of results, one coming from lexical and another from semantic how can we combine them? One strategy is to use the Reciprocal Rank Fusion (RRF) algorithm. This algorithm rearranges the score of each document using the following algorithm: Where: k is a ranking constant q is a query in the set of queries (e.g. lexical and semantic) d is a document in the result set of q result(q) is the result set of q rank(result(q), d) is the position (ranking) of document d in the results of query q For instance, imagine we run a Hybrid Search query to get the top-3 significant documents. We use lexical and semantic queries and we set k=1. The results for lexical query are (in order): Doc4 Doc3 Doc2 Doc1 That means the most relevant document is Doc4 followed by Doc3, Doc2 and Doc1. The results for semantic query are (in order): Doc3 Doc2 Doc1 Doc5 We can then calculate the RRF scores using the previous algorithm. In the following table, we calculated the scores for the lexical and semantic results, and then summed the two values to obtain the final RRF score. Documents Lexical Semantic RRF Doc1 1/(1+4) 1/(1+3) ⅕ + ¼ = 0.4500 Doc2 1/(1+3) 1/(1+2) ¼ + ⅓ = 0.5833 Doc3 1/(1+2) 1/(1+1) ⅓ + ½ = 0.8333 Doc4 1/(1+1) 0 ½ = 0.5 Doc5 0 1/(1+4) ⅕ = 0.2 Sorting the RRF scores gives us the following results: Doc3 Doc2 Doc4 Doc1 Doc5 Finally, the top-3 results are: Doc3, Doc2 and Doc4. The RRF algorithm is used by default with the hybrid search Elasticsearch integration for Semantic Kernel. The Hybrid Search integration in Semantic Kernel The latest version of the Elasticsearch Semantic Kernel connector implements the brand new IHybridSearch interface in the ElasticsearchVectorStoreRecordCollection type. This interface extends the existing functionality with a new method that looks like this: Where: vector is the TVector for the semantic search (using kNN); keywords contain a collection of strings to be used in the lexical search terms query of Elasticsearch (the terms in the collection are treated as OR conditions); top indicates the maximum number of documents to return; options options like e.g. 
the vector property/field to use for the vector search operation, the property/field to use for the lexical search operation, or an additional pre-filter specified in .NET expression tree syntax; cancellationToken the CancellationToken used to cancel the asynchronous operation; For instance, imagine we reuse the hotel dataset introduced in the previous article How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development . We can execute an Hybrid Search query to retrieve the top-5 hotels containing the keywords “downtown” or “luxury” combined with a semantic search using the vector {1, 2, 3}: If we want to apply a filter before executing the Hybrid Search, we can do that by using the HybridSearchOptions . For instance, imagine we want to consider only the hotels that are beachfront, we can add a filter using the expression Filter = x => x.Description.Contains(\"beachfront\") as follows: In this way, the search will consider only the beachfront hotels and then apply the previous Hybrid Search criteria (hint: expression tree-based filtering is also available for the regular vector search in Semantic Kernel). The support for expression tree-based filtering in recent versions of Semantic Kernel is a nice improvement over the previous filtering API. Right now, the Elasticsearch Semantic Kernel connector only supports comparison (=, !=, <, <=, >, >=) and boolean (!, &&, ||) operators. More operations like collection.Contains() will be implemented soon. Hybrid search for .NET apps, with Elasticsearch and Semantic Kernel In this article, we showed how to use Semantic Kernel’s Hybrid Search features with Elasticsearch integration. We illustrated how to combine lexical and semantic search to improve the retrieval results. This technique can be used for improving information retrieval systems, such as Retrieval-augmented generation (RAG). Moreover, we also looked at applying pre-filtering using the HybridSearchOptions object. The filtering condition can be expressed using the .NET expression tree syntax. While Reciprocal Rank Fusion provides a robust default for combining lexical and semantic scores in hybrid search—as we saw in this blog with Semantic Kernel, Elasticsearch also more broadly supports other retriever styles . This includes options like the Linear Retriever , providing simple customization of combination strategies beyond the RRF default, enabling users to fine-tune search relevance with hybrid approaches. In the future, we will continue to expand support for Semantic Kernel with the latest features within Elasticsearch. Happy (hybrid) searching! Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. 
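The reciprocal rank fusion walkthrough above can be reproduced in a few lines. This sketch simply recomputes the published example (k = 1, the same two ranked lists); it is not the connector's internal implementation:

```python
def rrf(rankings, k=1):
    # rankings: result lists, each ordered from most to least relevant document.
    scores = {}
    for result in rankings:
        for rank, doc in enumerate(result, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

lexical = ['Doc4', 'Doc3', 'Doc2', 'Doc1']
semantic = ['Doc3', 'Doc2', 'Doc1', 'Doc5']

for doc, score in rrf([lexical, semantic], k=1):
    print(doc, round(score, 4))
# Doc3 0.8333, Doc2 0.5833, Doc4 0.5, Doc1 0.45, Doc5 0.2 -> top-3: Doc3, Doc2, Doc4
```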
AG By: Ajay Krishnan Gopalan Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Jump to What is Hybrid Search? The Hybrid Search integration in Semantic Kernel Hybrid search for .NET apps, with Elasticsearch and Semantic Kernel Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "First to hybrid search: with Elasticsearch and Semantic Kernel - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/hybrid-search-support-elasticsearch-vector-database-semantic-kernel", + "meta_description": "Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Search Analytics Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Analytics January 10, 2025 Filtering in ES|QL using full text search 8.17 included match and qstr functions in ES|QL, that can be used to perform full text filtering. 8.18 removed limitations on their usage. This article describes what they do, how they can be used, the difference with the existing text filtering methods, current limitations and future improvements. CD By: Carlos Delgado Search Analytics How To June 10, 2024 Storage wins for time-series data in Elasticsearch Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency. MG KK By: Martijn Van Groningen and Kostas Krikellas Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Search Analytics - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/search-analytics", + "meta_description": "Search Analytics articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. How To KB By: Kofi Bartlett On May 14, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch is designed to be a distributed system that can handle a large amount of data and provide high availability. One of the key features that enable this is the concept of index replication, which is controlled by the number_of_replicas setting. This article will delve into the details of this setting, its implications, and how to properly configure it. The role of replicas in Elasticsearch In Elasticsearch, an index is a collection of documents that are partitioned across multiple primary shards. Each primary shard is a self-contained Apache Lucene index, and the documents within an index are distributed among all primary shards. To ensure high availability and data redundancy, Elasticsearch allows each shard to have one or more copies, known as replicas. The number_of_replicas setting controls the number of replica shards (copies) that Elasticsearch creates for each primary shard in an index. By default, Elasticsearch creates one replica for each primary shard, but this can be changed according to the requirements of your system. Configuring the number_of_replicas The number_of_replicas setting can be configured at the time of index creation or updated later. Here’s how you can set it during index creation: In this example, Elasticsearch will create two replicas for each primary shard in the my_index index. To update the number_of_replicas setting for an existing index, you can use the _settings API: This command will update the my_index index to have three replicas for each primary shard. Implications of the number_of_replicas setting The number_of_replicas setting has a significant impact on the performance and resilience of your Elasticsearch cluster . Here are some key points to consider: Data Redundancy and Availability: Increasing the number_of_replicas enhances the availability of your data by creating more copies of each shard. If a node fails, Elasticsearch can still serve data from the replica shards on the remaining nodes . Search Performance: Replica shards can serve read requests, so having more replicas can improve search performance by distributing the load across more shards. Write Performance: However, each write operation must be performed on every copy of a shard. Therefore, a higher number_of_replicas can slow down indexing performance as it increases the number of operations that must be performed for each write. Storage Requirements: More replicas mean more storage space. You should ensure that your cluster has enough capacity to store the additional replicas. 
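A minimal sketch of the two operations described above (creating an index with two replicas, then raising the count to three via the settings API), assuming the Python client and the my_index name used in the article:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')

# Two replicas per primary shard at creation time.
es.indices.create(
    index='my_index',
    settings={'index': {'number_of_replicas': 2}},
)

# Later, update the existing index to three replicas per primary shard.
es.indices.put_settings(
    index='my_index',
    settings={'index': {'number_of_replicas': 3}},
)
```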
Resilience to Node Failure: The number_of_replicas should be set considering the number of nodes in your cluster. If the number_of_replicas is equal to or greater than the number of nodes, your cluster can tolerate the failure of multiple nodes without data loss. Best practices for setting number_of_replicas The optimal number_of_replicas setting depends on the specific requirements of your system. However, here are some general best practices: For a single-node cluster, number_of_replicas should be set to 0, as there are no other nodes to hold replicas. For a multi-node cluster, number_of_replicas should be set to at least 1 to ensure data redundancy and high availability. If search performance is a priority, consider increasing the number_of_replicas . However, keep in mind the trade-off with write performance and storage requirements. Always ensure that your cluster has enough capacity to store the additional replicas. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to The role of replicas in Elasticsearch Configuring the number_of_replicas Implications of the number_of_replicas setting Best practices for setting number_of_replicas Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
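The index-creation and _settings snippets referenced in the article above were not captured in this text. Here is a minimal sketch of both operations, assuming the 8.x Elasticsearch Python client and the my_index example used in the article; the connection URL is a placeholder.

```python
from elasticsearch import Elasticsearch

# Assumes a locally reachable cluster; adjust the URL and auth for your deployment.
es = Elasticsearch("http://localhost:9200")

# Set number_of_replicas at index creation time: two replicas per primary shard.
es.indices.create(
    index="my_index",
    settings={"index": {"number_of_replicas": 2}},
)

# Update an existing index to three replicas per primary shard via the _settings API.
es.indices.put_settings(
    index="my_index",
    settings={"index": {"number_of_replicas": 3}},
)
```

The single-node best practice above is the same put_settings call with a value of 0.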
All Rights Reserved.", + "title": "Elasticsearch Index Number_of_Replicas - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-index-number-of_replicas", + "meta_description": "Explaining how to configure the number_of_replicas, its implications and best practices.\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Generative AI Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Generative AI How To March 17, 2025 How to optimize RAG retrieval in Elastisearch with DeepEval Learn how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval. KV By: Kritin Vongthongsri Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Generative AI How To March 11, 2025 Building a Multimodal RAG system with Elasticsearch: The story of Gotham City Learn how to build a Multimodal Retrieval-Augmented Generation (RAG) system that integrates text, audio, video, and image data to provide richer, contextualized information retrieval. AS By: Alex Salgado Generative AI Search Relevance +1 March 5, 2025 How to build autocomplete feature on search application automatically using LLM generated terms Learn how to enhance your search application with an automated autocomplete feature in Elastic Cloud using LLM-generated terms for smarter, more dynamic suggestions. MS By: Michael Supangkat Generative AI Integrations +1 February 26, 2025 Embeddings and reranking with Alibaba Cloud AI Service Using Alibaba Cloud AI Service features with Elastic. TM By: Tomás Murúa 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Generative AI - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/generative-ai", + "meta_description": "Generative AI articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Elastic Cloud Hosted Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Vector Database Generative AI +2 May 21, 2024 Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs. YG HM By: Yuvraj Gupta and Hemant Malik Vector Database Generative AI +1 May 21, 2024 Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here. SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta Vector Database Generative AI +1 April 25, 2024 Elastic Cloud adds Elasticsearch Vector Database optimized instance to Google Cloud Elasticsearch's vector search optimized profile for GCP is available. Learn more about it and how to use it in this blog. SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Elastic Cloud Hosted - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/elastic-cloud-hosted", + "meta_description": "Elastic Cloud Hosted articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. Search Relevance VB By: Vincent Bosc On April 11, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Sparse vectors are a key component in ELSER , but their usefulness extends far beyond that. In this post, we’ll explore how sparse vectors can enhance search relevance in an e-commerce setting: boosting documents based on search behavior (like clicks) and user preferences. What exactly are sparse vectors? Vector search is a hot topic right now, but most conversations focus on dense vectors: compact numerical representations used in machine learning and neural search. Sparse vectors, on the other hand, take a different path. Unlike dense vectors that pack data tightly, sparse vectors store information in a more interpretable and structured format, often with many zeros. Though less hyped, they can be incredibly powerful in the right context. 💡 Fun fact: Sparse vectors and inverted indexes both leverage sparsity to efficiently represent and retrieve information. In Elasticsearch, you can store sparse vectors using the sparse_vector field type : no surprises there. Querying with sparse vectors Searching with sparse vectors in Elasticsearch feels similar to traditional keyword search, but with a twist. Rather than matching terms directly, sparse vector queries use weighted terms and the dot product to score documents based on how well they align with the query vector. Use case 1: Signal boosting for better search ranking Signal boosting refers to emphasizing certain features or terms to improve search ranking. This is especially useful when business logic or user behavior suggests that some results should appear higher. Let’s say we’re working with a simple e-commerce index: Now, let’s index two documents only using traditional full text type: A basic search for “playstation” will return the controller first, not because it’s more relevant, but because BM25, the default lexical scoring algorithm, tends to favor shorter fields, causing the controller’s concise title to rank higher. But we want to boost the console result, especially since it has a special offer! One way to do this is by embedding boosting signals directly into the document via sparse vectors: This document now carries extra weight for the search queries “playstation” and “game console”. We can adjust our query to incorporate this sparse vector boost: Thanks to the added score from the sparse vector match, the console now ranks above the controller, which is exactly what we want! This approach offers an alternative to traditional boosting techniques, such as function_score queries or field-level weight tuning. By storing boosting information directly in the document using sparse vectors, you gain more flexibility and transparency in how relevance is adjusted. It also decouples business logic from query logic. 
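The mapping, indexing, and query snippets for the signal-boosting use case above were not captured in this text. The following is a hedged sketch of how it could look with the Python client; the products index, the boost_signals field, and the token names are illustrative assumptions, not taken from the original post.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust for your deployment

# Hypothetical e-commerce index: a text field plus a sparse_vector field for boost signals.
es.indices.create(
    index="products",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "boost_signals": {"type": "sparse_vector"},
        }
    },
)

# The console document carries extra weight for selected query terms.
es.index(
    index="products",
    id="console",
    document={
        "title": "PlayStation 5 game console - special offer",
        "boost_signals": {"playstation": 2.0, "game_console": 2.0},
    },
)

# must: lexical relevance; should: the sparse vector match adds to the BM25 score.
resp = es.search(
    index="products",
    query={
        "bool": {
            "must": {"match": {"title": "playstation"}},
            "should": {
                "sparse_vector": {
                    "field": "boost_signals",
                    "query_vector": {"playstation": 2.0},
                }
            },
        }
    },
)
```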
However, it’s worth noting the tradeoffs: traditional boosting can be simpler to implement for straightforward use cases and may have performance advantages in some scenarios. Sparse vectors shine when you need fine-grained, multi-dimensional control over boosting. Reminder : The must clause filters and contributes to scoring, while the should clause adds to the score if the condition matches. Use case 2: Personalization using sparse vectors Sparse vectors also enable personalization. You can assign weights to customer traits or personas and use them to surface the most relevant products for individual users. Here’s an example: Let’s say Jim is a customer who prefers healthy, sustainable options: We can tailor the search experience to reflect Jim’s preferences: As a result, the healthier snack bar floats to the top of the search results because that’s what Jim is more likely to buy. This method of personalization via sparse vectors builds on ideas like static segment tags, but makes them more dynamic and expressive. Instead of assigning a user to a single segment like \"tech-savvy\" or \"healthy-conscious\", sparse vectors allow you to represent multiple affinities with varying weights, all in a way that integrates directly into the search ranking process. Using a function_score query to incorporate user preferences is a flexible alternative for personalization, but it can become complex and difficult to maintain as logic grows. Another common approach, collaborative filtering , relies on external systems to compute user-item similarities and typically requires additional infrastructure. Learning to Rank (LTR) can also be applied to personalization , offering powerful ranking capabilities, but it demands a high level of maturity, both in terms of feature engineering and model training. Wrapping up Sparse vectors are a versatile addition to your search toolbox. We’ve covered just two practical examples: boosting search results and personalizing based on user profiles. But the possibilities are broad. By embedding structured, weighted information directly into your documents, you unlock smarter, more relevant search experiences with minimal complexity. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. 
IT By: Ioana Tagirta Jump to What exactly are sparse vectors? Querying with sparse vectors Use case 1: Signal boosting for better search ranking Use case 2: Personalization using sparse vectors Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Enhancing relevance with sparse vectors - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-sparse-vector-boosting-personalization", + "meta_description": "Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here. Vector Database Generative AI Elastic Cloud Hosted SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta On May 21, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elastic Cloud Vector Search optimized hardware profile is available for Elastic Cloud on Microsoft Azure users. This hardware profile is optimized for applications that use Elasticsearch as a vector database to store dense or sparse embeddings for search and Generative AI use cases powered by RAG (retrieval augmented generation). Vector Search optimized hardware profile: what you need to know Elastic Cloud users benefit from having Elastic managed infrastructure across all major cloud providers (Azure, GCP and AWS) along with wide region support for Microsoft Azure users. This release follows the previous announcement of a Vector Search optimized hardware profile for GCP . AWS users have had access to the Vector Search optimized profile since November 2023. For more specific details on the instance configuration for this Azure hardware profile, refer to our documentation for instance type: azure.es.datahot.lsv3 Vector Search, HNSW, and Memory Elasticsearch uses the Hierarchical Navigable Small World graph (HNSW) data structure to implement its Approximate Nearest Neighbor search (ANN). Because of its layered approach, HNSW's hierarchical aspect offers excellent query latency. To be most performant, HNSW requires the vectors to be cached in the node's memory. This caching is done automatically and uses the available RAM not taken up by the Elasticsearch JVM. 
Because of this, memory optimizations are important steps for scalability. Consult our vector search tuning guide to determine the right setup for your vector search embeddings and whether you have adequate memory for your deployment. With this in mind, the Vector Search optimized hardware profile is configured with a smaller than standard Elasticsearch JVM heap setting. This provides more RAM for caching vectors on a node, allowing users to provision fewer nodes for their vector search use cases. If you’re using compression techniques like scalar quantization , the memory requirement is lowered by a factor of 4 . To store quantized embeddings (available in versions Elasticsearch 8.12 and later) simply ensure that you’re storing in the correct element_type: byte . To utilize our automatic quantization of float vectors update your embeddings to use index type: int8_hnsw like in the following mapping example. In upcoming versions, Elasticsearch will provide this as the default mapping, removing the need for users to adjust their mapping. For further reading, we provide an evaluation of scalar quantization in Elasticsearch in this blog . Combining this optimized hardware profile with Elasticsearch’s automatic quantization are two examples where Elastic is focused on vector search and our vector database to be cost-effective while still being extremely performant. Getting started with Elastic Cloud Vector Search optimized hardware profile Start a free trial on Elastic Cloud and simply select the new Vector Search optimized profile to get started. Migrating existing Elastic Cloud deployments Migrating to this new Vector Search optimized hardware profile is a few clicks away. Simply navigate to your Elastic Cloud management UI, click to manage the specific deployment, and edit the hardware profile. In this example, we are migrating from a ‘Storage optimized’ profile to the new ‘Vector Search’ optimized profile. When choosing to do so, there is a small reduction to the available storage, but what is gained is the ability to store more vectors per memory with vector search at a lower cost. Migrating to a new hardware profile uses the grow and shrink approach for deployment changes. This approach adds new instances, migrates data from old instances to the new ones, and then shrinks the deployment by removing the old instances. This approach allows for high availability during configuration changes even for single availability zones. The following image shows a typical architecture for a deployment running in Elastic Cloud, where vector search will be the primary use case. This example deployment uses our new Vector Search optimized hardware profile, now available in Azure. This setup includes: Two data nodes in our hot tier with our vector search profile One Kibana node One Machine Learning node One integration server One master tiebreaker By deploying these two “full-sized” data nodes with the Vector Search optimized hardware profile and while taking advantage of Elastic’s automatic dense vector scalar quantization , you can index roughly 60 million vectors, including one replica (with 768 dimensions). Conclusion Vector search is a powerful tool when building modern search applications, be it for semantic document retrieval on its own or integrating with an LLM service provider in a RAG setup . Elasticsearch provides a full-featured vector database natively integrated with a full-featured search platform. 
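The mapping example mentioned above (index type int8_hnsw) is not included in this text. A minimal sketch with the Python client might look like the following, assuming a 768-dimension embedding field; the index and field names are placeholders.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust for your deployment

# Automatic scalar quantization of float vectors via int8_hnsw (Elasticsearch 8.12+),
# cutting the RAM needed to cache vectors by roughly a factor of 4.
es.indices.create(
    index="my_vector_index",
    mappings={
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "int8_hnsw"},
            }
        }
    },
)
```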
Along with improving vector search feature set and usability, Elastic continues to improve scalability. The vector search node type is the latest example, allowing users to scale their search application. Elastic is committed to providing scalable, price effective infrastructure to support enterprise grade search experiences. Customers can depend on us for reliable and easy to maintain infrastructure and cost levers like vector compression, so you benefit from the lowest possible total cost of ownership for building search experiences powered by AI. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Vector Search optimized hardware profile: what you need to know Vector Search, HNSW, and Memory Getting started with Elastic Cloud Vector Search optimized hardware profile Migrating existing Elastic Cloud deployments Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-vector-profile-azure", + "meta_description": "Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. Search Relevance Vector Database How To PS BT By: Peter Straßer and Benjamin Trent On March 18, 2025 Part of Series The ColPali model series Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. When building search applications, we often need to deal with documents that have complex structures—tables, figures, multiple columns, and more. Traditionally, this meant setting up complicated retrieval pipelines including OCR (optical character recognition), layout detection, semantic chunking, and other processing steps. In 2024, the model ColPali was introduced to address these challenges and simplify the process. Source: https://huggingface.co/blog/manu/colpali From Elasticsearch version 8.18 onwards, we added support for late-interaction models such as ColPali as a tech preview feature. In this blog, we will take a look at how we can use ColPali to search through documents in Elasticsearch. Does this work? While we have many benchmarks that are based on previously cleaned up text data to compare different retrieval strategies, the authors of the ColPali paper argue that real-world data in many organizations is messy and not always available in a nice, cleaned-up format. Example documents from the ColPali paper: https://arxiv.org/pdf/2407.01449 To better represent these scenarios, the ViDoRe benchmark was released alongside the ColPali model. This benchmark includes a diverse set of document images from sectors such as government, healthcare, research, and more. A range of different retrieval methods, including complex retrieval pipelines or image embedding models, were compared with this new model. The following table shows that ColPali performs exceptionally well on this dataset and is able to retrieve relevant information from these messy documents reliably. Source: https://arxiv.org/pdf/2407.01449 Table 2 How does it work? As teased in the beginning, the idea of ColPali is to just embed the image instead of extracting the text via complicated pipelines. ColPali builds on the vision capabilities of the PaliGemma model and the late-interaction mechanism introduced by ColBERT. Source: https://arxiv.org/pdf/2407.01449 Figure 1 Let’s first take a look at how we index our documents. Instead of converting the document into a textual format, ColPali processes documents by dividing a screenshot into small rectangles and converts each into a 128-dimensional vector. This vector represents the contextual meaning of this patch within the document. In practice, a 32x32 grid generates 1024 vectors per document. For our query, the ColPali model creates a vector for each token. To score documents during search, we calculate the distance between each query vector and each document vector. We keep only the highest score per query vector and sum those scores for a final document score. Late interaction mechanism for scoring ColBERT Interpretability Vector search with bi-encoders struggle with the fact that the results are sometimes not very interpretable—meaning we don’t know why a document matched. 
Late interaction models are different: we know how well each document vector matches our query vectors, therefore we can determine where and why a document matches. A heatmap of where the word “hour” matches in this document. Source: https://arxiv.org/pdf/2407.01449 Searching with ColPali in Elasticsearch We will be taking a subset of the ViDoRe test set to take a look at how to index documents with ColPali in Elasticsearch. The full code examples can be found on GitHub . To index the document vectors, we will be defining a mapping with the new rank_vectors field. We now have an index ready to be searched full of ColPali vectors. To score our documents, we can use the new maxSimDotProduct function. Conclusion ColPali is a powerful new model that can be used to search complex documents with high accuracy. Elasticsearch makes it easy to use as it provides a fast and scalable search solution. Since the initial release, other powerful iterations such as ColQwen have been released. We encourage you to try these models for your own search applications and see how they can improve your results. Before implementing what we covered here in production environments, we highly recommend that you check out part 2 of this article. Part 2 explores advanced techniques, such as bit vectors and token pooling, which can optimize resource utilization and enable effective scaling of this solution. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to Does this work? How does it work? Interpretability Searching with ColPali in Elasticsearch Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
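The rank_vectors mapping and maxSimDotProduct query mentioned above are not reproduced in this text. Here is a hedged sketch of what they could look like with the Python client, using placeholder index and field names and dummy query vectors.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust for your deployment

# Hypothetical mapping using the rank_vectors field (tech preview in 8.18)
# to hold the per-patch ColPali vectors for each page image.
es.indices.create(
    index="colpali_pages",
    mappings={"properties": {"page_vectors": {"type": "rank_vectors"}}},
)

# Late-interaction MaxSim scoring: for each query-token vector, keep its best match
# against the document vectors, then sum those maxima into the document score.
query_vectors = [[0.1] * 128, [0.2] * 128]  # placeholder ColPali query embeddings
resp = es.search(
    index="colpali_pages",
    query={
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": "maxSimDotProduct(params.query_vector, 'page_vectors')",
                "params": {"query_vector": query_vectors},
            },
        }
    },
)
```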
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Searching complex documents with ColPali - part 1 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastiacsearch-colpali-document-search", + "meta_description": "Learn about ColPali and explore how to use it to search through complex documents in Elasticsearch, including tables, figures, multiple columns & more." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. Search Relevance Vector Database How To PS BT By: Peter Straßer and Benjamin Trent On March 20, 2025 Part of Series The ColPali model series Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our previous blog on ColPali , we explored how to create visual search applications with Elasticsearch. We primarily focused on the value that models such as ColPali bring to our applications, but they come with performance drawbacks compared to vector search with bi-encoders such as E5. Building on the examples from part 1 , this blog explores how to use different techniques and Elasticsearch's powerful vector search toolkit in order to make late interaction vectors ready for large-scale production workloads. The full code examples can be found on GitHub. Problem ColPali creates over 1000 vectors per page for the documents in our index. This results in two challenges when working with late interaction vectors: Disk space: Saving all these vectors on disks will incur a serious amount of storage usage, which will be expensive at scale. Computation: When ranking our documents with the maxSimDotProduct() comparison, we need to compare all of these vectors for each of our documents with the N vectors of our query. Let’s look at some techniques on how to address these issues. Bit vectors In order to reduce disk space, we can compress the images into bit vectors. We can use a simple Python function to transform our multi-vectors into bit vectors: The function's core concept is straightforward: values above 0 become 1, and values below 0 become 0. This results in an array of 0s and 1s, which we then transform into a hexadecimal string representing our bit vector. For our index mapping, we set the element_type parameter to bit : After having written all of our new bit vectors to our index, we can rank our bit vectors using the following code: Trading off a bit of accuracy, this allows us to use hamming distance ( maxSimInvHamming(...) ), which is able to leverage optimizations such as bit-masks, SIMD, etc. Learn more about bit vectors and hamming distance in our blog . Alternatively, we can not convert our query vector to bit vectors and search with the full-fidelity late interaction vector: This will compare our vectors using an asymmetric similarity function. Let’s think about a regular hamming distance between two bit vectors. Suppose we have a document vector D: And a query vector Q: Simple binary quantization will transform D into 10101101 and Q into 11111011 . 
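A minimal sketch of the bit-vector conversion function described above, assuming NumPy and a (num_vectors, dims) float array per page; each vector is packed into bytes and hex-encoded, matching the hexadecimal string format the element_type: bit mapping expects.

```python
import numpy as np

def to_bit_vectors(vectors: np.ndarray) -> list[str]:
    """Quantize ColPali multi-vectors to bit vectors, hex-encoded per vector.

    Values > 0 map to 1, the rest to 0; each row of bits is packed into bytes
    and returned as a hexadecimal string.
    """
    bits = np.packbits(vectors > 0, axis=-1)  # shape: (num_vectors, dims / 8), uint8
    return [row.tobytes().hex() for row in bits]

# Example: 1030 vectors of 128 dims become 1030 hex strings of 32 characters each.
doc_vectors = np.random.randn(1030, 128).astype(np.float32)
bit_vectors = to_bit_vectors(doc_vectors)
```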
For hamming distance, we need direct bit math—it's extremely fast. In this case, the hamming distance is 01010110, which is 86. So, scoring then becomes the inverse of that hamming distance. Remember, more similar vectors have a SMALLER hamming distance, so inverting it allows for more similar vectors to be scored higher. Specifically here, the score would be 0.012. However, note how we lose the magnitude of each dimension. A 1 is a 1. So, for Q, the difference between 0.01 and 0.79 disappears. Since we are simply quantizing according to >0, we can do a small trick where the Q vector isn’t quantized. This doesn’t allow for the extremely fast bitwise math, but it does keep the storage cost low as D is still quantized. In short, this retains the information provided in Q, thus increasing the distance estimation quality and keeping the storage low. Using bit vectors allows us to save significantly on disk space and computational load at query time. But there is more that we can do. Average vectors To scale our search across hundreds of thousands of documents, even the performance benefits that bit vectors give us will not be enough. In order to scale to these types of workloads, we will want to leverage Elasticsearch’s HNSW index structure for vector search. ColPali generates around a thousand vectors per document, which is too many to add to our HNSW graph. Therefore, we need to reduce the number of vectors. To do this, we can create a single representation of the document's meaning by taking the average of all the document vectors produced by ColPali when we embed our image. We simply take the average vector over all late interaction vectors. As of now, this is not possible within Elasticsearch itself and we will need to preprocess the vectors before ingesting them in Elasticsearch. We can do this with Logstash or Ingest pipelines, but here we will use a simple Python function: We are also normalizing the vector so that we can use the dot product similarity. After transforming all of our ColPali vectors to average vectors, we can index them into our dense_vector field: We have to consider that this will increase total disk usage since we are saving more information along with our late interaction vectors. Additionally, we will use extra RAM to hold the HNSW graph, allowing us to scale the search over billions of vectors. To reduce the usage of RAM, we can make use of our popular BBQ feature. In turn, we get fast search results over massive data sets that would otherwise not be possible. Now, we simply search with the knn query to find our most relevant documents. The previous best match has unfortunately fallen to rank 3. To fix this problem, we can do a multi-stage retrieval. In our first stage, we are using the knn query to search the best candidates for our query over millions of documents. In the second stage, we are only reranking the top k (here: 10) with the higher fidelity of the ColPali late interaction vectors. Here, we are using the rescore retriever introduced in 8.18 to rerank our results. After rescoring, we see that our best match is again in first position. Note: In a production application we can use a much higher k than 10 as the max sim function is still comparatively performant. Token pooling Token pooling reduces the sequence length of multi-vector embeddings by pooling redundant information, such as white background patches. 
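A minimal sketch of the average-vector preprocessing described in the Average vectors passage above, assuming NumPy; the mean over all late-interaction vectors is L2-normalized so the dense_vector field can use dot_product similarity.

```python
import numpy as np

def to_avg_vector(vectors: np.ndarray) -> list[float]:
    """Collapse a page's ColPali multi-vectors into one averaged, unit-length vector."""
    avg = np.asarray(vectors, dtype=np.float32).mean(axis=0)
    return (avg / np.linalg.norm(avg)).tolist()

# Example: 1030 x 128 ColPali vectors become a single 128-dim document vector.
page_vectors = np.random.randn(1030, 128).astype(np.float32)
avg_vector = to_avg_vector(page_vectors)
```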
We are clustering semantically similar vectors to achieve fewer vectors overall. Token pooling works by grouping similar token embeddings within a document into clusters using a clustering algorithm. Then, the mean of the vectors in each cluster is calculated to create a single, aggregated representation. This aggregated vector replaces the original tokens in the group, reducing the total number of vectors without significant loss of document signal. The ColPali paper proposes an initial pool factor value of 3 for most datasets, which maintains 97.8% of the original performance while reducing the total number of vectors by 66.7%. Source: https://arxiv.org/pdf/2407.01449 But we need to be careful: The \"Shift\" dataset, which contains very dense, text-heavy documents with little white space, declines rapidly in performance as pool factors increase. To create the pooled vectors, we can use the colpali_engine library: We now have a multi-vector representation with about 66.7% fewer vectors. We index it as usual and we are able to search on it with our maxSimDotProduct() function. We are able to get good search results at the expense of some slight accuracy in results. Hint: With a higher pool_factor (100-200), you can also have a middle ground between the average vector solution and the one we discussed here. With around 5-10 vectors per document, it becomes viable to index them in a nested field to leverage the HNSW index. Cross-encoder vs. late-interaction vs. bi-encoder With what we have learned so far, where does this place late interaction models such as ColPali or ColBERT when we compare them to other AI retrieval techniques? While the max sim function is cheaper compared to cross-encoders, it still requires many more comparisons and computation than vector search with bi-encoders, where we are just comparing two vectors for each query-document pair. Because of this, our recommendation for late-interaction models is to generally only use them for reranking the top k search results. We also capture this in the name of the field type: rank_vectors. But what about the cross-encoder? Are late interaction models better because they are cheaper to execute at query time? As is often the case, the answer is: it depends. Cross-encoders generally produce higher quality results, but they require a lot of compute because the query-document pairs need to do a full pass through the transformer model. They also benefit from the fact that they do not require any indexing of vectors and can operate in a stateless manner. This results in: Less disk space used A simpler system Higher quality of search results Higher latency and therefore not being able to rerank as deeply On the other hand, late interaction models can offload some of this computation to index time, making the query cheaper. The price we pay is having to index the vectors, which makes our indexing pipelines more complex and also requires more disk space to save these vectors. Specifically in the case of ColPali, the analysis of information from images is very expensive as they contain a lot of data. In this case, the tradeoff shifts in favor of using a late interaction model such as ColPali because evaluating this information at query time would be too resource intensive/slow. For a late interaction model such as ColBERT, which works on text data like most cross-encoders (e.g., elastic-rerank-v1), the decision might lean more toward using the cross-encoder to benefit from the disk savings and simplicity. 
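Returning to the token pooling step described earlier in this article: the colpali_engine snippet is not reproduced in this text, so the sketch below only illustrates the underlying idea (cluster similar token vectors, then replace each cluster with its mean) using scikit-learn rather than that library's actual API.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def pool_vectors(vectors: np.ndarray, pool_factor: int = 3) -> np.ndarray:
    """Illustrative token pooling: cluster similar token vectors and average each cluster.

    With pool_factor=3 the vector count drops by roughly two thirds, mirroring the
    reduction discussed above; this is a didactic stand-in, not the colpali_engine call.
    """
    n_clusters = max(1, len(vectors) // pool_factor)
    labels = AgglomerativeClustering(
        n_clusters=n_clusters, metric="cosine", linkage="average"
    ).fit_predict(vectors)
    return np.stack([vectors[labels == c].mean(axis=0) for c in range(n_clusters)])

pooled = pool_vectors(np.random.randn(1030, 128).astype(np.float32), pool_factor=3)
```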
We encourage you to weigh those pros and cons for your use-case and experiment with the different tools that Elasticsearch provides you to build the best search applications. Conclusion In this blog, we explored various techniques to optimize late interaction models like ColPali for large-scale vector search in Elasticsearch. While late interaction models provide a strong balance between retrieval efficiency and ranking quality, they also introduce challenges related to storage and computation. To address these challenges, we looked at: Bit vectors to significantly reduce disk space while leveraging efficient similarity computations like hamming distance or asymmetric max similarity. Average vectors to compress multiple embeddings into a single dense representation, enabling efficient retrieval with HNSW indexing. Token pooling to intelligently merge redundant embeddings while maintaining semantic integrity, reducing computational overhead at query time. Elasticsearch provides a powerful toolkit to customize and optimize search applications based on your needs. Whether you prioritize retrieval speed, ranking quality, or storage efficiency, these tools and techniques allow you to balance performance and quality as you need for your real-world applications. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to Problem Bit vectors Average vectors Token pooling Coss-encoder vs. late-interaction vs. bi-encoder Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. 
Elasticsearch B.V. All Rights Reserved.", + "title": "Scaling late interaction models in Elasticsearch - part 2 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/scale-late-interaction-model-colpali", + "meta_description": " Explore techniques to scale late interaction models like ColPali for large-scale vector search in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. Inside Elastic AJ By: Andy James On November 8, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out: Part 1: Building our proof of concept Part 2: Building a knowledge library Part 3: Designing a chat interface for chatbots... for humans Part 4: Tuning RAG search for relevance Launch blog: GenAI for customer support - Explore the Elastic Support Assistant What I find compelling about observability is that it has utility in both good times and bad. When everything is going great, your observability is what provides you the metrics to show off the impact your work is having. When your system is having a rough day, your observability is what will help you find the root cause and stabilize things as quickly as possible. It's how we noticed a bug causing us to load the same data over and over again from our server. We saw in the APM data that the throughput for one of our endpoints was well over 100 transactions per minute, which was unreasonably large for the size of our user base. We could confirm the fix when we saw the throughput reduce to a much more reasonable 1 TPM. It's also how I know we served up our 100th chat completion 21 hours post launch (you love to see it folks). This post will discuss the observability needs for a successful launch, and then some unique observability considerations for a chatbot use case such as our Support Assistant. Critical GenAI observability components You're going to want three main pieces in place. A status dashboard, alerting, and a milestones dashboard, in that order. We’ll dig more into what that means, and what I put into place for the Support Assistant launch as it pertains to each. There is one requirement that all three of those components require: data. So before we can dive into how to crunch the data for actionable insights, let’s take a look at how we collect that data for the Support Assistant (and generally for the Support Portal). Observability data collection We have an Elastic Cloud cluster dedicated to monitoring purposes. This is where all of the observability data I am going to discuss gets stored and analyzed. It is separate from our production and staging data Elastic clusters which are where we manage the application data (e.g. knowledge articles, crawled documentation). We run Elastic’s Node APM client within our Node application that serves the API, and have Filebeat running to capture logs. 
We have a wrapper function for our console.log and console.error calls that appends APM trace information at the end of each message, which allows Elastic APM to correlate logs data to transaction data. Additional details about this feature are available on the Logs page for APM . The key piece of information you'll find there is that apm.currentTraceIds exists to provide exactly what you need. From there it's nothing complicated, just a pinch of string formatting. Copy ours. A small gift; from my team, to yours. We use the Elastic Synthetics Monitoring feature to check on the liveliness of our application and critical upstream services (e.g. Salesforce, our data clusters). At the moment we use the HTTP monitor type, but we're looking at how we might want to use Journey monitors in the future. The beauty of the basic HTTP monitor is that all you need to configure is a URL to ping, how often you want to ping it, and from where. When choosing which locations to check from, we know for the app itself we want to check from locations around the world, and because there are some calls directly from our user's browsers to the data clusters, we also check that from all of the available locations. However, for our Salesforce dependency, we know we only connect to that from our servers, so we only monitor that from locations where the Support Portal app is being hosted. We also ship Stack Monitoring data from the application data Elastic clusters, and have the Azure OpenAI integration shipping logs and metrics from that service via an Elastic Agent running on a GCP Virtual Machine. Setting up Elastic APM Getting started with Elastic APM is really easy. Let's go over the APM configuration for our Support Portal's API service as an example. Let's unpack a few things going on there. The first is that we've allowed ourselves to inject a mock APM instance in testing scenarios, and also added a layer of protection to prevent the start function from being called more than once. Next, you'll see that we are using environment variables to power most of our configuration options. APM will automatically read the ELASTIC_APM_ENVIRONMENT environment variable to fill in the environment setting, ELASTIC_APM_SERVER_URL for the serverUrl setting, and ELASTIC_APM_SECRET_TOKEN for the secretToken setting. You can read the full list of configuration options here , which includes the names of the environment variables that can be used to configure many of the options. I want to emphasize the value of setting environment . It allows me to easily distinguish traffic from different environments. Even if you aren't running a staging environment (which you really should), collecting APM when you're developing locally can also come in handy, and you will want to be able to look at production and development data in isolation most of the time. Being able to filter by service.environment is convenient. If you're running in Elastic Cloud, you can follow these steps to get the values for serverUrl and secretToken to use with your configuration. Visit your Kibana instance, and then navigate to the Integrations page. Find the APM integration. Scroll past the APM Server section to find the APM Agents section and you'll see a Configure the agent subsection that includes the connection info. Status dashboard Data is only as useful as your ability to extract meaning from it, and that’s where dashboarding comes in. 
With Elastic Cloud, Kibana runs by default alongside Elasticsearch, so we’ve already got a great visualization layer available within our stack. So what do we want to see? Usage, latency, errors, and capacity are pretty common categories of data, but even within those, your specific needs will dictate what specific visualizations you want to make for your dashboard. Let’s go over the status dashboard I made for the Support Assistant launch to use as an example. You might be surprised to notice the prime real estate in the upper-left being host to text. Kibana has a markdown visualization you can use to add instructions, or in my case a bunch of convenient links to other places where we might want to follow up on something seen in the dashboard. The rest of the top row displays some summary stats like the total number of chat completions, unique users, and errors for the time range of the dashboard. The next set of visualizations are time series charts to examine latency and usage over time. For our Support Assistant use case, we are specifically looking at latency of our RAG searches and our chat completions. For usage, I’m interested in the number of chat completions, unique users, returning users, and a comparison of assistant users to all Support Portal users. Those last two I've left below the fold of the image because they include details we decided not to share. I like to save a default time range with dashboards. It anchors other users to a default view that should be generally useful to see when they first load the dashboard. I pinned the start timestamp to approximately when the release went live, and the end is pinned to now. During the launch window, it's great to see the entire life of the feature. At some point it will probably make more sense to update that stored time to be a recent window like “last 30 days.” Bonus challenge: Can you tell when we upgraded our model from GPT-4 to the more powerful GPT-4o? I have additional areas of the status dashboard focused on users who are using the assistant the most or experiencing the most errors, and then also some time series views of HTTP status and errors over time. Your status dashboard will be different, and it should be. This type of dashboard also has the tendency to evolve over time (mine did noticeably during the time I was drafting this post). Its purpose is to be the answer key to the series of questions that are most important to be able to answer about the feature or system you’re observing. You will discover new questions that are important, and that might add some new visualizations to the dashboard. Sometimes a question becomes less relevant or you come to understand it was less meaningful than you expected, and so you could remove or rearrange it below other items. Before we move on from this dashboard, let's take a couple of detours to take a look at an APM trace for our chat completions, and then how I used ES|QL to create that returning users visualization. APM traces If you've never seen an Elastic APM trace, there is probably a ton of really compelling things going on in that image. The header shows request URL, response status, duration, which browser was used. Then when we get into the waterfall chart we can see the breakdown of which services were involved and some custom spans. APM understands that this trace traveled through our frontend server (green spans), and our API service (blue spans). Custom spans are a great way to monitor performance of specific tasks. 
In this case where we are streaming chat completions, I want to know how long until the first tokens of generation arrive, and also how long the entire completion process takes. The average duration of these spans is charted on the dashboard. Here's a trimmed down snippet of the chat completion endpoint that focuses on starting and ending the custom spans. Using ES|QL to visualize returning users When I first started trying to visualize repeat users, my original goal was to end up with something like a stacked bar chart per day where the total size of the bar should be the number of unique users that day, and the breakdown would be net new users vs. returning users. The challenge here is that to compute this requires overlapping windows, and that's not compatible with how histograms work in Kibana visualizations. A colleague mentioned that ES|QL might have some tools to help. While I didn't end up with the visualization I originally described, I was able to use it to help me process a dataset where I could generate the unique combinations of user email and request date, which then enabled counting how many unique days each user had visited. From there, I could visualize the distribution of quantity of visits. Here's the ES|QL query that powers my chart. Alerting With the status dashboard in place, you have a way to quickly understand the state of the system both at the present and over time. The metrics being displayed in the visualizations are inherently metrics that you care about, but you can’t nor would you want to be glued to your dashboard all day (well maybe the excitement of the first couple days after launch left me glued to my dashboard, but it’s definitely not a sustainable strategy). So let’s talk about how alerting can untether us from the dashboard, while letting us sleep well, knowing that if something starts going wrong, we’ll get notified instead of finding out the next time we chase the sweet sensation of staring at that beautiful dashboard. A very convenient thing about Elastic Observability is that the details you need to know to make the alerting rules, you already figured out in making the visualizations for the dashboard. Any filters you were applying, and the specific fields from the specific indices that you visualized are the main configuration details you need to configure alerting rules. You’re essentially taking that metric defined by the visualization and adding a threshold to decide when to trigger the alert. How should I pick a threshold? For some alerts it might be about trying to achieve a certain quality of service that is defined by the team. In a lot of cases, you want to use the visualizations to establish some sort of expected baseline, so that you can then choose a threshold based on how much of a deviation from that observed baseline you’re willing to tolerate. This is a good time to mention that you might be planning to hold off integrating APM until the end of the development process, but I would encourage you to do it sooner. For starters, it’s not a big lift (as I showed you above). The big bonus for doing it early is that during development you are capturing APM information. It might help you debug something during development by capturing details you can investigate during an expected error, and then it’s also capturing sample data. That can be useful for both verifying your visualizations (for metrics involving counts), and then also for establishing baseline values for metric categories like latency. How should I get alerted?
That will really depend on the urgency of the alert. For that matter, there are some alerts where you may want to configure multiple alerts at different thresholds. For example, at a warning level, you might just send an email, but then there could also be a critical level that sends a Slack message tagging your team. An example of a non-critical alert that is best as email-only are the ones I configured to go along with the milestones dashboard we’ll talk about next. It’s a good idea to test the formatting of your alert outputs by temporarily configuring it such that it will trigger right away. A best practice for determining which alerts notify in passive ways (e.g. an email) vs. demanding immediate attention (e.g. getting paged) is to ask yourself \"is there a defined set of steps to take in response to this alert to resolve it?\" If there is not a well-defined path to take to investigate or resolve an alert, then paging someone isn't going to add much value, and instead just add noise. It can be hard to stick to this, and if you've just realized that you've got a bunch of unactionable alerts being noisy, maybe see if you can think of a way to surface those in a less demanding way. What you don't want, is to accidentally train your team to ignore alerts because they are so often inactionable. Milestones dashboard The milestones dashboard arguably does not need to be separate from the status dashboard, and could be arranged as an area of the status dashboard, but I like having the separate space focused on highlighting achievements. The two metrics I was most interested in highlighting with milestones were unique users, and chat completions. There is a horizontal bullet visualization that I found suitable for displaying a gauge with a set range and an optional goal. I decided that time windows for all time, last 7 days, and last 30 days were standard but interesting to look at and so I have two columns side by side where each row is a different window of time. The bottom row has a bar chart aggregating by day, creating a nice way to look for growth over time. Special considerations for the Support Assistant & GenAI Observability We’ve discussed the basics of observing any new feature or system you’re launching, but every project is going to have some unique observability opportunities, so the rest of this blog post will be discussing some of the ones that came up for our team while working on the Support Assistant. If you’re also building a chatbot experience, some of these might apply directly for your use case, but even if your project is very different, maybe these ideas inspire some additional observability options and strategies. NOTE: Most of the code examples I am about to share come from the chat completion request handler in our API layer where we send a request to the LLM and stream the response back to the client. I am going to show you that same handler a few different times, but editted down to only include the lines relevant to the functionality being described at that time. First generation timeouts You may remember from the UX entry in this series that we chose to use streaming responses from the LLM in order to avoid having to wait for the LLM generation to finish before being able to show anything to the user. The other thing we did to try to give our assistant a more responsive experience was to enforce a 10 second timeout on getting the first chunk of generated text back. 
Being able to see trends in this type of error is critical for us to be able to know if our service is reliable, or overloaded. We've noticed with the launch that these timeouts are more likely to happen when there are more simultaneous users. Sometimes this even leads to retries overloading the provisioned capacity on our LLM service, leading to further errors displayed to the user. The APM agent runs on our server, and the timeout for first generation was configured in the client code that runs in the user’s browser, so I started experimenting with listening for events on the server to detect when the client had sent the abort signal so that I could send an error to APM with captureError , but what I found was that the server never became aware that the client aborted the request. I listened on the request, I listened on the socket, and then I did some Internet searches, and reached the conclusion that at least for our application stack there was no practical or built-in way for our server to recognize the client had timed out. To work around this, I moved the timeout and AbortController from the client code to be in our API layer that was talking directly to the LLM. Now when we hit the timeout, we’re on the server where we can send the error to APM and then close the connection early from the server side, which propagates just fine down to the client. Here's a view of our request handler that shows just the parts related to first generation timeout: Unfortunately, just closing the connection from the server created an unexpected behavior with the client. Without sending back a proper error signal or any generated response text, the client code was not running the parts of the code where we exited the loading state. To smooth this out, I updated the server side timeout to add an extra step before calling end() on the response. The streaming responses work by sending a series of events related to the generation down to the client. There are 4 flavors; Started, Generation, End, and Error. By adding an extra step to send an Error event before closing the connection, the client code was able to update the UI state to reflect an error. So let's see the handler again with that included: The first generation timeout error is a very generic error, and always logs the same message. For the other types of errors, there are many different failures that could result in reaching the error handler. For this, we pass in a parameterized message object , so that APM will group all of the errors captured by the same error handling together, despite the error message varying depending on the actual error that occurred. We have parameters for the error message, error code, and also which LLM we used. Declining requests The goal of the Support Assistant is to be helpful, but there are two broad categories of input that we want to avoid engaging with. The first is questions unrelated to getting technical support for Elastic products. We think it’s pretty fair to insist that since we pay the bills for the LLM service, that we don’t want folks using the Support Assistant to draft emails or write song lyrics. The second broad category we avoid are topics we know it cannot answer well. The prime example of this is billing questions. We know the Support Assistant does not have access to the data needed to help answer billing questions accurately, and certainly for a topic like billing, an inaccurate answer is worse than none at all (and the Sales team, finance team, and lawyers all breathed a sigh of relief 😉). 
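The request handler shown in the original post is an image, so here is a hedged sketch of the shape being described: a server-side first-chunk timeout driven by an AbortController, an Error event written before closing the stream, and captureError with a parameterized message object. All names (llmClient, formatEvent, the message template) are illustrative stand-ins, not the Support Portal's actual code:

```typescript
import apm from 'elastic-apm-node';
import type { Request, Response } from 'express';

const FIRST_CHUNK_TIMEOUT_MS = 10_000;

// Stand-ins for the real LLM client and stream event serializer (assumptions).
declare const llmClient: {
  streamCompletion(body: unknown, opts: { signal: AbortSignal }): Promise<AsyncIterable<string>>;
};
const formatEvent = (type: 'Started' | 'Generation' | 'End' | 'Error', data?: string): string =>
  JSON.stringify({ type, data }) + '\n';

export async function chatCompletionHandler(req: Request, res: Response): Promise<void> {
  const abort = new AbortController();
  // Server-side timeout: if the first chunk has not arrived in time, abort the
  // upstream LLM request so the error can be captured and the stream closed here.
  const timeout = setTimeout(() => abort.abort(), FIRST_CHUNK_TIMEOUT_MS);

  try {
    const stream = await llmClient.streamCompletion(req.body, { signal: abort.signal });
    res.write(formatEvent('Started'));
    for await (const chunk of stream) {
      clearTimeout(timeout); // the first chunk made it, stop the timer
      res.write(formatEvent('Generation', chunk));
    }
    res.write(formatEvent('End'));
  } catch (err) {
    if (abort.signal.aborted) {
      // The generic first-generation timeout error, always the same message.
      apm.captureError('Timed out waiting for the first generated chunk');
    } else {
      // Parameterized message object so APM groups these errors together even
      // though the message, code, and model vary with the actual failure.
      apm.captureError({
        message: 'Chat completion failed: %s (code: %s, llm: %s)',
        params: [String(err), (err as { code?: string }).code ?? 'unknown', 'azure-gpt-4o'],
      });
    }
    // Write an Error event first so the client leaves its loading state, then
    // close the connection from the server side; that propagates to the browser.
    res.write(formatEvent('Error'));
  } finally {
    clearTimeout(timeout);
    res.end();
  }
}
```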
Our approach was to add instructions to the prompt before the user's input as opposed to using a separate call to a 3rd party service. As our hardening needs evolve we may consider adding a service, or at least splitting the task of deciding whether or not to attempt to respond into a separate LLM request dedicated to making that determination. Standardized response I’m not going to share a lot of details about our prompt hardening methods and what rules we put in the prompt because this blog is about observability, and I also feel that the state of prompt engineering is not at a place where you can share your prompt without helping a malicious user get around it. That said, I do want to talk about something I noticed while I was developing our prompting strategy to avoid the two categories mentioned above. I was having some success with getting it to politely decline to answer certain questions, but it wasn’t very consistent with how it replied. And the quality of the response varied. To help with this, I started including a standardized response to use for declining requests as part of the prompt. With a predefined response in hand, the chatbot reliably used the standard response when declining a request. The predefined response is stored as its own variable that is then used when building the payload to send to the LLM. Let's take a look at why that comes in handy. Monitoring declined requests Bringing this back to observability, by having a predefined response for declining requests, it created an opportunity for me to examine the response coming from the LLM, and compare it to the variable containing the standardized decline message. When I see a match, I use captureError to keep a record of it. It’s important for us to keep an eye on declined requests because we want to be sure that these rejections are happening for the right reasons. A spike in rejections could indicate that a user or group of users is trying to get around our restrictions to keep the chat on the topic of Elastic product technical support. The strategy shown above collects all the tokens in a string[] and then joins then when the response is complete to make the comparison. I heard a great optimization suggestion from a colleague. Instead of collecting the tokens during streaming, just track an index into the DECLINED_REQUEST_MESSAGE , and then as each token comes in, see if it matches the next expected characters of the message. If so, keep tracking, but if there ever isn't a match, you know it's not a declined request. That way you don't have to consume extra memory buffering the whole response. We aren't seeing performance or memory issues, so I didn't update my strategy, but it was too clever of an idea to not mention here. Mitigating abuse Closely related to the previous section on declining requests, we know that these chatbot systems backed by LLMs can be a target for folks who want free access to an LLM service. Because you have to be logged in and have a technical support subscription (included with Elastic Cloud) to get access to the Support Assistant, this was a bit less of a concern for our particular launch, but we wanted to be prepared just in case, and maybe your use case doesn’t have the same gating upfront. Our two prongs of abuse mitigation are a reduced rate limit for the chat completion endpoint, and a feature flag system with the flexibility to allow us to configure flags that block access to a particular feature for a given user or organization. 
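Sketching the declined-request check described a moment ago (the decline message and helper names are placeholders): the buffering version joins the streamed tokens and compares them to the standardized message, and the suggested optimization just walks an index into that message as chunks arrive.

```typescript
import apm from 'elastic-apm-node';

// Placeholder text; the real standardized decline message lives in the prompt.
const DECLINED_REQUEST_MESSAGE =
  'Sorry, I can only help with questions about Elastic products and technical support.';

// Buffering approach: join the collected tokens once the response is complete
// and record a match in APM, so spikes in declines show up on the dashboard.
export function recordIfDeclined(tokens: string[]): void {
  if (tokens.join('').trim() === DECLINED_REQUEST_MESSAGE) {
    apm.captureError('Support Assistant declined a request');
  }
}

// Suggested optimization: track an index into the decline message and bail out
// on the first mismatching chunk, so the full response never has to be buffered.
export function makeStreamingDeclineDetector(): (chunk: string) => void {
  let index = 0;
  let stillMatching = true;
  return (chunk: string): void => {
    if (!stillMatching) return;
    if (DECLINED_REQUEST_MESSAGE.startsWith(chunk, index)) {
      index += chunk.length; // still tracking the expected message
    } else {
      stillMatching = false; // definitely not a declined request
    }
  };
}
```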
Rate limit Our application already had a general rate limit across all of our endpoints, however that rate limit is meant to be a very generous rate that should really only get triggered if something was actually going wrong and causing a significant amount of spam traffic. For a rate limit to be meaningful as applied to the Support Assistant chat completion endpoint, it was going to have to be a much lower limit. It was also important to leave the limit generous enough that we wouldn’t be penalizing enthusiastic users either. In addition to usage data from beta test we did with customers, we’ve had an internally-facing version of the Support Assistant available to our Support Engineers to help streamline their workflows in answering cases. This gave us something to anchor our usage expectation to. I looked at the previous week's data, and saw that our heaviest internal users had sent 10-20 chat messages per day on average with the top user sending over 70 in a single day. I also had latency metrics to tell me that the average completion time was 20 seconds. Without opening multiple windows or tabs, a single user asking rapid fire questions one after another, would not be able to send more than about 3 chat messages in a minute. Our app sessions expire after an hour, so I decided that it would be best to align our rate limit window with that hour long session window. That means the theoretical max chats for a single user in an hour where they use a single tab is 180 chats in an hour. The team agreed on imposing a limit of 20 chat completions in a one hour window. This is as many chats for our customer users in an hour as our heaviest internal users send in a whole day, while limiting any malicious users to ~11% of that theoretical max based on latency for a full completion. I then configured an alert looking for HTTP 429 responses on the chat completion endpoint, and there is also a table in the status dashboard listing users that triggered the limit, how many times, and when the most recent example was. I am very happy to report that we have not had anyone hit the limit in these first couple of weeks since launch. The next section discusses an option we gave ourselves for how to react if we did see certain individuals that seemed to be trying to abuse the system. Ban flags In rolling out the Support Assistant, we did a limited beta test with some hand-selected customers. To enable the Support Assistant for a subset of users during development, we set up a feature flag system to enable features. As we got closer to the launch we realized that our feature flags needed a couple of upgrades. The first was that we wanted to have the concept of features that were on by default (i.e. already fully launched), and the second was to allow flags to be configured such that they blocked access to a feature. The driving factor behind this one was that we heard some customer organizations might be interested in blocking their employees from engaging with the Support Assistant, but we also recognized that it could also come in handy if we ever reached a conclusion that some particular user was consistently not playing nice, we could cut off the feature while an appropriate Elastic representative tried to reach out and have a conversation. Context creates large payloads This last section is part special consideration for a chatbot, and part observability success story. In studying our status dashboard we started seeing HTTP 413 status codes coming back for a small, but non-negligible amount of traffic. 
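As a quick aside before we get to what those 413s turned out to be, the per-endpoint limit described above might be wired up along these lines (using express-rate-limit; the window and count mirror the numbers in this section, while the route, key function, and handler names are assumptions):

```typescript
import express from 'express';
import rateLimit from 'express-rate-limit';

declare const chatCompletionHandler: express.RequestHandler; // the streaming handler

const app = express();

// 20 chat completions per user per hour, aligned with the one-hour session window.
// Rejected requests come back as HTTP 429, which the dashboard and alert watch for.
const chatCompletionLimiter = rateLimit({
  windowMs: 60 * 60 * 1000,
  max: 20,
  standardHeaders: true,
  keyGenerator: (req) =>
    (req as { user?: { email?: string } }).user?.email ?? req.ip ?? 'anonymous',
});

app.post('/api/chat/completions', chatCompletionLimiter, chatCompletionHandler);
```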
That meant we were sending payloads from the browser that were above the configured size that our server would accept. Then one of our developers stumbled upon a reliable chat input that reproduced it so that we could confirm that the issue was the amount of context generated from our RAG search, combined with the user’s input was exceeding the default limits. We increased the size of the payloads accepted by the chat completion endpoint, and ever since we released the fix, we haven’t seen any more transactions with 413 response status. It’s worth noting that our fix to expand the accepted payload size is really more of a short-term bandage than a long-term solution. The way we plan to solve this problem in a more holistic way is to refactor the way we orchestrate our RAG searches and chat completions such that instead of sending the full content of the RAG results back to the client to include in the completion payload, instead we’d rather only return limited metadata like ID and title for the RAG results to the client, and then include that in the request with the user’s input to the completion endpoint. The completion endpoint would fetch the content of the search results by ID, and combine it with our prompt, and the user’s input to make the request to the LLM service. Here's a snippet where we configure the Express route for the chat completion endpoint. It touches on the rate limit, flags, and the boosted payload size: Conclusion Ideally, observability is more than one thing. It's multi-faceted to provide multiple angles and viewpoints for creating a more complete understanding. It can and should evolve over time to fill gaps or bring deeper understanding. What I hope you can take away from this blog is a framework for how to get started with observability for your application or feature, how the Elastic stack provides a full platform for achieving those monitoring goals, and a discussion of how this fits into the Support Assistant use case. Engage with this advice, and Bob's your mother's brother, you've got a successful launch! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. 
AS By: Antonio Schönmann Jump to Critical GenAI observability components Observability data collection Setting up Elastic APM Status dashboard APM traces Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "GenAI for customer support — Part 5: Observability - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/genai-customer-support-observability", + "meta_description": "Explore how we're using GenAI in customer support and for GenAI observability through a real-world use case. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. ES|QL Inside Elastic CL By: Costin Leau On April 15, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It's my pleasure to announce LOOKUP JOIN —a new ES|QL command available in tech preview in Elasticsearch 8.18, designed to perform left joins for data enrichment. With ES|QL, one can query and combine documents from one index with documents from a second index based on a criterion defining how the documents should be paired natively in Elasticsearch. This approach enhances data management by dynamically correlating documents at query time across multiple indices, thus removing duplication. For instance, the following query connects employee data from one index with their corresponding department information from another index using a shared field key name: As the name indicates, LOOKUP JOIN performs a complementing, or left (outer) , join at query time between any regular index (employees) - the left side and any lookup index (departments) - the right side. All the rows from the left side are returned along with their corresponding equivalent (if any) from the right side. The lookup side's index mode must be set to lookup. This means that the underlying index can only have one shard. This current solution addresses the cardinality challenges of one side of the join and the issues that distributed systems like Elasticsearch encounter, which are outlined in the next section. Apart from using the lookup index mode, there are no limitations on the source data or the commands used. Additionally, no data preparation is needed. The join can be performed before or after the filtering: Be mixed with aggregations: Or be combined with another join: Executing a Lookup Join Let's illustrate what happens during runtime by looking at a basic query that doesn't include any other commands, such as filter. 
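The example queries in this post are images in the original, so here is a hedged sketch of what such a basic lookup join looks like when driven from the TypeScript client (index names, the join key, and the kept columns are assumptions; this also assumes a client version that ships the ES|QL helper):

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

async function lookupJoinExample(): Promise<void> {
  // The right-hand side of a LOOKUP JOIN must be a lookup-mode index (one shard).
  await client.indices.create({
    index: 'departments',
    settings: { 'index.mode': 'lookup' },
  });

  // Left-join each employee row with its department row on the shared key.
  const result = await client.esql.query({
    query: `
      FROM employees
      | LOOKUP JOIN departments ON department_id
      | KEEP emp_no, first_name, department_name
    `,
  });
  console.log(result);
}
```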
This will allow us to concentrate on the execution aspect as opposed to the planning phase. The logical plan, a tree structure representing the data flow and necessary transformations, is the result of translating the query above. This logical plan centers on the semantics of the query. To ensure efficient scaling, standard Elasticsearch indices are divided into multiple shards spread across the cluster. In a join scenario, sharding both the left (L) and right (R) sides would result in L*R partitions. To minimize the need for data movement, lookup joins require the right side (which provides the enriching data) to have a single shard, similar to an enrich index, with the replication dictated by the index settings (default is 1). This decreases the amount of nodes needed to execute the join, thereby reducing the problem space. As a result, L*R becomes L*1, which equals L. Thus, the coordinator needs to dispatch the plan only to the left side data nodes, with the hash join performed locally using the lookup/right index to “build” the underlying hash map while the left side is used for “probing” for matching keys in batches. The resulting distributed physical plan, which focused on the distributed execution of the query, looks as follows: The plan consists of two main parts or sub-plans: the physical plan that gets executed on the coordinator (generally speaking, the node receiving/responsible for the query completion) and the plan fragment, which is executed on the data nodes (the nodes holding the data). Since the coordinator does not have the data, it sends a plan fragment to the relevant data nodes for local execution. The results are then sent back to the coordinator node, which computes the final results. The communication between the two entities is represented in the plan through the Exchange block. The coordinator doesn't have to do much work for this query because most of the processing happens on the data nodes. The fragment encapsulates the logical subplan, enabling optimization based on the specific characteristics of each shard's data (e.g., missing fields, local minimum and maximum values). This local replanning also helps manage differences in node code that might exist between nodes or between a node and the coordinator, for example, during cluster upgrades. The local physical plan looks something like this: The plan is designed to reduce I/O by using efficient data extraction methods. The two nodes at the bottom of the tree act as roots , supplying the nodes above. Each one outputs references to the underlying Elasticsearch documents ( doc_id ). This is done intentionally to delay the loading of columns (fields) or documents for as long as possible through the designated extraction nodes (in yellow). In this particular plan, loading takes place right before the hash join on each of the joining sides and prior to the final project just before the data exits the node using only the join resulting data. Future work Qualifiers At the moment, the lookup join syntax requires the key to have the same name in both tables (similar to JOIN USING in some SQL dialects). This can be addressed through RENAME or EVAL : It’s an unnecessary inconvenience that we’re working on removing in the near future by introducing (source) qualifiers. The previous query could be rewritten as (syntax wip): Notice that the join key was replaced by an equality comparison, where each side is using a field name qualifier, which can be implicit (departments) or explicit (e). 
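Until qualifiers arrive, the RENAME workaround mentioned above can be sketched as follows (field names are again assumptions): the left-side key is renamed to match the lookup index's key before the join runs.

```typescript
// Same client as in the previous sketch; only the ES|QL text changes.
const renameWorkaroundQuery = `
  FROM employees
  | RENAME dept_id AS department_id
  | LOOKUP JOIN departments ON department_id
`;
```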
More join types and performance We are currently working on enhancing the lookup join algorithm to better exploit the data topology with a focus on specializations that leverage the underlying search structures and statistics in Lucene for data skipping. In the long term, we plan to support additional join types, such as inner join (or intersection, which combines documents that have the same field on both sides) and full outer join (or union, which combines the documents from both sides even when there is no common key). Feedback The path to native JOIN support in Elasticsearch has been a long one, dating back to version 0.90. Early attempts included nested and ` _parent ` field types, with the latter eventually being rewritten (in 2.0), deprecated (in 5.0), and replaced by the join field (in 6.0). More recent features like Transforms (7.3) and the Enrich ingest pipeline (7.5) also aimed to address join-like use cases. In the wider Elasticsearch ecosystem, Logstash and Apache Spark (via the ES-Hadoop connector) have provided alternative solutions. Elasticsearch SQL , which debuted in 6.3.0, is also worth mentioning due to the grammar similarity: while it supports a wide range of SQL functionality, native JOIN support has remained elusive. All these solutions work and continue to be supported. However, we think ES|QL, due to its query language and execution engine, significantly simplifies the user experience! ESQL Lookup join is in tech preview, freely available in Elasticsearch 8.18 and Elastic Cloud—try it out and let us know how it works for you! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more ES|QL December 31, 2024 Improving the ES|QL editor experience in Kibana With the new ES|QL language becoming GA, a new editor experience has been developed in Kibana to help users write faster and better queries. Features like live validation, improved autocomplete and quick fixes will streamline the ES|QL experience. ML By: Marco Liberati Jump to Executing a Lookup Join Future work Qualifiers More join types and performance Feedback Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Native joins available in Elasticsearch 8.18 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-esql-lookup-join", + "meta_description": "Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. Inside Elastic SB PK By: Shay Banon and Philipp Krenn On February 12, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch just turned 15-years-old. It all started back in February 2010 with the announcement blog post (featuring the iconic “You Know, for Search” tagline), first public commit , and the first release , which happened to be 0.4.0. Let’s take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. GET _cat/stats Since its launch, Elasticsearch has been downloaded an average of 3 times per second , totaling over 1.45 billion downloads. The GitHub stats are equally impressive: More than 83,000 commits from 2,400 unique authors , 38,000 issues, 25,000 forks, and 71,500 stars. And there is no sign of slowing down . All of this is on top of countless Apache Lucene contributions . We’ll get into those for the 25 year anniversary of Lucene, which is also this year. In the meantime, you can check out the 20 year anniversary page to celebrate one of the top Apache projects. A search (hi)story There are too many highlights to list them all, but here are 15 releases and features from the last 15 years that got Elasticsearch to where it is today: Elasticsearch, the company (2012): The open source project became an open source company, setting the stage for its growth. ELK Stack (2013): Elasticsearch joined forces with Logstash and Kibana to form the ELK Stack, which is now synonymous with logging and analytics. Version 1 (2014): The first stable release introduced key features like snapshot/restore, aggregations, circuit breakers, and the _cat API. Shield and Found (2015): Shield brought security to Elasticsearch clusters in the form of a (paid) plugin. And the acquisition of found.no brought Elasticsearch to the cloud, evolving into what is now Elastic Cloud. As an anecdote, nobody could find Found — SEO can be hard for some keywords. Version 2 (2015): Introduced pipelined aggregations, security hardening with the Java Security Manager, and performance and resilience improvements. Version 5 and the Elastic Stack (2016): Skipping two major versions to unify the version numbers of the ELK Stack and turning it into the Elastic Stack after adding Beats. This version also introduced ingest nodes and the scripting language Painless. 
Version 6 (2017): Brought zero-downtime upgrades, index sorting, and the removal of types to simplify data modeling. Version 7 (2019): Changed the cluster coordination to the more scalable and resilient Zen2, single-shard default settings, built-in JDK, and adaptive replica selection. Free security (2019): With the 6.8 and 7.1 releases, core security became free to help everyone secure their cluster. ILM, data tiers, and searchable snapshots (2020): Made time-series data more manageable and cost-effective with Index Lifecycle Management (ILM), tiered storage, and searchable snapshots. Version 8 (2022): Introduced native dense vector search with HNSW and enabled security by default. ELSER (2023): Launched Elastic Learned Sparse EncodeR model, bringing sparse vector search for better semantic relevance. Open source again (2024): Added AGPL as a licensing option to bring back open source Elasticsearch. Start Local (2024): Made it easier than ever to run Elasticsearch and Kibana: curl -fsSL https://elastic.co/start-local | sh LogsDB (2024): A new specialized index mode that reduces log storage by up to 65%. The future of search is bright Thanks to the rise of AI capabilities, search is more relevant and interesting than ever. So what is next for Elasticsearch? There’s way too much to name, so we’ll stick to three areas and the challenges they address. Serverless No shards, nodes, or versions. Elasticsearch Serverless — which is GA on AWS and just entered technical preview on Azure — takes care of the operational issues you might have experienced in the past: 15 years in, and someone is still setting number_of_shards: 100 for no reason. 15 years, and we’re still debating refresh_interval: 1s vs 30s like it’s a life-or-death decision. 15 years of major versions, minor heart attacks, and the thrill of migrating to the latest version. You can try out Elasticsearch Serverless today. ES|QL “Cheers to 15 years of Elasticsearch — where the Query DSL is still the most complex part of your day.” But it doesn’t have to be. The new Elasticsearch Piped Query Language (ES|QL) brings a much simpler syntax and a significant investment into a new compute engine with performance in mind. While we’re building out more features, you can already use ES|QL today. Don’t worry; the Query DSL will understand. AI everywhere 15 years of query tuning, and we’re still just throwing boost: 10 at the problem. 15 years of making your logs searchable while you still have no idea what’s happening in production. Still the best at finding that one log line… if you remember how you indexed it. AI is redefining what’s possible — from turning raw logs into actionable insights with the AI Assistant for observability and security , to more relevant search with semantic understanding and intelligent re-ranking .. This is only the beginning. More AI-powered features are on the horizon — bringing smarter search, enhanced observability, and stronger security. The future of Elasticsearch isn’t just about finding data; it’s about understanding it. Stay tuned — the best is yet to come. Thanks to all of you Thanks to all contributors, users, and customers over the last 15 years to make Elasticsearch what it is today. We couldn’t have done it without you and are grateful for every query you send to Elasticsearch. Here’s to the next 15 years. Enjoy! 
Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Jump to GET _cat/stats A search (hi)story The future of search is bright Serverless ES|QL Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch: 15 years of indexing it all, finding what matters - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-history-15-years", + "meta_description": "Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. Inside Elastic US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more On January 13, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The frozen data tier can achieve both low cost and good performance by leveraging Elastic's Searchable Snapshots - which offer a compelling solution for managing vast amounts of data while maintaining the performant searchability of data within a budget. 
In this article, we delve into a benchmark of Elastic's hot and frozen data tiers by running sample queries on 105 terabytes of logs spanning more than 90 days. These queries replicate common tasks within Kibana's Discover, including search with highlighting, total hits, date histogram aggregation, and terms aggregation; that all happen behind the scenes when a user triggers a simple search. The results reveal that Elastic's frozen data tier is quick and delivers latency comparable to its hot tier, with only the first query to the object store being slower - subsequent queries are fast. We replicated the way a typical user would interact with a hot-frozen deployment through Kibana's Discover - its main interface for interacting with indexed documents. When a user issues a search using Discover's search bar three tasks are executed in parallel: a search and highlight operation on 500 docs that doesn't track the total amount of hits (referred as discover_search tasks on the results) a search that tracks the total hits ( discover_search_total in the results) a date histogram aggregation to construct the bar chart (referred as discover_date_histogram ) and also a terms aggregation (referred as discover_terms_agg ) when/if the user clicks the left side bar. Data tiers in Elastic Some types of data decrease in value over time. It's natural to think about application logs where the most recent records are usually the ones that need to be queried more frequently and also need the fastest possible response time. But there are several other examples of such data like medical records (detailed patient histories, diagnoses and physician notes); legal documents (contracts, court rulings, case files, etc.) and bank records (transaction records including descriptions of purchases and merchant names)-just to cite three. All contain unstructured or semi-structured text that requires efficient search capabilities to extract relevant information. As these records age, their immediate relevance may diminish, but they still hold significant value for historical analysis, compliance, and reference purposes. Elastic's data tiers — Hot, Warm, Cold, and Frozen– provide the ideal balance of speed and cost, ensuring you maximize the value of these types of data as they age without sacrificing usability. Through both Kibana and Elasticsearch's search API the use of the underlying data tiers is always automatic and transparent–users don't need to issue search queries in a different way to retrieve data from any specific tier (no need to manually restore the data, or \"rehydrate\"). In this blog we keep it simple by using solely the Hot and Frozen tiers, in what is commonly called a hot-frozen scenario. How the frozen tier works In a hot-frozen scenario, data begins its journey in the hot tier, where it is actively ingested and queried. The hot tier is optimized for high-speed read and write operations, making it ideal for handling the most recent and frequently accessed data. As data ages and becomes less frequently accessed, it is transitioned to the frozen tier to optimize storage costs and resource utilization. The transition from the hot tier to the frozen tier involves converting the data into searchable snapshots. Searchable snapshots leverage the snapshot mechanism used for backups, allowing the data to be stored in a cost-effective manner while still being searchable. This eliminates the need for replica shards, significantly reducing the local storage requirements. 
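As a hedged sketch of how that hot-to-frozen transition is usually driven, an ILM policy can convert indices into searchable snapshots in the frozen phase (the repository name and timings below are placeholders, not the benchmark's configuration):

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'http://localhost:9200' });

// Roll data over while hot, then move it to the frozen tier as a searchable
// snapshot after 30 days. Values are illustrative only.
await client.ilm.putLifecycle({
  name: 'logs-hot-frozen',
  policy: {
    phases: {
      hot: {
        actions: {
          rollover: { max_primary_shard_size: '50gb', max_age: '1d' },
        },
      },
      frozen: {
        min_age: '30d',
        actions: {
          searchable_snapshot: { snapshot_repository: 'my-snapshot-repo' },
        },
      },
    },
  },
});
```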
Once the data is in the frozen tier, it is managed by nodes specifically designated for this purpose. These nodes do not need to have enough disk space to store full copies of all indices. Instead, they utilize an on-disk Least Frequently Used (LFU) cache. This cache stores only portions of the index data that are downloaded from the blob store as needed to serve queries. The on-disk cache functions similarly to an operating system's page cache, enhancing access speed to frequently requested parts of the data. When a query is executed in the frozen tier, the process involves several steps to ensure efficient data retrieval and caching: 1. Read requests mapping : At the Lucene level, read requests are mapped to the local cache. This mapping determines whether the requested data is already present in the cache. 2. Cache miss handling : If the required data is not available in the local cache (a cache miss), Elasticsearch handles this by downloading a larger region of the Lucene file from the blob store. Typically, this region is a 16MB chunk, which is a balance between minimizing the number of fetches and optimizing the amount of data transferred. 3. Adding data to cache : The downloaded chunk is then added to the local cache. This process ensures that subsequent read requests for the same region can be served directly from the local cache, significantly improving query performance by reducing the need to repeatedly fetch data from the blob store. 4. Cache configuration options : Shared cache size : This setting accepts either a percentage of the total disk space or an absolute byte value. For dedicated frozen tier nodes, the default is 90% of the total disk space. Max headroom : Defines the maximum headroom to maintain. If not explicitly set, it defaults to 100GB for dedicated frozen tier nodes. 5. Eviction policy : The node-level shared cache uses an LFU policy to manage its contents. This policy ensures that frequently accessed data remains in the cache, while less frequently accessed data is evicted to make room for new data. This dynamic management of the cache helps maintain efficient use of disk space and quick access to the most relevant data. 6. Lucene index management : To further optimize resource usage, the Lucene index is opened only on-demand—when there is an active search. This approach allows a large number of indices to be managed on a single frozen tier node without consuming excessive memory. Methodology We ran the tests on a six node cluster in Elastic Cloud hosted on Google Cloud Platform on N2 family nodes: 3 x gcp.es.datahot.n2.68x10x45 - Storage-optimized Elasticsearch instances for hot data. 3 x gcp.es.datafrozen.n2.68x10x90 - Storage-optimized (dense) Elasticsearch instances serving as a cache tier for frozen data. We measured the following spans, which also equate to Terabytes in size, since we indexed one Terabyte per day. We used Rally to run the tests; below is a sample test relative to an uncached search on one day of frozen data ( discover_search_total-1d-frozen-nocache ). Iterations refer to the number of times the entire set of operations is repeated, which in this case is 10. Each operation defines a specific task or set of tasks to be performed, and in this example, it is a composite operation. Within this operation, there are multiple requests that specify the actions to be taken, such as clearing the frozen cache by issuing a POST request.
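For reference, that cache-clearing step maps to the searchable snapshots clear cache API; in a test harness it might be invoked like this (a sketch, assuming the generated clearCache helper in the TypeScript client; the benchmark itself drives it through Rally):

```typescript
import type { Client } from '@elastic/elasticsearch';

declare const client: Client; // same client as in the earlier sketches

// Drop the frozen tier's shared cache so the next query exercises the uncached
// path, mirroring the benchmark's *-nocache operations.
await client.searchableSnapshots.clearCache();
```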
The stream within a request indicates a sequence of related actions, such as submitting a search query and then retrieving and deleting the results. Each test would run for 10 times per benchmark run, and we performed 500 benchmark runs across several days, therefore the sample for each task is 5,000. Having a high amount of measurements is essential when we want to ensure statistical significance and reliability of the results. This large sample size helps to smooth out anomalies and provides a more accurate representation of performance, allowing us to draw meaningful conclusions from the data. Results The detailed results are outlined below. The \"tip of the candle\" represents the max (or p100 ) value observed within all the requests for a specific operation, and they are grouped by tier. The green value represents the p99.9 , or the value below what 99.9% of the requests would fall. Due to how Kibana interacts with Elasticsearch–which is via async searches–a more logical way of representing the time is by using horizontal bar charts as below. Since the requests are asynchronous and parallel, they will complete at different times. You don't have to wait for all of them to start seeing query results, and this is how we read the benchmark results. The results are expressed as, for example, 543ms - 2s where 543ms is when we received the first result and 2s when we received the last. 1 Day Span / 1 Terabyte What we observed 99.9% of the times (p99.9): Hot: 543ms - 2s Frozen Not Cached: 1.8s - 14s Frozen Cached: 558ms - 11s What we observed as a maximum latency (likely the very first query): Hot: 630ms - 2s Frozen Not Cached: 1.9s - 28s Frozen Cached: 750ms - 19s 7 Days Span / 7 Terabytes What we observed 99.9% of the times (p99.9): Hot: 555ms - 792s Frozen Not Cached: 2.5s - 14s Frozen Cached: 1s - 12s What we observed as a maximum latency (likely the very first query): Hot: 842ms - 4s Frozen Not Cached: 2.5s - 5.6m (336s) Frozen Cached: 1.1s - 26s 14 Days Span / 14 Terabytes What we observed 99.9% of the times (p99.9): Hot: 551ms - 608ms Frozen Not Cached: 1.8s - 15s Frozen Cached: 551ms - 592ms What we observed as a maximum latency (likely the very first query): Hot: 785ms - 9s Frozen Not Cached: 2.3s - 32s Frozen Cached: 624ms - 7s 30 Days Span / 30 Terabytes We did not use hot data past 14 days on this test, but we can still use the results for frozen as a reference. What we observed 99.9% of the times (p99.9): Frozen Not Cached: 2.3s - 12s Frozen Cached: 1s - 11s What we observed as a maximum latency (likely the very first query): Frozen Not Cached: 2.4s - 68s Frozen Cached: 1.1s - 27s 60 Days Span / 60 Terabytes What we observed 99.9% of the times (p99.9): Frozen Not Cached: 2.3s - 13s Frozen Cached: 1s - 11s What we observed as a maximum latency (likely the very first query): Frozen Not Cached: 2.4s - 18s Frozen Cached: 1.1s - 240s 90 Days Span / 90 Terabytes What we observed 99.9% of the times (p99.9): Frozen Not Cached: 2.4s - 13s Frozen Cached: 1s - 11s What we observed as a maximum latency (likely the very first query): Frozen Not Cached: 3.3s - 5m (304s) Frozen Cached: 1.1s - 1.6m (98s) Cost implications (16x reduction) Let's make a simple pricing exercise using Elastic Cloud. If we were to put the entirety of a 90 days / 90 TB dataset in an all-hot deployment on the most performant hardware profile for large datasets ( Storage Optimized ), that would cost $53.382 / month since we would need about 45 hot nodes to cover about 120TB. 
Since Elastic Cloud has different hardware profiles, we could also select Storage optimized (dense), which brings the cost to $28.222. However, by benefiting from the Frozen tier, we could make a deployment that holds 1 day in Hot and the rest on Frozen. The cost of such deployment can be as low as $3.290, a staggering 16x reduction on costs . Use Elastic's frozen data tier to cool down the cost of data storage Elastic's frozen data tier redefines what's possible in data storage and retrieval. Benchmark results show that it delivers performance comparable to the hot tier, efficiently handling typical user tasks. While rare instances of slightly higher latency (0.1% of the time) may occur, Elastic's searchable snapshots ensure a robust and cost-effective solution for managing large datasets. Whether you're searching through years of security data for advanced persistent threats or analyzing historical seasonal trends from logs and metrics, searchable snapshots and the frozen tier deliver unmatched value and performance. By adopting the frozen tier, organizations can optimize storage strategies, maintain responsiveness, keep data searchable, and stay within budget. To learn more, see how to set up hot and frozen data tiers for your Elastic Cloud deployment. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Jump to Data tiers in Elastic How the frozen tier works Methodology Results 1 Day Span / 1 Terabyte Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Ice, ice, maybe: Measuring searchable snapshots performance - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/searchable-snapshots-benchmark", + "meta_description": "Learn how Elastic’s searchable snapshots make the frozen tier perform like the hot tier, offering latency consistency and reducing costs." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. Vector Database Lucene ML Research TT By: Tommaso Teofili On January 7, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The challenge of finding nearest neighbors efficiently in high-dimensional spaces, particularly as datasets grow in size, is one of the most important ones in the context of vector search. As discussed in our previous blog post , brute force nearest neighbor search might be the best choice, when dataset size is limited. On the other hand, as vector dataset size increases, switching to approximate nearest neighbor search can be useful to retain query speed without sacrificing accuracy. Elasticsearch implements approximate nearest neighbor search via the Hierarchical Navigable Small World algorithm . HNSW offers an efficient way to navigate the vector space, reducing the computational cost while still maintaining high search accuracy. In particular its hierarchical layered structure makes it possible to visit candidate neighbors and decide whether to include them in the final result set with fewer vector distance computations. However, despite its strengths, the HNSW algorithm can be further optimized for large-scale vector searches. One effective way to enhance HNSW's performance is by finding ways to stop visiting the graph under specific circumstances, called early termination . This blog post explores early termination concepts for HNSW and how they can optimize query execution. HNSW redundancy HNSW is an approximate nearest neighbor algorithm that builds a layered graph where nodes represent vectors and edges represent the proximity between vectors in the vector space. Each layer contains incrementally a larger number of graph nodes. When querying, the search traverses this graph, starting at a random entry point and navigating towards the closest neighbors through the layers. The search process is iterative and expands as it examines more nodes and vectors. This balance between speed and accuracy is central to HNSW, but it can still result in redundant computations, especially when large datasets are involved. In HNSW, redundant computations primarily occur when the algorithm continues to evaluate new nodes or candidates that provide little to no improvement in finding actual neighbors to a query. This happens because, in standard HNSW traversal, the algorithm proceeds layer-by-layer, exploring and expanding candidate nodes until all possible paths are exhausted. In particular, this kind of redundancy can arise when the dataset includes highly similar or duplicate vectors, clusters with dense intra-cluster connections, or vectors in very high-dimensional spaces with little intrinsic structure. 
Such redundancy leads to visiting unnecessary edges, increasing memory usage and potentially slowing search performance without improving accuracy. In high-dimensional spaces where similarity scores decay quickly, some edges often fail to contribute meaningful shortcuts in the graph, resulting in inefficient navigation paths too. So, in case a number of unnecessary computations can be performed while traversing the graph, one could try to improve the HNSW algorithm to mitigate this issue. Early Termination FTW Navigating the solution space is a fundamental concept in optimization and search algorithms, where the goal is to find an optimal solution among a set of possible solutions. The solution space represents all potential solutions to a given problem, and navigating it involves systematically exploring these possibilities. This process can be visualized as moving through a graph where each node represents a different solution, and the objective is to identify the node that best meets the criteria of the problem. Understanding how to effectively navigate the solution space is crucial for solving complex problems that have huge numbers of solutions. Early termination is a generic optimization strategy that can be applied to any such algorithm to make smart decisions about when stopping searching for solutions under certain circumstances. If any solution is considered 'good enough' to meet a desired criteria, the search can stop and the solution can be considered either a good candidate or an optimal solution. This means some potentially better solutions might remain unexplored, so it's tricky to find a perfect compromise between efficiency and quality of the final solution(s). Early Termination in HNSW In the context of HNSW, early termination can be used to stop the search process before all potential candidates nodes (vectors) have been fully evaluated. Evaluating a candidate node means calculating the actual similarity between the query vector and the vector corresponding to the node in the graph that is being processed; for this reason, when skipping a bunch of such operations while traversing each layer, the computational cost of the query can be greatly reduced. On the other hand, skipping a candidate that would otherwise result in a true nearest neighbor will surely affect the quality of the search results, potentially missing a few candidate vectors that are close to the query vector. Consequently the trade-off between the efficiency gains and loss in accuracy is to be handled with care. Early termination is useful in case of: Sublinear Efficiency: You want to optimize performance in exchange for slightly lower recall. High-Throughput Systems: Faster response times are more valuable than the highest accuracy. Resource Constraints: Compute or memory limitations make full traversal of the HNSW graph undesirable. In the context of HNSW there are a number of options for implementing an early termination strategy. Fixed candidate pool size: One of the simplest early termination strategies is to limit the size of the candidate pool (e.g., the number of nodes evaluated during the search). In HNSW, the search process is iterative, expanding to more nodes as it progresses. By setting a limit on the number of candidates considered, we can terminate the search early and return results based on only a subset of the total graph. Of course this can be implemented either in a layer-wise fashion or accounting for all the nodes across the whole graph. 
Distance threshold-based termination: Another potentially effective early termination strategy is to make smart decisions based on distance computations between the query vector and the vectors corresponding to HNSW nodes. One could set a threshold based on the distance between the query vector and the current closest vector. If a vector is found whose distance is below a specified threshold, the search can be stopped early, assuming that further exploration is unlikely to yield significantly better results. This goes hand in hand with constraints on the fact that nodes get visited in a smart order, to avoid missing possibly relevant neighbors. Dynamic early termination based on quality estimation: A more sophisticated approach is dynamically adjusting the termination criteria based on the \"quality\" of the results found during each search query. If the search process is converging quickly on high-quality neighbors (e.g., neighbors with very close distances), the algorithm can terminate early, even without hitting a predefined threshold. The first two strategies fall in the category of \"fixed configuration\" early termination strategies, so that the search terminates based on fixed constraints that do not take into account query-specific challenges. In fact not all queries are equally hard, some queries require more candidate visit than others, for example, when the distribution of the vectors is skewed. Consequently some query vectors might fall into denser regions of the vector space, so that they have more candidate nearest neighbors, while some others might fall into \"less populated regions\", making it harder to find their true nearest neighbors. Because of such situations, early termination strategies that can adapt to the density of the vector space (and consequently to the connectivity of the HNSW graph) seem more attractive for real-life scenarios. Therefore determining the optimal point at which to stop searching for each query is more likely to lead to substantial latency reductions without compromising accuracy. Such kinds of early termination strategies are dubbed adaptive as they adapt to each query instance to decide when to terminate the search process. For example, an adaptive early termination strategy can utilize machine learning models to predict how much search effort is sufficient for a given query to achieve the desired accuracy. One such a model dynamically adjusts how much of the graph to explore based on the individual query's characteristics and intermediate search results. Speaking about intermediate search results, they are often powerful predictors of how much further to search. If the initial results are already close to the query, the nearest neighbors are likely to be found soon, allowing for early termination. Conversely, poor initial results indicate a need for more extensive exploration, (see this paper ). Lucene makes it possible to implement early termination in HNSW by means of the KnnCollector interface that exposes an earlyTerminated() method , but it also offers a couple of fixed configuration early termination strategies for HNSW: TimeLimitingKnnCollector makes it possible to stop the HNSW graph traversing when a certain time threshold is met. AbstractKnnCollector is a base KnnCollector implementation that stops the graph traversal once a fixed number of graph nodes are visited. 
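For illustration only, here is a minimal Python sketch of the fixed-budget idea that collectors of this kind implement: a best-first traversal over a toy neighbour graph that stops as soon as a set number of nodes has been evaluated. The graph, the vectors and the distance function are made up for the example; Lucene's real implementation is Java code inside the HNSW searcher and differs in many details.

import heapq

def greedy_knn_search(graph, vectors, query, entry_point, k, visit_budget, dist):
    # Best-first traversal that stops once `visit_budget` nodes have been evaluated.
    visited = {entry_point}
    candidates = [(dist(query, vectors[entry_point]), entry_point)]  # min-heap by distance
    top_k = []      # max-heap via negated distances, holds the current best k nodes
    evaluated = 0
    while candidates and evaluated < visit_budget:   # early termination: budget exhausted
        d, node = heapq.heappop(candidates)
        evaluated += 1
        if len(top_k) < k:
            heapq.heappush(top_k, (-d, node))
        elif d < -top_k[0][0]:                       # only keep competitive nodes
            heapq.heapreplace(top_k, (-d, node))
        for neighbour in graph[node]:
            if neighbour not in visited:
                visited.add(neighbour)
                heapq.heappush(candidates, (dist(query, vectors[neighbour]), neighbour))
    return sorted((-neg_d, node) for neg_d, node in top_k)

# Toy data: four 2-dimensional vectors and their neighbour lists.
vectors = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (5.0, 5.0)}
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
euclidean = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

print(greedy_knn_search(graph, vectors, query=(0.2, 0.1), entry_point=0,
                        k=2, visit_budget=3, dist=euclidean))

Raising or lowering visit_budget trades recall against the number of distance computations, which is exactly the knob that fixed-configuration strategies expose.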
As an additional example, to implement a distance threshold-based termination, we could rely on the minimum competitive similarity recorded by Lucene during HNSW traversal (used to make sure only competitive nodes are explored) and early exit when it falls below a given threshold. Conclusion Early termination strategies for approximate KNN can lead to notable speedups while retaining good accuracy, if correctly implemented. Fixed strategies are easier to implement but they might require more tuning and also not work well across different queries. Dynamic / adaptive strategies, on the other hand, are harder to implement but have the advantage of being able to better adapt to different search queries. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to HNSW redundancy Early Termination FTW Early Termination in HNSW Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Early termination in HNSW for faster approximate KNN search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/hnsw-knn-search-early-termination", + "meta_description": "Learn how HNSW can be made faster for KNN search, using smart early termination strategies." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. 
Integrations Vector Database CH HM By: Chris Hegarty and Hemant Malik On March 19, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. We in the Elastic Engineering org have been busy optimizing vector database performance for a while now. Our mission: making Lucene and Elasticsearch the best vector database. Through hardware accelerated CPU SIMD instructions , introducing new vector data compression innovations ( Better Binary Quantization a.k.a BBQ ), and then exceeding expectations by updating the algorithmic approach to BBQ for even more benefits, and also making Filtered HNSW faster . You get the gist — we’re building a faster, better, efficient(er?) vector database for the developers as they solve those RAG-gedy problems! As part of our mission to leave no efficiencies behind, we are exploring acceleration opportunities with these curious computer chips, which you may have heard of — NVIDIA GPUs! (Seriously, have you not?). When obsessing over performance, we have several problem spaces to explore — how to index exponentially more data, how to retrieve insights from it, and how to do it when your ML models are involved. You should be able to eke out every last benefit available when you have GPUs. In this post, we dive into our collaboration with the NVIDIA vector search team as we explore GPU-accelerated vector search in Elasticsearch. This work paves the way for use cases where developers could use a mix of GPUs and CPUs for real-world Elasticsearch-powered apps. Exciting times! Elasticsearch: Well, hello, GPUs! We are excited to share that the Elasticsearch engineering team is helping build the open-source cuVS Java API experience for developers, which exposes bindings for vector search algorithms. This work leverages our previous experience with Panama FFI. Elasticsearch and Apache Lucene use the NVIDIA cuVS API to build the graph during indexing. Okay, we are jumping ahead; let’s rewind a bit. NVIDIA cuVS , an open-source C++ library, is at the heart of this collaboration. It aims to bring GPU acceleration to vector search by providing higher throughput, lower latency, and faster index build times. But Elasticsearch and Apache Lucene are written in Java; how will this work? Enter lucene-cuvs and the Elastic-NVIDIA-SearchScale collaboration to bring it into the Lucene ecosystem to explore GPU-accelerated vector search in Elasticsearch. In the recent NVIDIA cuVS 25.02 release, we added a Java API for cuVS. The new API is experimental and will continue to evolve, but it’s currently available for use. The question may arise: aren’t Java to native function calls slow? Not anymore! We’re using the new Panama FFI (Foreign Function Interface) for the bindings, which has minimal overhead for Java to native downcalls. We’ve been using Panama FFI in Elasticsearch and Lucene for a while now. It’s awesome! But... there is always a “but”, isn’t there? FFI has availability challenges across Java versions. We overcame this by compiling the cuVS API to Java 21 and encapsulating the implementation within a multi-release jar targeting Java 22. This allows the use of cuVS Java directly in Lucene and Elasticsearch. Ok, now that we have the cuVS Java API, what else would we need? 
A tale of two algorithms Elasticsearch supports the HNSW algorithm for scalable approximate KNN search. However, to get the most out of the GPU, we use a different algorithm, CAGRA [ C UDA A NN GRA ph ] , which has been specifically designed for the high levels of parallelism offered by the GPU. Before we get into how we look to add support for CAGRA, let’s look at how Elasticsearch and Lucene access index data through a “codec format”. This consists of the on-disk representation, the interfaces for reading and writing data, and the machinery for dealing with Lucene’s segment-based architecture. We are implementing a new KNN (k-nearest neighbors) vector format that internally uses the cuVS Java API to index and search on the GPU. From here, we “plumb” this codec type through Elasticsearch’s mappings to a field type in the index. As a result, your existing KNN queries continue to work regardless of whether the backing index is using a CAGRA or HNSW graph. Of course, this glosses over many details, which we plan to cover in a future blog. The following is the high-level architecture for a GPU-accelerated Elasticsearch. This new codec format defaults to CAGRA. However, it also supports converting a CAGRA graph to an HNSW graph for search on the CPU. Indexing and Searching: Making some “Core” decisions With the stateless architecture for Elasticsearch Serverless, which separates indexing and search, there is now a clear delineation of responsibilities. We pick the best hardware profile to fulfill each of these independent responsibilities. We anticipate users to consider two main deployment strategies: Index and Search on the GPU: During indexing, build a CAGRA graph and use it during search - ideal when extremely low latency search is required. Index on GPU and Search on CPU: During indexing, build a CAGRA graph and convert it to an HNSW graph. The HNSW graph is stored in the index, which can later be used on the CPU for searching. This flexibility provides different deployment models, offering tradeoffs between cost and performance. For example, an indexing service could use GPU to efficiently build and merge graphs in a timely manner while using a lower-powered CPU for searching. So here is the Plan, Stan We are looking forward to bringing performance gains and flexibility with deployment strategies to users, offering various knobs to balance cost and performance. Here is the NVIDIA GTC 2025 session where this work was presented in detail. We’d like to thank the engineering teams at NVIDIA and SearchScale for their fantastic collaboration. In an upcoming blog, we will explore the implementation details and performance analysis in greater depth. Hold on to your curiosity hats 🎩! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Jump to Elasticsearch: Well, hello, GPUs! A tale of two algorithms Indexing and Searching: Making some “Core” decisions So here is the Plan, Stan Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/gpu-accelerated-vector-search-elasticsearch-nvidia", + "meta_description": "Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. Inside Elastic AS By: Antonio Schönmann On August 22, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out part one , part two , part three , the launch blog , and part five . Welcome to part 4 of our blog series on integrating generative AI in Elastic's customer support. This installment dives deep into the role of Retrieval-Augmented Generation (RAG) in enhancing our AI-driven Technical Support Assistant. Here, we address the challenges, solutions, and outcomes of refining search effectiveness, providing action items to further improve its capabilities using the toolset provided in the Elastic Stack version 8.11 . Implied by those actions, we have achieved a ~75% increase in top-3 results relevance and gained over 300,000 AI-generated summaries that we can leverage for all kinds of future applications . 
If you're new to this series, be sure to review the earlier posts that introduce the core technology and architectural setup. If you missed the last blog of the series, you can find it here . RAG tuning: A search problem Perfecting RAG (Retrieval-Augmented Generation) is fundamentally about hitting the bullseye in search accuracy 🎯: Like an archer carefully aiming to hit the center of the target, we want to focus on accuracy for each hit. Not only that, we also want to ensure that we have the best targets to hit – or high-quality data . Without both together , there's the potential risk that large language models (LLMs) might hallucinate and generate misleading responses. Such mistakes can definitely shake users' trust in our system, leading to a deflecting usage and poor return on investment. To avoid those negative implications, we've encountered several challenges that have helped us refine our search accuracy and data quality over the course of our journey. These challenges have been instrumental in shaping our approach to tuning RAG for relevance, and we're excited to share our insights with you. That said: let's dive into the details! Our first approach We started with a lean, effective solution that could quickly get us a valuable RAG-powered chatbot in production. This meant focusing on key functional aspects that would bring it to operational readiness with optimal search capabilities. To get us into context, we'll make a quick walkthrough around four key vital components of the Support AI Assistant: data , querying , generation , and feedback . Data As showcased in the 2nd blog article of this series , our journey began with an extensive database that included over 300,000 documents consisting of Technical Support Knowledge Articles and various pages crawled from our website, such as Elastic's Product Documentation and Blogs . This rich dataset served as the foundation for our search queries, ensuring a broad spectrum of information about Elastic products was available for precise retrieval. To this end, we leveraged Elasticsearch to store and search our data. Query Having great data to search by, it's time to talk about our querying component. We adopted a standard Hybrid-Search strategy, which combines the traditional strengths of BM25 , Keyword-based Search , with the capabilities of Semantic Search , powered by ELSER . For the semantic search component, we used text_expansion queries against both title and summary embeddings. On the other hand, for broad keyword relevance we search multiple fields using cross_fields , with a minimum_should_match parameter tuned to better perform with longer queries. Phrase matches, which often signal greater relevance, receive a higher boost. Here’s our initial setup: Generation After search, we build up the system prompt with different sets of instructions, also contemplating the top 3 search results as context to be used. Finally, we feed the conversation alongside the built context into the LLM, generating a response. Here's the pseudocode showing the described behavior: The reason for not including more than 3 search results was the limited quantity of tokens available to work within our dedicated Azure OpenAI's GPT4 deployment (PTU), allied with a relatively large user base. Feedback We used a third-party tool to capture client-side events, connecting to Big Query for storage and making the JSON-encoded events accessible for comprehensive analysis by everyone on the team. 
Here's a glance into the Big Query syntax that builds up our feedback view. The JSON_VALUE function is a means to extract fields from the event payload: We also took advantage of valuable direct feedback from internal users regarding the chatbot experience, enabling us to quickly identify areas where our search results did not match the user intent. Incorporating both would be instrumental in the discovery process that enabled us to refine our RAG implementation, as we're going to observe throughout the next section. Challenges With usage, interesting patterns started to emerge from feedback. Some user queries, like those involving specific CVEs or Product Versions for instance, were yielding suboptimal results, indicating a disconnect between the user's intent and the GenAI responses. Let's take a closer look at the specific challenges identified, and how we solved them. #1: CVEs (Common Vulnerabilities and Exposures) Our customers frequently encounter alerts regarding lists of open CVEs that could impact their systems, often resulting in support cases. To address questions about those effectively, our dedicated internal teams meticulously maintain CVE-type Knowledge Articles. These articles provide standardized, official descriptions from Elastic, including detailed statements on the implications, and list the artifacts affected by each CVE. Recognizing the potential of our chatbot to streamline access to this crucial information, our internal InfoSec and Support Engineering teams began exploring its capabilities with questions like this: For such questions, one of the key advantages of using RAG – and also the main functional goal of adopting this design – is that we can pull up-to-date information, including it as context to the LLM and thus making it available instantly to produce awesome responses. That naturally will save us time and resources over fine-tuned LLM alternatives. However, the produced responses wouldn't perform as expected. Essential to answer those questions, the search results often lacked relevance, a fact which we can confirm by looking closely at the search results for the example: With just one relevant hit ( CVE-2019-10172 ), we left the LLM without the necessary context to generate proper answers: The observed behavior prompted us with an interesting question: How could we use the fact that users often include close-to-exact CVE codes in their queries to enhance the accuracy of our search results? To solve this, we approached the issue as a search challenge. We hypothesized that by emphasizing the title field matching for such articles, which directly contain the CVE codes, we could significantly improve the precision of our search results. This led to a strategic decision to conditionally boost the weighting of title matches in our search algorithm. By implementing this focused adjustment, we refined our query strategy as follows: As a result, we experienced much better hits for CVE-related use cases, ensuring that CVE-2016-1837 , CVE-2019-11756 and CVE-2014-6439 are top 3: And thus generating a much better response by the LLM: Lovely! By tuning our Hybrid Search approach, we significantly improved our performance with a pretty simple, but mostly effective Bob's Your Uncle solution (like some folks would say)! This improvement underscores that while semantic search is a powerful tool, understanding and leveraging user intent is crucial for optimizing search results and overall chat experience in your business reality. 
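The refined query itself did not survive the extraction of this page, so as a rough, non-authoritative sketch of the idea, the conditional boost can be expressed as follows in Python. Field names, boost values and the base hybrid clauses are placeholders, not the production query.

import re

CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)

def build_query(user_question: str) -> dict:
    # Base hybrid clauses: ELSER text_expansion plus BM25 keyword matching.
    should = [
        {"text_expansion": {"ml.inference.title_expanded.predicted_value": {
            "model_id": ".elser_model_2", "model_text": user_question}}},
        {"multi_match": {"query": user_question, "type": "cross_fields",
                         "fields": ["title", "summary", "body"],
                         "minimum_should_match": "2<75%"}},
        {"multi_match": {"query": user_question, "type": "phrase",
                         "fields": ["title", "summary", "body"], "boost": 2}},
    ]
    cve_ids = CVE_PATTERN.findall(user_question)
    if cve_ids:
        # The question names explicit CVE codes, so strongly favour articles
        # whose title contains those codes.
        should.append({"match": {"title": {"query": " ".join(cve_ids), "boost": 10}}})
    return {"bool": {"should": should}}

print(build_query("What are the implications of CVE-2016-1837 and CVE-2019-11756?"))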
With that in mind, let's dive into the next challenge! #2: Product versions As we delved deeper into the challenges, another significant issue emerged with queries related to specific versions. Users frequently inquire about features, migration guides, or version comparisons, but our initial search responses were not meeting expectations. For instance, let's take the following question: Our initial query approach would return the following top 3: Elasticsearch for Apache Hadoop version 8.14.1 | Elasticsearch for Apache Hadoop [8.14] | Elastic ; APM version 8.14 | Elastic Observability [8.14] | Elastic ; Elasticsearch for Apache Hadoop version 8.14.3 | Elasticsearch for Apache Hadoop [8.14] | Elastic . Corresponding to the following _search response: Being irrevocably irrelevant, they ended up resulting in a completely uninformed answer from the chatbot, affecting the overall user experience and trust in the Support AI Assistant: Further investigating the issue we collected valuable insights. By replaying the query and looking into the search results, we noticed three serious problems with our crawled Product Documentation data that were contributing to the overall bad performance: Inaccurate semantic matching : Semantically, we definitely missed the shot. Why would we match against such specific articles, including two specifically about Apache Hadoop, when the question was so much broader than Hadoop? Multiple versions, same articles : Going further down on the hits of the initially asked question, we often noticed multiple versions for the same articles, with close to exactly the same content. That often led to a top 3 cluttered with irrelevant matches! Wrong versions being returned : It's fair to expect that having both 8.14.1 and 8.14.2 versions of the Elasticsearch for Apache Hadoop article, we'd return the latter for our query – but that just wasn't happening consistently. From the impact perspective, we had to stop and solve those – else, a considerable part of user queries would be affected. Let's dive into the approaches taken to solve both! A. Inaccurate semantic matching After some examination into our data, we've discovered that the root of our semantic matching issue lived in the fact that the summary field for Product Documentation-type articles generated upon ingestion by the crawler was just the first few characters of the body . This redundancy misled our semantic model, causing it to generate vector embeddings that did not accurately represent the document's content in relation to user queries. As a data problem, we had to solve this problem in the data domain: by leveraging the use of GenAI and the GPT4 model, we made a team decision to craft a new AI Enrichment Service – introduced in the 2nd installment of this blog series . We decided to create our own tool for a few specific reasons: We had unused PTU resources available. Why not use them? We needed this data gap filled quickly, as this was probably the greatest relevance detractor. We wanted a fully customizable approach to make our own experiments. Modeled to be generic, our usage for it boils down to generating four new fields for our data into a new index, using Enrich Processors to make them available to the respective documents on the target indices upon ingestion. 
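The post does not show the wiring itself, so purely as an illustrative sketch (policy, pipeline and index names are invented here, and a local cluster is assumed), making those AI-generated fields available at ingest time with an enrich processor could look roughly like this with the Python client:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: a local cluster

# 1. Enrich policy: look up the AI-generated fields by a shared key (the article URL here).
es.enrich.put_policy(
    name="ai-fields-policy",
    match={
        "indices": ["ai-enriched-fields"],   # index written by the AI Enrichment Service
        "match_field": "url",
        "enrich_fields": ["ai_fields"],
    },
)
es.enrich.execute_policy(name="ai-fields-policy")  # builds the internal lookup index

# 2. Ingest pipeline that copies the matching ai_fields onto each crawled document.
es.ingest.put_pipeline(
    id="attach-ai-fields",
    processors=[{
        "enrich": {
            "policy_name": "ai-fields-policy",
            "field": "url",
            "target_field": "enriched",
        }
    }],
)

# 3. Documents indexed through the pipeline pick up the enrichment automatically.
es.index(
    index="product-documentation",
    pipeline="attach-ai-fields",
    document={"url": "https://example.com/docs/sort-search-results",
              "title": "Sort search results"},
)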
Here's a quick view into the specification for each field to be generated: After generating those fields and setting up the index Enrich Processors , the underlying RAG-search indices were enriched with a new ai_fields object, also making ELSER embeddings available under ai_fields.ml.inference : Now, we can tune the query to use those fields, making for better overall semantic and keyword matching: Single-handedly, that made us much more relevant. More than that – it also opened a lot of new possibilities to use the AI-generated data throughout our applications – matters of which we'll talk about in future blog posts. Now, before retrying the query to check the results: what about the multiple versions problem? B. Multiple versions, same articles When duplicate content infiltrates these top positions, it diminishes the value of the data pool, thereby diluting the effectiveness of GenAI responses and leading to a suboptimal user experience. In this context, a significant challenge we encountered was the presence of multiple versions of the same article. This redundancy, while contributing to a rich collection of version-specific data, often cluttered the essential data feed to our LLM, reducing the diversity of it and therefore undermining the response quality. To address the problem, we employed the Elasticsearch API collapse parameter, sifting through the noise and prioritizing only the most relevant version of a single content. To do that, we computed a new slug field into our Product Documentation crawled documents to identify different versions of the same article, using it as the collapse field (or key ). Taking the Sort search results documentation page as an example, we have two versions of this article being crawled: Sort search results | Elasticsearch Guide [8.14] | Elastic Sort search results | Elasticsearch Guide [7.17] | Elastic Those two will generate the following slug : guide-en-elasticsearch-reference-sort-search-results Taking advantage of that, we can now tune the query to use collapse : As a result, we'll now only show the top-scored documentation in the search results, which will definitely contribute to increasing the diversity of knowledge being sent to the LLM. C. Wrong versions being returned Similar to the CVE matching problem , we can boost results based on the specific versions being mentioned, allied with the fact that version is a separate field in our index. To do that, we used the following simple regex-based function to pull off versions directly from the user question: We then add one more query to the should clause, boosting the version field accordingly and getting the right versions to the top (whenever they're mentioned): With A , B and C solved, we're probably ready to see some strong results! Let's replay the question! By replaying the previously tried question: And therefore running the Elasticsearch query once again, we get dramatically better results consisting of the following articles: Elasticsearch version 8.14.3 | Elasticsearch Guide [master] | Elastic Elasticsearch version 8.14.2 | Elasticsearch Guide [master] | Elastic Release notes | Elasticsearch Guide [8.14] | Elastic Consequently, we have a better answer generated by the LLM. More powerful than that – in the context of this conversation, the LLM is now conscious about versions of Elasticsearch that are newer than the model's cut-off date, crafting correct answers around those: Exciting, right? But how can we quantify the improvements in our query at this point? 
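Before turning to the numbers, here is a rough consolidation of the query-side ideas from B and C above in one Python sketch. The slug and version field names mirror the description, but the exact production query and boost values are not shown in the post, so treat this as illustrative only.

import re

VERSION_PATTERN = re.compile(r"\b\d+\.\d+(?:\.\d+)?\b")

def build_versioned_search(user_question: str) -> dict:
    should = [
        # Base keyword clause; the semantic clauses from earlier would sit alongside it.
        {"multi_match": {"query": user_question, "type": "cross_fields",
                         "fields": ["title", "ai_fields.summary", "body"]}},
    ]
    # C. Boost documents whose version field matches versions named in the question.
    for version in VERSION_PATTERN.findall(user_question):
        should.append({"term": {"version": {"value": version, "boost": 5}}})
    return {
        "query": {"bool": {"should": should}},
        # B. Collapse on the slug so only the best-scoring version of an article is returned.
        "collapse": {"field": "slug"},
        "size": 3,
    }

print(build_versioned_search("What changed between Elasticsearch 8.14.1 and 8.14.3?"))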
Let's see the numbers together! Measuring success To assess the performance implied by our changes, we've compiled a test suite based on user behavior, each containing a question plus a curated list of results that are considered relevant to answer it. Those will cover a wide wide range of subjects and query styles, reflecting the diverse needs of our users. Here's a complete look into it: But how do we turn those test cases into quantifiable success? To this end, we have employed Elasticsearch's Ranking Evaluation API alongside with the Precision at K (P@K) metric to determine how many relevant results are returned between the first K hits of a query. As we're interested in the top 3 results being fed into the LLM, we're making K = 3 here. To automate the computation of this metric against our curated list of questions and effectively assess our performance gains, we used TypeScript/Node.js to create a simple script wrapping everything up. First, we define a function to make the corresponding Ranking Evaluation API calls: After that, we need to define the search queries before and after the optimizations: Then, we'll output the resulting metrics for each query: Finally, by running the script against our development Elasticsearch instance, we can see the following output demonstrating the P@K or (P@3) values for each query, before and after the changes. That is – how many results on the top 3 are considered relevant to the response: Improvements observed As an archer carefully adjusts for a precise shot, our recent efforts into relevance have brought considerable improvements in precision over time. Each one of the previous enhancements, in sequence, were small steps towards achieving better accuracy in our RAG-search results, and overall user experience. Here's a look at how our efforts have improved performance across various queries: Before and after – P@K Relevant results in the top 3: ❌ = 0 , 🥉 = 1 , 🥈 = 2 , 🥇 = 3 . Query Description P@K Before P@K After Change Support Diagnostics Tool 0.333 🥉 1.000 🥇 +200% Air Gapped Maps Service 0.333 🥉 0.667 🥈 +100% CVE Implications 0.000 ❌ 1.000 🥇 ∞ Enrich Processor Setup 0.667 🥈 0.667 🥈 0% Proxy Certificates Rotation 0.333 🥉 0.333 🥉 0% Proxy Certificates Version-specific Rotation 0.333 🥉 0.333 🥉 0% Searchable Snapshot Deletion 0.667 🥈 1.000 🥇 +50% Index Lifecycle Management Usage 0.667 🥈 0.667 🥈 0% Creating Data Views via API in Kibana 0.333 🥉 0.667 🥈 +100% Kibana Data View Creation 1.000 🥇 1.000 🥇 0% Comparing Elasticsearch Versions 0.000 ❌ 0.667 🥈 ∞ Maximum Bucket Size in Aggregations 0.000 ❌ 0.333 🥉 ∞ Average P@K Improvement: +78.41% 🏆🎉 . Let's summarize a few observations about our results: Significant Improvements : With the measured overall +78.41% of relevance increase, the following queries – Support Diagnostics Tool , CVE implications , Searchable Snapshot Deletion, Comparing Elasticsearch Versions – showed substantial enhancements. These areas not only reached the podium of search relevance but did so with flying colors, significantly outpacing their initial performances! Opportunities for Optimization : Certain queries like the Enrich Processor Setup , Kibana Data View Creation and Proxy Certificates Rotation have shown reliable performances, without regressions. These results underscore the effectiveness of our core search strategies. However, those remind us that precision in search is an ongoing effort. These static results highlight where we'll focus our efforts to sharpen our aim throughout the next iterations. 
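The TypeScript snippets behind these measurements were stripped when the page was captured. As orientation only, a rough Python equivalent of a Ranking Evaluation call computing P@3 for one test case might look like this; the index name, document IDs and query are placeholders.

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: a local cluster

# One entry per curated test case: the question plus the documents judged relevant.
test_cases = [
    {"id": "cve_implications",
     "question": "What are the implications of CVE-2019-10172?",
     "relevant_ids": ["kb-cve-2019-10172"]},
]

requests = [{
    "id": case["id"],
    "request": {"query": {"multi_match": {"query": case["question"],
                                          "fields": ["title", "ai_fields.summary", "body"]}}},
    "ratings": [{"_index": "support-knowledge", "_id": doc_id, "rating": 1}
                for doc_id in case["relevant_ids"]],
} for case in test_cases]

resp = es.rank_eval(
    index="support-knowledge",
    requests=requests,
    metric={"precision": {"k": 3, "relevant_rating_threshold": 1}},
)
print("mean P@3:", resp["metric_score"])
for case_id, detail in resp["details"].items():
    print(case_id, "P@3 =", detail["metric_score"])

Running such a script once with the old query and once with the tuned query gives before/after P@3 pairs of the kind shown in the table above.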
As we continue, we'll also expand our test suite, incorporating more diverse and meticulously selected use cases to ensure our enhancements are both relevant and robust. What's next? 🔎 The path ahead is marked by opportunities for further gains, and with each iteration, we aim to push the RAG implementation performance and overall experience even higher. With that, let's discuss areas that we're currently interested in! Our data can be futher optimized for search : Although we have a large base of sources, we observed that having semantically close search candidates often led to less effective chatbot responses. Some of the crawled pages aren't really valuable, and often generate noise that impacts relevance negatively. To solve that, we can curate and enhance our existing knowledge base by applying a plethora of techniques, making it lean and effective to ensure an optimal search experience. Chatbots must handle conversations – and so must RAG searches : It's common user behavior to ask follow-up questions to the chatbot. A question asking \"How to configure Elasticsearch on a Linux machine?\" followed by \"What about Windows?\" should query something like \"How to configure Elasticsearch on a Linux machine?\" (not the raw 2nd question). The RAG query approach should find the most relevant content regarding the entire context of the conversation. Conditional context inclusion : By extracting the semantic meaning of the user question, it would be possible to conditionally include pieces of data as context, saving token limits, making the generated content even more relevant, and potentially saving round trips for search and external services. Conclusion In this installment of our series on GenAI for Customer Support, we have thoroughly explored the enhancements to the Retrieval-Augmented Generation (RAG) search within Elastic's customer support systems. By refining the interaction between large language models and our search algorithms, we have successfully elevated the precision and effectiveness of the Support AI Assistant. Looking ahead, we aim to further optimize our search capabilities and expand our understanding of user interactions. This continuous improvement will focus on refining our AI models and search algorithms to better serve user needs and enhance overall customer satisfaction. Stay tuned for more insights and updates as we continue to push the boundaries of what's possible with AI in customer support, and don't forget to join us in our next discussion, where we'll explore how Observability plays a critical role in monitoring, diagnosing, and optimizing the performance and reliability of the Support AI Assistant as we scale! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. 
SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Jump to RAG tuning: A search problem Our first approach Data Query Generation Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "GenAI for customer support — Part 4: Tuning RAG search for relevance - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elser-rag-search-for-relevance", + "meta_description": "Discover how to fine tune RAG search for relevance with a GenAI for customer support use case. Learn challenges and strategies for tuning RAG with ELSER and the Elastic Stack." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. ES|QL Javascript How To JM By: Josh Mock On June 3, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction In a recent article, Laura highlighted how to use the Java Elasticsearch client to craft ES|QL queries and parse their results as native Java objects. Similar functionality, with TypeScript support, will be available in the JavaScript client in the upcoming 8.14.0 release. This blog explains how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. Implementation: ES|QL queries to TypeScript types with the Elasticsearch JavaScript client First, let's use the bulk helper to index some data: Now, let's use a very basic ES|QL query to look at these newly indexed documents: Returning each row as an array of values is a simple default that's useful in many cases. 
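The blog's own snippets use the JavaScript client and did not survive the page capture. For orientation, a roughly equivalent call with a recent 8.x Python client (index name and fields invented here) returns the same default shape, a list of column descriptors plus one array of values per row:

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: a local cluster

resp = es.esql.query(query="FROM books | KEEP title, author, rating | LIMIT 5")

print(resp["columns"])  # e.g. [{"name": "title", "type": "text"}, ...]
print(resp["values"])   # one array of values per row, in the same order as the columns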
Still, if you wanted an array of records instead—a standard structure in JavaScript applications—extra effort is necessary to convert the data. Fortunately, in 8.14.0, the JavaScript client will include a new ES|QL helper to do this for you: If you're using TypeScript, you can declare the type of the results inline: In another situation, you might need to declare a type that doesn't perfectly match the query results. Maybe the keys and columns don't have the same names, or the returned documents have columns that don't exist in your type. In this case, we can use the ES|QL RENAME and KEEP processing commands to modify our results to better fit your type: Conclusion These are relatively simple examples that highlight how to use the new ES|QL helper in the JavaScript client, so check out the docs for full details. Future releases of the JavaScript client will likely include more ES|QL helpers, like pagination over large result sets using generators, and support for Apache Arrow. All of our official clients have plans to include similar helpers and tools to make working with ES|QL queries as simple as possible. Check the changelog for your preferred client in the coming weeks and months! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction Implementation: ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "ES|QL queries to TypeScript types with the Elasticsearch JavaScript client - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-javascript-helper-typescript", + "meta_description": "Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Fast Kibana Dashboards From 8.13 to 8.17, the wait time for data to appear on a dashboard has improved by up to 40%. These improvements are validated both in our synthetic benchmarking environment and from metrics collected in real user’s cloud environments. Developer Experience TN By: Thomas Neirynck On March 3, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Almost all Elastic stack users at some point will use a Dashboard. Elastic ships many out-of-the-box dashboards for all its integrations, and users will create custom dashboards: to share with others, do a root cause analysis, and/or generate a PNG report. As a result, the Dashboard app is one of the most used applications in Kibana (for those wondering, the other most popular one is Discover). In addition, many of the core application components of Dashboards support other Kibana applications. Kibana’s widget framework for embedding charts and tables was originally developed for Dashboards, or the data plugin, which brokers search requests to Elasticsearch from the Kibana browser app and is heavily used by all Kibana plugins. In other words, the Dashboard application – both from a product and technical perspective – is central to Kibana, and deserves to be the best experience for users. Yet, dashboards in Kibana could feel “sluggish”. We would experience it as developers, and we would hear it from users. Comparisons with other tools (like Grafana, or to a lesser extent OpenSearch Dashboards) also showed that Dashboards in Kibana sometimes tend to feel slower. For this reason, the Kibana Team recently undertook an effort to bring down the render time of a Dashboard. Identifying the challenge Looking at real user telemetry (we’ll discuss this more below) we see a clear 80/20 division in the render time of Dashboards. a) On one hand, some dashboards take tens of seconds to load The time here is dominated by (very) long-running Elasticsearch queries. The 95th percentile (i.e. the minority) of dashboard rendering times sits significantly above the mean. This does not have to be unexpected; for example, the search the panels issue can span a large time-range, queries hit cold or frozen storage tiers that are not optimized for analytical workloads. These are a minority of dashboards, “the long tail”. Note also a clear seasonality in the data, with longer render times during the weekday versus the weekend. This likely indicates that the “worst case” is influenced by the overall load on the cluster including ingest (not just the runtime analytical queries), which tends to be elevated during working hours. (b) On the other hand, the majority of dashboards load in the low-second range. The 75% and under (the green, blue, and red lines) are significantly less, but they still take 1-3 seconds. When searches to Elasticsearch complete quickly, then where is the time going in Kibana? 
Why should a Kibana dashboard ever feel sluggish, especially when a deployment is well provisioned and searches complete quickly? In this initial phase of the project—and what we’ll summarize in this blog post—we decided in spring 2024 to focus on improving (b): the render time of the 75 percentile of dashboards, and ensure these become more snappy and pleasurable to work with. We did not forget about (a)! At the end of the post, we will highlight the initiatives that will bring down the render time for this top 20th percentile too. Telemetry From the start, we realized we had poor measurements of the render time of Dashboards. Existing instrumentation did not capture the various stages of the page load of a dashboard. It was also important to align metrics with what we could make actionable in day to day development, and what we could collect from users in the real world. a) What to measure From a high level, we introduced three core metrics, each capturing a specific span of the dashboard load. These can be stacked neatly bottom-to-top when opening a dashboard by its unique URL in a fresh tab. Metric What When Frequency kibana_loaded Initial asset loading session (“the spinner”) Opening Kibana for the first time Once per user session dashboard_overhead Bootstrapping of dashboard app and panels Opening a specific dashboard Once per dashboard time_to_data Time for data to appear on screen Changing a filter, a time-range, a control… Once per query-state change This instrumentation is largely implemented ad-hoc. Conventional Core Web Vitals ( https://web.dev/articles/vitals ) do not exactly measure what we were looking for, although we can draw some parallels. Specifically, Time-To-Interactive (TTI) corresponds roughly to the end of “kibana_loaded”. After that, the app is usable, although not all UX may have fully initialized yet. For example, the dashboard controls may not have been initialized yet, even though the dashboard-app technically can start responding to inputs. Another tricky metric was to match the “completion” of a dashboard to when all data is on screen, comparable in spirit to Largest Contentful Paint (LCP). This is the result of AJAX data-requests and front-end rendering which are custom to each panel-type. So to gather it correctly, each panel on a dashboard needs to report its “completion” state. For some charts, this is fairly straightforward (e.g. a simple metric-chart), but for others this is complicated. For example, a map is only complete when all tiles are rendered on screen. Detecting this is not trivial. After all panel-types report “completion” correctly, it is then up to the dashboard-application itself to observe all these “completion” events and report the “time-to-data” metric accordingly after the last completion. Additionally, further segmentations of these benchmarks were introduced, as well as additional metadata like the number of panels on a dashboard or whether the user navigated from within Kibana (some assets are loaded already) or from outside Kibana into the dashboard (no assets have been loaded). The app is also collecting metrics on the server on the duration of data requests. b) Importance of each segment Each of these segments has different semantics. Walking top-down the metric list: When it comes to “snappyness”, it is largely dominated by time-2-data. Each time a user manipulates the filter-state (e.g. time range), the user will need to wait for the new data to occur on screen. It is here where “lag” matters most. Think of a video game. 
Players may tolerate the loading icon before the level starts, but they then expect responsive gameplay when starting to control the game. This is the same for a dashboard. Users interact with charts, controls, filters… these interactions are the “gameplay” of a dashboard, and they determine how users experience responsiveness. Dashboard_overhead is relevant as well. It is the time it takes to load a dashboard configuration (which is a document retrieved from a system index in Elasticsearch). It also includes some additional code loading. This is because in the Kibana plugin-system, some code is loaded ad-hoc. To give an example: suppose a Dashboard has a swimlane-panel. The dashboard-app would initialize a “swimlane”-embeddable. If during that Kibana-session it is the first time the swimlane is being loaded, it will be up to the swimlane-embedable to ensure it has loaded all the swimlane-specific code before rendering. A deep-dive in the Kibana plugin system would take us too far, but summarized: some of the code loading footprint is captured in this “dashboard_overhead”, not just “kibana_loaded”. kibana_loaded only occurs a single time for the entirety of a Kibana user session, and is not dashboard specific. Users can navigate to Dashboards from many other Kibana-pages. For this reason, we wanted to isolate kibana_loaded from the specific experience of Dashboards. While being the largest “chunk” of time, it is also the least relevant when it comes to snappyness of the overall Kibana experience. Benchmarking Instrumentation in place, we now actually need to collect these in the appropriate environments: in benchmarking deployments, and real user deployments. a) In CI Performance metrics are being collected for a handful of representative dashboards. These have similar configurations as our integrations, each dashboard containing a mix of chart types. An example of a dashboard benchmark These benchmarks run every three hours on Kibana’s main release branch. The runner spins up a single Elasticsearch and Kibana cluster on dedicated hardware. Metrics are collected from Playwright scripts in a Chromium headless browser. b) In the wild The same metrics are reported for Elastic Cloud and Serverless users as well, and for self-hosted users who opt in into telemetry. While the benchmarks in CI provide an actionable signal at a point in time, collecting these metrics in the wild provide a backward looking signal that helps validate whether the movement of our benchmarks reflects in the real world. Later in the post, you will read how both evolved for the better part of last year. A note on process (meetings! alerts!) There is not a single engineer or team who “owns” the dashboard experience, as the display panels have been developed from all across Engineering. To align effort, some logistic arrangements have proven useful. Reporting Twice weekly reporting provides an opportunity to review the evolution of telemetry and discuss any ongoing related work. It is easiest to consider these as mini-retrospectives. Responding to regressions Benchmarks have proven to be quite reliable, the runs being highly reproducible. When there is negative or unexpected movement in the benchmark, it usually reliably indicates a regression that requires action. There is no hard and fast rule on how to respond. Significant regressions are generally rolled back immediately. Other times, we may let the offending commit ride, until a bug fix is merged. 
This is decided on a case-by-case basis, and really depends on the nature of the change. A typical smile-pattern in our benchmarks. A regression was detected and resolved. Ad-hoc runs - validating hypotheses As familiarity with the tooling has grown, we are also triggering more ad-hoc performance runs on individual PRs before they merge. This allows for more rapid validation of code changes before they land in the main development trunk. Improvements With all this yak-shaving out of the way, let’s finally get to the interesting part. There has been no silver bullet. Kibana spends time everywhere, each piece adding a marginal overhead, which adds up in the aggregate. The improvements to Dashboard rendering have come from improved hygiene in many layers of the application. Below, we break these down into a couple of major themes. Reducing code and asset loading Efficient code loading is currently one of the largest challenges in Kibana. Kibana’s plugin architecture is very flexible in that it allows for fast turnaround in adding new pages and apps. This flexibility does come with costs, two of which are critically related to JavaScript code loading in the browser. One is that code gets loaded that is never required; the other is that asset-loading tends to be fragmented, with multiple small JavaScript assets to support a single app rather than fewer larger files. The first is particularly a problem for Dashboards. Many plugins register widgets with the dashboard app, for example a maps panel, a swimlane panel, etc. However, most dashboards will never display a map or swimlane. Example: plugins can add pop-up links to dashboard panels. These are context dependent. Before: pseudo-code of how a plugin registers a pop-up item for dashboard panels. This pattern causes unnecessary code to be loaded. To avoid this, a new pattern was introduced that allows clients to delay the code loading until it is needed. After: code only included when it is required. Isolate the definition in a different module ./case_action.ts ./plugin.ts The other issue is that initialization of plugins would block responsiveness, causing a waterfall effect of asset loading getting serialized rather than parallelized. One example of this is that dashboard controls used to block rendering of the page, causing all panels to have to wait until the controls had loaded all their assets. This is of course not necessary, as rendering should be able to start before controls are fully initialized. Many of these code-loading issues were addressed, contributing to overall responsiveness. Since Kibana has many plugins (over 200 and counting), it will require sustained attention to address the misuse of both of these inefficient code loading patterns. Embeddables and rendering A big effort from 2024 in Kibana was the embeddable refactor. This effort had a few goals, of which performance was largely an incidental concern. The effort had to enable a few critical features (like collapsible panels), remove a number of unused code paths (largely Angular-related), improve testability, and simplify some of the DevX when working with the APIs. You can read more about this here . One way the embeddable refactor has allowed us to tidy up dashboard performance is by consolidating all rendering in a single React tree. Before, each panel was rendered into its own render-tree with ReactDOM.render() . This architecture was an artifact of a time when Kibana had both Angular and React-based (and jQuery, ahem) widgets. 
This mix of rendering technologies has not existed in Kibana for over 4 years, and Kibana has been fully standardized on React as the only UX rendering library. However, the Dashboard carried that legacy with an additional abstraction layer. Reducing the number of state changes and re-renderings that panels respond to has been a marginal improvement to the Dashboard app, overall increasing its responsiveness. The reduction in code has also helped reduce the app’s footprint. Avoid crufty memory allocation The code for charts and tables will re-organize the data received from Elasticsearch into a data structure that is easier to manipulate for display. It performs a “flattening” step that takes a nested data structure from the ES-response, and turns it into a one-dimensional array, where each item of the array corresponds to a single feature (e.g. a row in a table, a bar in a barchart…). E.g. consider a nested ES-doc with many sub-fields, or the hierarchical organization of buckets from an ES aggregation search. The implementations of these flattenings often allocated short-lived objects, like object literals, or lambdas () => {}. Frequent use of array iteration methods like .map or .reduce is a pattern where such object allocation sneaks in easily. Since these flattening-operations all occur in tight recursive loops (thousands of documents, hundreds of buckets) and given that dashboards may contain multiple tables and multiple charts, these allocations can add up quickly. Liberal heap allocation like this also cuts into the user-experience twice: once at construction, but also as a strain on the garbage collector (garbage collection is less predictable, but tends to contribute to hiccups in the frame rate). Our benchmarks showed meaningful improvements (between 5 and 10%) from removing some of the most glaring allocations in a few of these tight loops. Data transfer improvements The data request roundtrip from a Dashboard running in the Kibana browser to and from Elasticsearch was batched. Multiple requests would be collected and the Kibana server would fan these out to Elasticsearch as individual _async_search requests, and combine these ES-JSON responses into a new Kibana-specific format. The main reason for this batching was that it side-steps the browser connection limit for HTTP/1, which sits at around 6 concurrent HTTP requests, something which is easily exceeded on Dashboards with multiple panels. This approach has two main disadvantages. It adds a small delay to collect the batches. It also puts a strain on the Kibana server to wait for and re-encode the ES-responses. The Kibana server would need to unzip them first, decode the JSON, concatenate the responses, and then gzip again. While this re-encoding step is generally small, in the worst case (for example, for large responses), it could be significant. It would also add significant memory pressure, occasionally causing Out Of Memory issues. Given that the proxies that sit in front of Elastic Cloud and Serverless already support HTTP 2.0, and that Kibana will start to support HTTP 2.0 for stateful in 9.0, it was decided to remove this batching. In addition, the Kibana server no longer re-encodes the data and instead streams the original gzipped result from Elasticsearch. This greatly simplifies the data-transfer architecture and, in combination with running over HTTP 2.0, has shown some nice performance improvements. 
Apart from the performance benefits (faster, less sensitive to OOM), debuggability has improved greatly due to the simplified architecture and the fact that data-responses can now easily be inspected in the browser’s debugger. Outcomes The aggregate outcome of these changes has been significant, and is reflected both in the benchmarks and in the user telemetry. Benchmark evolution The chart below shows the metrics for a mixture of multiple benchmarks, and how they have evolved over the last 6 months. We can see an overall drop from around 3500ms to 2000ms (*the most recent uptick is related to a current effort to change the theme in Kibana. During this migration-phase we ship multiple themes. This will be removed over time. There are also a few gaps where the CI-runs had keeled over). Real world users The real world, as explained in the intro, is harder to measure. We just do not know exactly which dashboards users are running, and how their configurations evolve over time. However, looking at it from two different perspectives, we can verify the same evolution as in the synthetic benchmarking environment. First, over time we see a drop in time to render at the 75th percentile. It allows us to say that, on average, the experience of users on a dashboard in January 2025 is significantly better than in June 2024. Dashboard render-time for the 25th, 50th, and 75th percentiles We can also compare the mean time_to_data by version for all users in the last week. Users on 8.17 wait noticeably less time for data to appear on screen than users on 8.12. The drop in the real world is also slightly delayed from what we are observing in our benchmarks, which is roughly in line with the cadence at which the stack is released. Looking ahead So the curve is trending downwards, largely through many small shaves. There are some significant areas where this approach of trimming the fat will eventually lead to diminishing returns. Below are some areas we believe will benefit from more structural changes in how we approach the problem. Code loading continued We have not discussed the kibana_loaded metric very much in this blog post. If we had to characterize it: the Kibana plugin-architecture is optimized for allowing applications to load code ad-hoc, with the code-packaging process producing many JavaScript code bundles. However, in practice we do see unnecessary code being loaded, as well as “waterfall” code-loading where code loading may block rendering of the UX. All in all, things could improve here (see above, “Reducing code and asset loading”). The team is currently engaging in a wider-ranging effort, “Sustainable Kibana”, which includes revisiting how we package and deliver code to the browser. We anticipate more benefits to materialize here later. When they do, be sure to check the blog post! Dealing with slow searches Long searches take a long time. It is what it is. An Elasticsearch search can be slow for many reasons, and this doesn’t even need to be a bug. Consider complex aggregations on a large cluster with terabytes of data, over a long time range hitting hundreds of nodes on different storage tiers. Depending on how the cluster is provisioned, this could end up being a slow search. Kibana needs to be resilient in the face of this query-dependent limitation. In such a scenario, we cannot improve the “snappiness” of a Dashboard with low-level improvements in Kibana (like those we have discussed above). To address inherent slowness, there needs to be a suite of new features that allow users to opt into a “fast” mode. 
Consider for example sampling the data, where users can trade speed for accuracy. Or consider improving the perceived performance by incrementally filling in the charts as the data comes in. Or allowing users to hide parts of the Dashboards with collapsible panels (they’re coming!). These changes will straddle the line between product feature and low-level tech improvement. Chart and query dependent sluggishness The current effort has mainly sought improvements in lower-level components with broad impact. However, there are instances where dashboards are slow due to very specific chart configurations. E.g. computing the “other” bucket, unique count aggregation over data with high cardinality, … Identifying these charts and queries will allow for more targeted optimizations. Are the defaults correct (e.g. do all charts require an “other” bucket)? Are there more efficient ways to query for the same data? Adding Discover to the program All the pain points of Dashboards are also the same pain points in Discover (pluggability of charts, data-heavy, requirements to be responsive…). So we have rolled out this program to guide development in the Discover app. Already, we are seeing some nice gains in Discover, and we’re looking to build on this momentum. This too would deserve its own blog post, so stay tuned! Conclusion Dashboards are getting faster in Kibana. Recent improvements are due to the compound effect of many lower-level optimizations. To progress even further, we anticipate a two-pronged approach: First, continue this theme of improved hygiene. Second, expand to a broader program that will allow us to address the “long tail” of causes contributing to slowness. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote Jump to Identifying the challenge a) On one hand, some dashboards take tens of seconds to load (b) On the other hand, the majority of dashboards load in the low-second range. Telemetry a) What to measure Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Fast Kibana Dashboards - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/kibana-dashboard-rendering-time", + "meta_description": "Kibana Dashboards are getting faster. Explore recent improvements that bring down the render-time of a dashboard." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. Integrations Python How To JR By: Jeffrey Rengifo On April 24, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. AutoGen is a Microsoft framework for building applications that can act with human intervention or autonomously. It provides a complete ecosystem with different abstraction levels, depending on how much you need to customize. If you want to read more about agents and how they work, I recommend you read this article. Image Source: https://github.com/microsoft/autogen AgentChat allows you to easily instantiate preset agents on top of the AutoGen core so you can configure the model prompts, tools, etc. On top of AgentChat, you can use extensions that allow you to extend its functionalities. The extensions are both from the official library and community-based. The highest level of abstraction is Magnetic-One , a generalist multi-agent system designed for complex tasks. It comes preconfigured in the paper explaining this approach. AutoGen is known for fostering communication among agents, proposing groundbreaking patterns like: Group chat Multi agent debate Mixture of agents Concurrent agents Handoffs In this article, we will create an agent that uses Elasticsearch as a semantic search tool to collaborate with other agents and look for the perfect match between candidate profiles stored in Elasticsearch and job offers online. We will create a group of agents that share Elasticsearch and online information to try to match the candidates with job offers. We'll use the ´ Group Chat ´ pattern where an admin moderates the conversation and runs tasks while each agent specializes in a task. The complete example is available in this Notebook . Steps Install dependencies and import packages Prepare data Configure Agents Configure tools Run task Install dependencies and import packages Prepare data Setup keys For the agent AI endpoint, we need to provide an OpenAI API key. We will also need a Serper API key to give the agent search capabilities. Serper gives 2,500 search calls for free on sign-up . We use Serper to give our agent internet access capabilities, more specifically finding Google results. The agent can send a search query via API and Serper will return the top Google results. 
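To make the Serper integration concrete, here is a minimal sketch of the kind of helper the researcher agent could call. It assumes the standard google.serper.dev/search endpoint and a SERPER_API_KEY environment variable; the function name and parameters are illustrative, not the exact code from the notebook.

```python
import os
import requests

def serper_search(query: str, num_results: int = 5) -> list[dict]:
    """Query Google via the Serper API and return the organic results.

    Assumes SERPER_API_KEY is set in the environment (see https://serper.dev).
    """
    response = requests.post(
        "https://google.serper.dev/search",
        headers={
            "X-API-KEY": os.environ["SERPER_API_KEY"],
            "Content-Type": "application/json",
        },
        json={"q": query, "num": num_results},
        timeout=30,
    )
    response.raise_for_status()
    # Each organic hit carries a title, link, and snippet the agent can reason over.
    return response.json().get("organic", [])[:num_results]

# Example: the researcher agent might call
# serper_search("remote machine learning engineer job offers")
```

Each organic result carries a title, link, and snippet, which is enough context for the matcher agent to compare offers against candidate profiles.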
Elasticsearch client Inference endpoint and mappings To enable semantic search capabilities, we need to create an inference endpoint using ELSER . ELSER allows us to run semantic or hybrid queries, so we can give broad tasks to our agents and semantically related documents from Elasticsearch will show up with no need of typing keywords that are present in the documents. Mappings For the mappings, we are going to copy all the relevant text fields into the semantic_text field so we can run semantic or hybrid queries against the data. Ingesting documents to Elasticsearch We are going to load data about the job applicants and ask our agents to find the ideal job for each of them based on their experience and expected salary. Configure agents AI endpoint configuration Let’s configure the AI endpoint based on the environment variables we defined in the first step. Create agents We´ll begin by creating the admin that will moderate the conversation and run the tasks the agents propose. Then, we´ll create the agents that will carry out each task: Admin: leads the conversation and executes the other agents’ actions. Researcher : navigates online searching for job offers. Retriever : looks up candidates in Elastic. Matcher : tries to match the offers and the candidates. Critic : evaluates the quality of a match before providing the final answer. Configure tools For this project, we need to create two tools: one to search in Elasticsearch and another to search online. Tools are a Python function that we will register and assign to agents next. Tools methods Assigning tools to agents For the tools to work properly, we need to define a caller that will determine the parameters for the function and an executor that will run said function. We will define the admin as the executor and the respective agent as the caller. Run task We will now define a group chat with all agents, where the administrator assigns turns for each agent to define the task it wants to call and end it once the defined conditions are met based on previous instructions. Reasoning (Formatted for readability) The output will look like this: Next speaker: Matcher Next speaker: Retriever Next speaker: Admin Admin (to chat_manager): Researcher (to chat_manager): Next speaker: Critic Admin has been informed of the successful candidate-job offer matches. Results (Formatted for readability) Note that at the end of each Elasticsearch stored candidate, you can see a match field with the job listing that best fits them! Conclusion AutoGen allows you to create groups of agents that work together to solve a problem with different degrees of complexity. One of the available patterns is 'group chat,' where an admin leads a conversation among agents to reach a successful solution. You can add more features to the project by creating more agents. For example, storing the matches provided back into Elasticsearch, and then automatically applying to the job offers using the WebSurfer agent . The WebSurfer agent can navigate websites using visual models and a headless browser. To index documents in Elasticsearch, you can use a tool similar to elasticsearch_hybrid_search, but with added ingestion logic. Then, create a special agent “ingestor” to achieve indexing. Once you have that, you can implement the WebSurfer agent by following the official documentation . 
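As referenced above, here is a minimal sketch of what the Elasticsearch side of this setup could look like: an index whose text fields are copied into a semantic_text field backed by an ELSER inference endpoint, plus a semantic search function of the kind the retriever agent would use as a tool. The index name, field names, the ".elser-2-elasticsearch" endpoint id, and the elasticsearch_semantic_search function are illustrative assumptions for a recent 8.x cluster, not the exact code from the notebook.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Candidate profiles: copy the free-text fields into a semantic_text field so
# ELSER can power semantic or hybrid retrieval. ".elser-2-elasticsearch" is the
# preconfigured ELSER endpoint on recent clusters; adjust if you created your own.
es.indices.create(
    index="candidates",
    mappings={
        "properties": {
            "name": {"type": "text", "copy_to": "profile_semantic"},
            "experience": {"type": "text", "copy_to": "profile_semantic"},
            "expected_salary": {"type": "text", "copy_to": "profile_semantic"},
            "profile_semantic": {
                "type": "semantic_text",
                "inference_id": ".elser-2-elasticsearch",
            },
        }
    },
)

def elasticsearch_semantic_search(query: str, size: int = 5) -> list[dict]:
    """Tool the retriever agent can call: semantic search over candidate profiles."""
    resp = es.search(
        index="candidates",
        query={"semantic": {"field": "profile_semantic", "query": query}},
        size=size,
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]
```

In AutoGen, a function like this would then be registered with the retriever agent as the caller and the admin as the executor, as described in the "Assigning tools to agents" step above.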
Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Install dependencies and import packages Prepare data Setup keys Elasticsearch client Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using AutoGen with Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/using-autogen-with-elasticsearch", + "meta_description": "Learn to create an Elasticsearch tool for your agents with AutoGen. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development Microsoft Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. With the release of Semantic Kernel Elasticsearch Vector Store Connector, developers using Semantic Kernel for building AI agents can now plugin Elasticsearch as a scalable enterprise-grade vector store while continuing to use Semantic Kernel abstractions. Generative AI .NET Integrations Vector Database How To FB SM By: Florian Bernd and Srikanth Manvi On December 6, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. 
In collaboration with the Microsoft Semantic Kernel team, we are announcing the availability of Semantic Kernel Elasticsearch Vector Store Connector , for Microsoft Semantic Kernel (.NET) users. Semantic Kernel simplifies building enterprise-grade AI agents, including the capability to enhance large language models (LLMs) with more relevant, data-driven responses from a Vector Store. Semantic Kernel provides a seamless abstraction layer for interacting with Vector Stores like Elasticsearch, offering essential features such as creating, listing, and deleting collections of records and uploading, retrieving, deleting individual records. The out-of-the-box Semantic Kernel Elasticsearch Vector Store Connector supports the Semantic Kernel vector store abstractions which make it very easy for developers to plugin Elasticsearch as a vector store while building AI agents. Elasticsearch has a strong foundation in the open-source community and recently adopted the AGPL license . Combined with the open-source Microsoft Semantic Kernel, these tools offer a powerful, enterprise-ready solution. You can get started locally by spinning up Elasticsearch in a few minutes by running this command curl -fsSL https://elastic.co/start-local | sh (refer start-local for details) and move to cloud-hosted or self-hosted versions while productionizing your AI agents. In this blog we look at how to use Semantic Kernel Elasticsearch Vector Store Connector when using Semantic Kernel. A Python version of the connector will be made available in the future. High-level scenario In the following section we go through an example. At a high-level we are building a RAG (Retrieval Augmented Generation) application which takes a user's question as input and returns an answer. We will use Azure OpenAI ( local LLM can be used as well) as the LLM, Elasticsearch as the vector store and Semantic Kernel (.net) as the framework to tie all components together. If you are not familiar with RAG architectures, you can have a quick introduction with this article: https://www.elastic.co/search-labs/blog/retrieval-augmented-generation-rag . The answer is generated by the LLM which is fed with context, relevant to the question, retrieved from Elasticsearch vectorstore. The response also includes the source that was used as the context by the LLM. RAG Example In this specific example, we build an application that allows users to ask questions about hotels stored in an internal hotel database. The user could e.g. search for a specific hotel, based on different criteria, or ask for a list of hotels. For the example database, we generated a list of hotels containing 100 entries. The sample size is intentionally small to allow you to try out the connector demo as easily as possible. In a real-world application, the Elasticsearch connector would show its advantages over other options, such as the `InMemory` vector store implementation, especially when working with extremely large amounts of data. The complete demo application can be found in the Elasticsearch vector store connector repository . 
Let’s start with adding the required NuGet packages and using directives to our project: We can now create our data model and provide it with Semantic Kernel specific attributes to define the storage model schema and some hints for the text search: The Storage Model Schema attributes (`VectorStore*`) are most relevant for the actual use of the Elasticsearch Vector Store Connector, namely: VectorStoreRecordKey to mark a property on a record class as the key under which the record is stored in a vector store. VectorStoreRecordData to mark a property on a record class as 'data'. VectorStoreRecordVector to mark a property on a record class as a vector. All of these attributes accept various optional parameters that can be used to further customize the storage model. In the case of VectorStoreRecordKey , for example, it is possible to specify a different distance function or a different index type. The text search attributes ( TextSearch* ) will be important in the last step of this example. We will come back to them later. In the next step, we initialize the Semantic Kernel engine and obtain references to the core services. In a real world application, dependency injection should be used instead of directly accessing the service collection. The same thing applies to the hardcoded configuration and secrets, which should be read using a configuration provider instead: The vectorStoreCollection service can now be used to create the collection and to ingest a few demo records : This shows how Semantic Kernel reduces the use of a vector store with all its complexity to a few simple method calls. Under the hood, a new index is created in Elasticsearch and all the necessary property mappings are created. Our data set is then mapped completely transparently into the storage model and finally stored in the index. Below is how the mappings look in Elasticsearch. The embeddings.GenerateEmbeddingsAsync() calls transparently called the configured Azure AI Embeddings Generation service. Even more magic can be observed in the last step of this demo. With just a single call to InvokePromptAsync , all of the following operations are performed when the user asks a question about the data: 1. An embedding for the user's question is generated 2. The vector store is searched for relevant entries 3. The results of the query are inserted into a prompt template 4. The actual query in the form of the final prompt is sent to the AI chat completion service Remember the TextSearch* attributes, we previously defined on our data model? These attributes enable us to use corresponding placeholders in our prompt template which are automatically populated with the information from our entries in the vector store. The final response to our question \"Please show me all hotels that have a rooftop bar.\" is as follows: The answer correctly refers to the following entry in our hotels.csv This example shows very well how the use of Microsoft Semantic Kernel achieves a significant reduction in complexity through its well thought abstractions, as well as enabling a very high level of flexibility. By changing a single line of code, for example, the vector store or the AI services used can be replaced without having to refactor any other part of the code. At the same time, the framework provides an enormous set of high-level functionality, such as the `InvokePrompt` function, or the template or search plugin system. The complete demo application can be found in the Elasticsearch vector store connector repository. 
What else is possible with ES Elasticsearch new semantic_text mapping: Simplifying semantic search Semantic reranking in Elasticsearch with retrievers Advanced RAG techniques part 1: Data processing Advanced RAG techniques part 2: Querying and testing Building RAG with Llama 3 open-source and Elastic A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch What's next? We showed how the Elasticsearch vector store can be easily plugged into Semantic Kernel while building GenAI applications in .NET. Stay tuned for a Python integration next. As Semantic Kernel builds abstractions for advanced search features like hybrid search , the Elasticsearch connect will enable .NET developers to easily implement them while using Semantic Kernel. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to High-level scenario RAG Example What else is possible with ES What's next? Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-connector-microsoft-semantic-kernel", + "meta_description": " Microsoft Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. 
With the release of Semantic Kernel Elasticsearch Vector Store Connector, developers using Semantic Kernel for building AI agents can now plugin Elasticsearch as a scalable enterprise-grade vector store while continuing to use Semantic Kernel abstractions." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial ES|QL Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau ES|QL December 31, 2024 Improving the ES|QL editor experience in Kibana With the new ES|QL language becoming GA, a new editor experience has been developed in Kibana to help users write faster and better queries. Features like live validation, improved autocomplete and quick fixes will streamline the ES|QL experience. ML By: Marco Liberati ES|QL Ruby +1 October 24, 2024 How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. FB By: Fernando Briano ES|QL Python +1 September 5, 2024 From ES|QL to native Pandas dataframes in Python Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples. QP By: Quentin Pradet ES|QL Python August 20, 2024 An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. BA By: Baha Azarmi ES|QL How To June 5, 2024 Elasticsearch piped query language, ES|QL, now generally available Elasticsearch Query Language (ES|QL) is now GA. Explore ES|QL's capabilities, learn about ES|QL in Kibana and discover future advancements. CL GK By: Costin Leau and George Kobar ES|QL Javascript +1 June 3, 2024 ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. JM By: Josh Mock ES|QL Java +1 May 2, 2024 ES|QL queries to Java objects Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples. LT By: Laura Trotta 1 2 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "ES|QL - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/esql", + "meta_description": "ES|QL articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Filtering in ES|QL using full text search 8.17 included match and qstr functions in ES|QL, that can be used to perform full text filtering. 8.18 removed limitations on their usage. This article describes what they do, how they can be used, the difference with the existing text filtering methods, current limitations and future improvements. Search Analytics CD By: Carlos Delgado On January 10, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. ES|QL now includes full text functions that can be used to filter your data using text queries. We will review the available text filtering methods and understand why these functions provide a better alternative. We will also look at the future improvements for full text functions in ES|QL. Filtering text with ES|QL Text data in logs is critical for understanding, monitoring, and troubleshooting systems and applications. The unstructured nature of text allows for flexibility in capturing all sorts of information. Being unstructured, we need ways of isolating specific patterns, keywords, or phrases. Be it searching for an error message, narrowing down results using tags, or looking for a specific host name, are things that we do all the time to refine our results and eventually obtain the information we're looking for. ES|QL provides different methods to help you work with text. Elasticsearch 8.17 adds the full text functions match and qstr in tech preview to help tackle more complex search use cases. Limitations of text filtering ES|QL already provided text filtering capabilities, including: Text equality, to compare full strings directly using the equality operator . String start and end, using the STARTS_WITH and ENDS_WITH functions. Pattern and regex matching with the LIKE and RLIKE operators. Text filtering is useful - but it can fall short on text oriented use cases: Multivalued fields Using ES|QL functions with multivalued fields can be tricky - functions return null when applied to a multivalued field. If you need to apply a function to a multivalued field, you first need to transform the value to a single value using MV_CONCAT so you can match on a single value: Analyzed text Analyzers are incredibly useful for full text search as they allow transforming text. They allow us to extract and modify the indexed text, and modify the queries so we can maximize the possibility of finding what we're looking for. Text is not analyzed when using text filtering. This means for example that you need to match the text case when searching, or create regexes / patterns that address possible case differences. This can become more problematic when looking for multilingual text (so you can't use ASCII folding ), trying to match on parts of paths ( path hierarchy ), or removing stopwords . Performance Pattern matching and regexes take time. Lucene can do a lot of the heavy lifting by creating finite automata to match using the indexed terms dictionary, but nonetheless it's a computationally intensive process. 
As you can see in our 8.17 release blog , using regular expressions can be up to 50-1000x slower than using full text functions for text filtering, depending on your data set. Enter full text functions Elasticsearch 8.17 and Serverless introduced two new functions as tech preview for text matching: MATCH and query string (abbreviated QSTR ). These functions address some of the limitations that existed for text filtering: They can be used directly on multivalued fields . They will return results when any of the values in a multivalued field matches the query. They use analyzers for text fields. The query will be analyzed using any existing analyzers for the target fields, which will allow matching regardless of case. This also unlocks ASCII folding, removing stopwords, and even using synonyms . They are performant . Instead of relying on pattern matching or regular expressions, they can directly use Lucene index structures to locate specific terms in your data. MATCH function MATCH allows matching a value on a specific field: Match function uses a match query under the hood. This means that it will create a boolean query when multiple terms are used, with OR as the default operator for combining them. Match function currently has some limitations: It does not provide a way to specify parameters . It will use the defaults for the match query. It can only be used in WHERE clauses. It can't be used after a STATS or LIMIT command The following limitations exist in 8.17 version: Only text or keyword fields can be used with MATCH . MATCH can be combined with other conditions as part of an AND expression, but not as part of an OR expression. WHERE match(message, \"connection lost\") AND length(message) > 10 can be used, but not WHERE match(message, \"connection lost\") OR length(message) > 10 . These limitations have been removed in 8.18 version and in Elastic Cloud Serverless , which is continuously up to date with our new work. Match operator The match operator (:) is equivalent to the match function above, but it offers a more succinct syntax: It is more convenient to use the match operator, but you can use whichever makes more sense to you. Match operator has the same limitations as the match function. Query string function Query string function ( QSTR ) uses the query string syntax to perform complex queries on one or several fields: Query string syntax allows to specify powerful full text options and operations, including fuzzy search , proximity searches and the use of boolean operators . Refer to the docs for more details. Query string is a very powerful tool, but currently has some limitations, very similar to the MATCH function: It does not provide a way to specify parameters like the match type or specifying the default fields to search for. It can only be used in WHERE clauses. It can't be used after STATS or LIMIT commands It can't be used after commands that modify columns, like SHOW, ROW, DISSECT, DROP, ENRICH, EVAL, GROK, KEEP, MV_EXPAND, or RENAME Similar to the MATCH function, we have a limitation for the OR conditions in version 8.17, which has been removed in 8.18 and Elastic Cloud Serverless . What's next What's coming for full text search? Quite a few things have been introduced in 8.18: Adding tuning options for the behaviour of MATCH and QSTR functions An additional KQL function that can be used to port your existing Kibana queries to ES|QL We also added scoring , so you can start using ES|QL for relevance matching and not just for filtering. 
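As a quick recap of the MATCH and QSTR functions described above, here is a minimal sketch of how they can be issued from the Python client. The logs index, the message field, and the host.name field are illustrative; the MATCH example mirrors the AND-combined filter discussed earlier.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# MATCH: analyzed, multivalue-friendly text filtering (tech preview in 8.17).
match_query = """
FROM logs
| WHERE MATCH(message, "connection lost") AND LENGTH(message) > 10
| KEEP @timestamp, message
| LIMIT 10
"""

# QSTR: query string syntax for more complex expressions (booleans, wildcards, fuzziness).
qstr_query = """
FROM logs
| WHERE QSTR("message: connection AND host.name: web-*")
| LIMIT 10
"""

for q in (match_query, qstr_query):
    resp = es.esql.query(query=q)
    # The ES|QL response is tabular: column metadata plus rows of values.
    print(resp["columns"], resp["values"][:3])
```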
This is quite exciting as this will define how the future of text search will be like in Elasticsearch! Check out ES|QL - Introducing scoring and semantic search for an overview of changes in 8.18 for text search. Give it a try MATCH and QSTR are available as tech preview on Elasticsearch 8.17, and of course they are always up to date in Serverless. What are you looking for in terms of text filtering? Let us know your feedback! Happy full text filtering! Report an issue Related content Search Analytics How To June 10, 2024 Storage wins for time-series data in Elasticsearch Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency. MG KK By: Martijn Van Groningen and Kostas Krikellas Jump to Filtering text with ES|QL Limitations of text filtering Multivalued fields Analyzed text Performance Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Filtering in ES|QL using full text search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/filtering-in-esql-full-text-search-match-qstr", + "meta_description": "Learn how to use the MATCH and QSTR functions for efficient full-text filtering in ES|QL. Explore what they do and how they differ from existing text filtering methods." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. Generative AI How To NS By: Neha Saini On April 25, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The LangGraph Retrieval Agent Template is a starter project developed by LangChain to facilitate the creation of retrieval-based question-answering systems using LangGraph in LangGraph Studio. This template is pre-configured to integrate seamlessly with Elasticsearch, enabling developers to rapidly build agents that can index and retrieve documents efficiently. This blog focuses on running and customizing the LangChain Retrieval Agent Template using LangGraph Studio and LangGraph CLI. The template provides a framework for building retrieval-augmented generation (RAG) applications, leveraging various retrieval backends like Elasticsearch. 
We will walk you through setting up, configuring the environment, and executing the template efficiently with Elastic while customizing the agent flow. Prerequisites Before proceeding, ensure you have the following installed: Elasticsearch Cloud deployment or on-prem Elasticsearch deployment (or create a 14-day Free Trial on Elastic Cloud) - Version 8.0.0 or higher Python 3.9+ Access to an LLM provider such as Cohere (used in this guide), OpenAI , or Anthropic/Claude Creating the LangGraph app 1. Install the LangGraph CLI 2. Create LangGraph app from retrieval-agent-template You will be presented with an interactive menu that will allow you to choose from a list of available templates. Select 4 for Retrieval Agent and 1 for Python, as shown below: Troubleshooting : If you encounter the error “urllib.error.URLError: “ Please run the Install Certificate Command of Python to resolve the issue, as shown below. 3. Install dependencies In the root of your new LangGraph app, create a virtual environment and install the dependencies in edit mode so your local changes are used by the server: Setting up the environment 1. Create a .env file The .env file holds API keys and configurations so the app can connect to your chosen LLM and retrieval provider. Generate a new .env file by duplicating the example configuration: 2. Configure the . env file The .env file comes with a set of default configurations. You can update it by adding the necessary API keys and values based on your setup. Any keys that aren't relevant to your use case can be left unchanged or removed. Example .env file (using Elastic Cloud and Cohere) Below is a sample .env configuration for using Elastic Cloud as the retrieval provider and Cohere as the LLM, as demonstrated in this blog: Note: While this guide uses Cohere for both response generation and embeddings, you’re free to use other LLM providers such as OpenAI , Claude , or even a local LLM model depending on your use case. Make sure that each key you intend to use is present and correctly set in the .env file. 3. Update configuration file - configuration.py After setting up your .env file with the appropriate API keys, the next step is to update your application’s default model configuration. Updating the configuration ensures the system uses the services and models you’ve specified in your .env file. Navigate to the configuration file: The configuration.py file contains the default model settings used by the retrieval agent for three main tasks: Embedding model – converts documents into vector representations Query model – processes the user’s query into a vector Response model – generates the final response By default, the code uses models from OpenAI (e.g., openai/text-embedding-3-small ) and Anthropic (e.g., anthropic/claude-3-5-sonnet-20240620 and anthropic/claude-3-haiku-20240307 ). In this blog, we're switching to using Cohere models. If you're already using OpenAI or Anthropic, no changes are needed. Example changes (using Cohere): Open configuration.py and modify the model defaults as shown below: Running the Retrieval Agent with LangGraph CLI 1. Launch LangGraph server This will start up the LangGraph API server locally. If this runs successfully, you should see something like: Open Studio UI URL. There are two graphs available: Retrieval Graph : Retrieves data from Elasticsearch and responds to Query using an LLM. Indexer Graph : Indexes documents into Elasticsearch and generates embeddings using an LLM. 2. Configuring the Indexer Graph Open the Indexer Graph. 
Click Manage Assistants. Click on 'Add New Assistant ', enter the user details as specified, and then close the window. 3. Indexing sample documents Index the following sample documents, which represent a hypothetical quarterly report for the organization NoveTech: Once the documents are indexed, you will see a delete message in the thread, as shown below. 4. Running the Retrieval Graph Switch to the Retrieval Graph. Enter the following search query: The system will return relevant documents and provide an exact answer based on the indexed data. Customize the Retrieval Agent To enhance the user experience, we introduce a customization step in the Retrieval Graph to predict the next three questions a user might ask. This prediction is based on: Context from the retrieved documents Previous user interactions Last user query The following code changes are required to implement Query Prediction feature: 1. Update graph.py Add predict_query function: Modify respond function to return response Object , instead of message: Update graph structure to add new node and edge for predict_query: 2. Update prompts.py Craft prompt for Query Prediction in prompts.py : 3. Update configuration.py Add predict_next_question_prompt : 4. Update state.py Add the following attributes: 5. Re-run the Retrieval Graph Enter the following search query again: The system will process the input and predict three related questions that users might ask, as shown below. Conclusion Integrating the Retrieval Agent template within LangGraph Studio and CLI provides several key benefits: Accelerated development : The template and visualization tools streamline the creation and debugging of retrieval workflows, reducing development time. Seamless deployment : Built-in support for APIs and auto-scaling ensures smooth deployment across environments. Easy updates: Modifying workflows, adding new functionalities, and integrating additional nodes is simple, making it easier to scale and enhance the retrieval process. Persistent memory : The system retains agent states and knowledge, improving consistency and reliability. Flexible workflow modeling : Developers can customize retrieval logic and communication rules for specific use cases. Real-time interaction and debugging : The ability to interact with running agents allows for efficient testing and issue resolution. By leveraging these features, organizations can build powerful, efficient, and scalable retrieval systems that enhance data accessibility and user experience. The full source code for this project is available on GitHub . Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Creating the LangGraph app 1. Install the LangGraph CLI 2. Create LangGraph app from retrieval-agent-template 3. Install dependencies Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Build a powerful RAG workflow using LangGraph and Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/build-rag-workflow-langgraph-elasticsearch", + "meta_description": "In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Python Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Python Javascript +2 October 30, 2024 Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. MG By: Miguel Grinberg Generative AI Python October 23, 2024 GitHub Assistant: Interact with your GitHub repository using RAG and Elasticsearch This blog introduces a GitHub Assistant using RAG with Elasticsearch to enable semantic code queries, providing insights into GitHub repositories, which can be extended to PRs feedback, issues handling, and production readiness reviews. 
FS By: Fram Souza Vector Database .NET +2 October 9, 2024 Building a search app with Blazor and Elasticsearch Learn how to build a search application using Blazor and Elasticsearch, and how to use the Elasticsearch .NET client for hybrid search. GL By: Gustavo Llermaly Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Integrations Python +1 September 27, 2024 Elasticsearch open inference API for Google AI Studio Elasticsearch open inference API adds support for Google AI Studio JV By: Jeff Vestal ML Research Python September 19, 2024 Evaluating search relevance part 2 - Phi-3 as relevance judge Using the Phi-3 language model as a search relevance judge, with tips & techniques to improve the agreement with human-generated annotation. TP TV By: Thanos Papaoikonomou and Thomas Veasey 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Python - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/python-programming", + "meta_description": "Python articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Building a Multimodal RAG system with Elasticsearch: The story of Gotham City Learn how to build a Multimodal Retrieval-Augmented Generation (RAG) system that integrates text, audio, video, and image data to provide richer, contextualized information retrieval. Generative AI How To AS By: Alex Salgado On March 11, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, you'll learn how to build a Multimodal RAG (Retrieval-Augmented Generation) pipeline using Elasticsearch. We'll explore how to leverage ImageBind to generate embeddings for various data types, including text, images, audio, and depth maps. You'll also discover how to efficiently store and retrieve these embeddings in Elasticsearch using dense_vector and k-NN search. Finally, we'll integrate a large language model (LLM) to analyze retrieved evidence and generate a comprehensive final report. How does the pipeline work? Collecting clues → Images, audio, texts, and depth maps from the crime scene in Gotham. Generating embeddings → Each file is converted into a vector using the ImageBind multimodal model. Indexing in Elasticsearch → The vectors are stored for efficient retrieval. Searching by similarity → Given a new clue, the most similar vectors are retrieved. 
The LLM analyzes the evidence → A GPT-4 model synthesizes the response and identifies the suspect! Technologies used ImageBind → Generates unified embeddings for various modalities. Elasticsearch → Enables fast and efficient vector search. LLM (GPT-4, OpenAI) → Analyzes the evidence and generates a final report. Who is this blog for? Elastic users interested in multimodal vector search. Developers looking to understand Multimodal RAG in practice. Anyone searching for scalable solutions for analyzing data from multiple sources. Prerequisites: Setting up the environment To solve the crime in Gotham City, you need to set up your technology environment. Follow this step-by-step guide: 1. Technical requirements Component Specification Sistem OS Linux, macOS, or Windows Python 3.10 or later RAM Minimum 8GB (16GB recommended) GPU Optional but recommended for ImageBind 2. Setting up the project All investigation materials are available on GitHub, and we'll be using Jupyter Notebook (Google Colab) for this interactive crime-solving experience. Follow these steps to get started: Setting up with Jupyter Notebook (Google Colab) 1. Access the notebook Open our ready-to-use Google Colab notebook: Multimodal RAG with Elasticsearch . This notebook contains all the code and explanations you need to follow along. 2. Clone the repository 3. Install dependencies 4. Configure credentials Note: The ImageBind model (~2GB) will be downloaded automatically on the first run. Now that everything is set up, let's dive into the details and solve the crime! Introduction: The crime in Gotham City On a rainy night in Gotham City, a shocking crime shakes the city. Commissioner Gordon needs your help to unravel the mystery. Clues are scattered across different formats: blurred images, mysterious audio, encrypted texts, and even depth maps. Are you ready to use the most advanced AI technology to solve the case? In this blog, you will be guided step by step through building a Multimodal RAG (Retrieval-Augmented Generation) system that unifies different types of data ( images, audio, texts, and depth maps ) into a single search space. We will use ImageBind to generate multimodal embeddings, Elasticsearch to store and retrieve these embeddings, and a Large Language Model (LLM) to analyze the evidence and generate a final report. Fundamentals: Multimodal RAG architecture What is a Multimodal RAG? The rise of Retrieval-Augmented Generation (RAG) Multimodal is revolutionizing the way we interact with AI models. Traditionally, RAG systems work exclusively with text, retrieving relevant information from databases before generating responses. However, the world is not limited to text— images, videos, and audio also carry valuable knowledge . This is why multimodal architectures are gaining prominence, allowing AI systems to combine information from different formats for richer and more precise responses . Three main approaches for Multimodal RAG To implement a Multimodal RAG, three strategies are commonly used. Each approach has its own advantages and limitations, depending on the use case: 1. Shared vector space Data from different modalities are mapped into a common vector space using multimodal models like ImageBind. This allows text queries to retrieve images, videos, and audio without explicit format conversion. Advantages: Enables cross-modal retrieval without requiring explicit format conversion. Provides a fluid integration between different modalities, allowing direct retrieval across text, image, audio, and video. 
Scalable for diverse data types, making it useful for large-scale retrieval applications . Disadvantages: Training requires large multimodal datasets , which may not always be available. The shared embedding space may introduce semantic drift , where relationships between modalities are not perfectly preserved. Bias in multimodal models can impact retrieval accuracy, depending on the dataset distribution. 2. Single grounded modality All modalities are converted to a single format , usually text , before retrieval. For example, images are described through automatically generated captions , and audio is transcribed into text. Advantages: Simplifies retrieval , as everything is converted into a uniform text representation . Works well with existing text-based search engines , eliminating the need for specialized multimodal infrastructure. Can improve interpretability since retrieved results are in a human-readable format. Disadvantages: Loss of information : Certain details (e.g., spatial relationships in images, tone in audio) may not be fully captured in text descriptions. Dependent on captioning/transcription quality : Errors in automatic annotations can reduce retrieval effectiveness. Not optimal for purely visual or auditory queries since the conversion process might remove essential context. 3. Separate retrieval Maintains distinct models for each modality. The system performs separate searches for each data type and later merges the results . Advantages: Allows custom optimization per modality , improving retrieval accuracy for each type of data. Less reliance on complex multimodal models , making it easier to integrate existing retrieval systems. Provides fine-grained control over ranking and re-ranking as results from different modalities can be combined dynamically. Disadvantages: Requires fusion of results , making the retrieval and ranking process more complex. May generate inconsistent responses if different modalities return conflicting information. Higher computational cost since independent searches are performed for each modality, increasing processing time. Our choice: Shared vector space with ImageBind Among these approaches, we chose shared vector space , a strategy that aligns perfectly with the need for efficient multimodal searches . Our implementation is based on ImageBind , a model capable of representing multiple modalities ( text, image, audio, and video ) in a common vector space . This allows us to: Perform cross-modal searches between different media formats without needing to convert everything to text. Use highly expressive embeddings to capture relationships between different modalities. Ensure scalability and efficiency , storing optimized embeddings for fast retrieval in Elasticsearch. By adopting this approach, we built a robust multimodal search pipeline , where a text query can directly retrieve images or audio without additional pre-processing. This method expands practical applications from intelligent search in large repositories to advanced multimodal recommendation systems . The following figure illustrates the data flow within the Multimodal RAG pipeline, highlighting the indexing, retrieval, and response generation process based on multimodal data: How does the embedding space work? Traditionally, text embeddings come from language models (e.g., BERT, GPT). 
Now, with native multimodal models like Meta AI’s ImageBind , we have a backbone that generates vectors for multiple modalities: Text : Sentences and paragraphs are transformed into vectors of the same dimension. Images (vision) : Pixels are mapped into the same dimensional space used for text. Audio : Sound signals are converted into embeddings comparable to images and text. Depth Maps : Depth data is processed and also results in vectors. Thus, any clue ( text, image, audio, depth ) can be compared to any other using vector similarity metrics like cosine similarity . If a laughing audio sample and an image of a suspect's face are “close” in this space, we can infer some correlation (e.g., the same identity). Stage 1 - Collecting crime scene clues Before analyzing the evidence, we need to collect it. The crime in Gotham left traces that may be hidden in images, audio, texts, and even depth data. Let's organize these clues to feed into our system. What do we have? Commissioner Gordon sent us the following files containing evidence collected from the crime scene in four different modalities: Track description and modality a) Images (2 photos) crime_scene1.jpg, crime_scene2.jpg → Photos taken from the crime scene. Shows suspicious traces on the ground. suspect_spotted.jpg → Security camera image showing a silhouette running away from the scene. b) Audio (1 recording) joker_laugh.wav → A microphone near the crime scene captured a sinister laugh. c) Text (1 message) Riddle.txt, note2.txt → Some mysterious notes were found at the location, possibly left by the criminal. d) Depth (1 depth map) depth_suspect.png → A security camera with a depth sensor captured a suspect in a nearby alley. jdancing-depth.png → A security camera with a depth sensor captured a suspect going down the subway station. These pieces of evidence are in different formats and cannot be analyzed directly in the same way. We need to transform them into embeddings—numerical vectors that will allow cross-modal comparison. File organization Before starting processing, we need to ensure that all clues are properly organized in the data/ directory so the pipeline runs smoothly. Expected directory structure: Code to verify clue organization Before proceeding, let's ensure that all required files are in the correct location. Running the file Expected output (if all files are correct): Expected output (if any file is missing): This script helps prevent errors before we start generating embeddings and indexing them into Elasticsearch. Stage 2 - Organizing the evidence Generating embeddings with ImageBind To unify the clues, we need to transform them into embeddings—vector representations that capture the meaning of each modality. We will use ImageBind , a model by Meta AI that generates embeddings for different data types ( images, audio, text, and depth maps ) within a shared vector space. How does ImageBind work? To compare different types of evidence ( images, audio, text, and depth maps ), we need to transform them into numerical vectors using ImageBind . This model allows any type of input to be converted into the same embedding format, enabling cross-modal searches between modalities. Below is an optimized code ( src/embedding_generator.py ) to generate embeddings for any type of input using the appropriate processors for each modality: A tensor is a fundamental data structure in machine learning and deep learning, especially when working with models like ImageBind. 
In our context: Here, the tensor represents the input data (image, audio, or text) converted into a mathematical format that the model can process. Specifically: For images : The tensor represents the image as a multidimensional matrix of numerical values (pixels organized by height, width, and color channels). For audio : The tensor represents sound waves as a sequence of amplitudes over time. For text : The tensor represents words or tokens as numerical vectors. Testing embedding generation: Let's test our embedding generation with the following code. Save it in 02-stage/test_embedding_generation.py and execute it with this command: Expected output: Now, the image has been transformed into a 1024-dimensional vector . Stage 3 - Storage and search in Elasticsearch Now that we have generated the embeddings for the evidence, we need to store them in a vector database to enable efficient searches. For this, we will use Elasticsearch , which supports dense vectors ( dense_vector ) and allows similarity searches. This step consists of two main processes: Indexing the embeddings → Stores the generated vectors in Elasticsearch. Similarity search → Retrieves the most similar records to a new piece of evidence. Indexing the evidence in Elasticsearch Each piece of evidence processed by ImageBind (image, audio, text, or depth) is converted into a 1024-dimensional vector . We need to store these vectors in Elasticsearch to enable future searches. The following code ( src/elastic_manager.py ) creates an index in Elasticsearch and configures the mapping to store the embeddings. Running the indexing Now, let's index a piece of evidence to test the process. Expected output in Elasticsearch (summary of the indexed document): To index all multimodal evidence, please execute the following Python command: Now, the evidence is stored in Elasticsearch and is ready to be retrieved when needed. Verifying the indexing process After running the indexing script, let's verify if all our evidence was correctly stored in Elasticsearch. You can use Kibana's Dev Tools to run some verification queries: 1. First, check if the index was created: 2. Then, verify the document count per modality: 3. Finally, examine the indexed document structure: Expected results: An index named `multimodal_content` should exist. Around 7 documents distributed across different modalities (vision, audio, text, depth). Each document should contain: embedding, modality, description, metadata, and content_path fields. This verification step ensures that our evidence database is properly set up before we proceed with the similarity searches. Searching for similar evidence in Elasticsearch Now that the evidence has been indexed, we can perform searches to find the most similar records to a new clue. This search uses vector similarity to return the closest records in the embedding space . The following code performs this search. Testing the search - Using audio as a query for multimodal results Now, let's test the search for evidence using a suspicious audio file . We need to generate an embedding for the file in the same way and search for similar embeddings: Expected output in the terminal: Now, we can analyze the retrieved evidence and determine its relevance to the case. Beyond audio - Exploring multimodal searches Reversing the roles: Any modality can be a \"question\" In our Multimodal RAG system, every modality is a potential search query . Let's go beyond the audio example and explore how other data types can initiate investigations . 1. 
Searching by text (deciphering the criminal’s note) Scenario: You found an encrypted text message and want to find related evidence. Expected results: 2. Image search (tracking the suspicious crime scene) Scenario: A new crime scene ( crime_scene2.jpg ) needs to be compared with other evidence. Output: 3. Depth map search (3D pursuit) Scenario: A depth map ( jdancing-depth.png ) reveals image escape patterns . Output Why does this matter? Each modality reveals unique connections : Text → Linguistic patterns of the suspect. Images → Recognition of locations and objects. Depth → 3D scene reconstruction. Now, we have a structured evidence database in Elasticsearch , enabling us to store and retrieve multimodal evidence efficiently . Summary of what we've done: Stored multimodal embeddings in Elasticsearch. Performed similarity searches , finding evidence related to new clues. Tested the search using a suspicious audio file , ensuring the system works correctly. Next step: We will use an LLM (Large Language Model) to analyze the retrieved evidence and generate a final report . Stage 4 - Connecting the dots with the LLM Now that the evidence has been indexed in Elasticsearch and can be retrieved by similarity, we need a LLM (Large Language Model) to analyze it and generate a final report to send to Commissioner Gordon. The LLM will be responsible for identifying patterns, connecting clues, and suggesting a possible suspect based on the retrieved evidence. For this task, we will use GPT-4 Turbo , formulating a detailed prompt so that the model can interpret the results efficiently. LLM integration To integrate the LLM into our system, we created the LLMAnalyzer class ( src/llm_analyzer.py ), which receives the retrieved evidence from Elasticsearch and generates a forensic report using this evidence as the prompt context. Temperature setting in LLM analysis: For our forensic analysis system, we use a moderate temperature of 0.5. This balanced setting was chosen because: It represents a middle ground between deterministic (too rigid) and highly random outputs; At 0.5, the model maintains enough structure to provide logical and justifiable forensic conclusions; This setting allows the model to identify patterns and make connections while staying within reasonable forensic analysis parameters; It balances the need for consistent, reliable outputs with the ability to generate insightful analysis. This moderate temperature setting helps ensure our forensic analysis is both reliable and insightful, avoiding both overly rigid and overly speculative conclusions. Running the evidence analysis Now that we have the LLM integration , we need a script that connects all system components. This script will: Search for similar evidence in Elasticsearch. Analyze the retrieved evidence using the LLM to generate a final report. Code: Evidence analysis script Expected LLM output Conclusion: Case solved With all the clues gathered and analyzed , the Multimodal RAG system has identified a suspect: The Joker . By combining images, audio, text, and depth maps into a shared vector space using ImageBind , the system was able to detect connections that would have been impossible to identify manually. Elasticsearch ensured fast and efficient searches , while the LLM synthesized the evidence into a clear and conclusive report . However, the true power of this system goes beyond Gotham City . 
The Multimodal RAG architecture opens doors to numerous real-world applications : Urban surveillance: Identifying suspects based on images, audio, and sensor data . Forensic analysis: Correlating evidence from multiple sources to solve complex crimes . Multimedia recommendation: Creating recommendation systems that understand multimodal contexts (e.g., suggesting music based on images or text). Social media trends: Detecting trending topics across different data formats. Now that you’ve learned how to build a Multimodal RAG system , why not test it with your own clues ? Share your discoveries with us and help the community advance in the field of multimodal AI ! Special thanks I would like to thank Adrian Cole for his valuable contribution and review during the process of defining the deployment architecture of this code. References Build a multimodal image retrieval system using KNN search and CLIP embeddings k-Nearest Neighbor (kNN) Search PyTorch Official Documentation on Tensors ImageBind: a new way to ‘link’ AI across the senses Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to How does the pipeline work? Technologies used Who is this blog for? Prerequisites: Setting up the environment 1. Technical requirements Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Building a Multimodal RAG system with Elasticsearch: The story of Gotham City - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/building-multimodal-rag-system", + "meta_description": "Learn how to build a Multimodal RAG system that integrates text, audio, video, and image data to provide richer, contextualized information retrieval." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. Vector Database Search Relevance Generative AI SM By: Sunile Manjee On March 12, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Imagine searching for “recently renovated accommodations 250m from Belongil Beach with at least 4 stars and a pool and gym” and having your search engine return exactly what you needed. Intelligent search understands intent and reasons with your queries. Reasoning and intention discovery require sophistication beyond a heuristics-only approach. This is where large language model functions and Elasticsearch search templates come together to deliver a truly intelligent search experience. Try it yourself If you're not a believer, no worries. This entire end-to-end example is available within a Python notebook here . The notebook includes data, index mapping, inferencing endpoints, search templates, and LLM functions. You'll need an Elastic Cloud instance, Azure OpenAI instance, and Google Maps API key. The Problem: Capturing complex constraints Search approaches such as keyword or even vector-based methods are challenged when users pack multiple nuances into a single request. Consider a hotel finder scenario. A person wants: A hotel near Belongil Beach, within 250 meters. A minimum rating of 4 stars. A recently renovated property. Specific amenities like a pool and gym. Trying to encode all of these requirements using naive keyword matches or basic similarity scores might return incomplete or irrelevant results, reducing user trust and experience. Schema-Based Searches with LLMs Elasticsearch is built on an index schema architecture. When you index data, you define fields like “rating,” “geopoint,” or “amenities,” making it much easier to filter and rank results accurately. However, this structure demands that queries be equally structured. That’s where LLMs (like GPT-based or Generation models) become the linchpin. An LLM can interpret a user’s natural language query, extract key attributes (“distance = 250m,” “rating >= 4,” “amenities = pool, gym,” “near Belongil Beach,” etc.), and even call a geocoder service when a geo component has been detected. It then outputs a JSON payload ready to be slotted into an Elasticsearch search template —a parameterized query that cleanly separates the query logic from the dynamic values. With this approach, we capitalize on both the semantic understanding of LLMs and the schema-based filtering and faceting of Elasticsearch. 
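To make that hand-off concrete, here is a minimal, hypothetical sketch of the kind of structured payload the LLM extraction step might return for the query used in the example below; the field names and coordinates are illustrative assumptions rather than the notebook's exact schema.

```python
# A hypothetical sketch of the structured payload an LLM function call might
# return for: "recently renovated accommodations 250m from Belongil Beach
# with at least 4 stars and with a pool and gym".
# Field names and coordinates are illustrative, not the notebook's schema.
extracted_params = {
    "query": "recently renovated accommodations",  # free-text part for the hybrid/semantic clause
    "distance": "250m",                            # distance filter around the geocoded point
    "latitude": -28.63,                            # returned by the geocoder for "Belongil Beach"
    "longitude": 153.60,                           # (approximate, for illustration)
    "rating": 4,                                   # minimum star rating
    "facilities": ["pool", "gym"],                 # required amenities
}
# The application then slots these values into the search template's
# placeholders and lets Elasticsearch run the structured query.
```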
Example in action Suppose your user’s query is: “recently renovated accommodations 250m from Belongil Beach with at least 4 stars and with a pool and gym.” LLM processing: The LLM breaks down the text, recognizes that it needs a distance-based filter (250m), a minimum rating (4 stars), relevant amenities (pool, gym), and even a contextual hint about being “recently renovated.” It also calls a geocoding service for “Belongil Beach,” returning precise latitude and longitude coordinates. Search template: You create an Elasticsearch search template that expects parameters like rating , distance , latitude , longitude , and possibly a text query for any free-form conditions. Once the LLM provides these parameters, your application simply fills in the placeholders and calls Elasticsearch. Here, not only is filtering leveraged, but also hybrid queries using vectors, ELSER, and lexical search. Results: The response precisely matches accommodations within 250 meters of Belongil Beach, has at least 4 stars, is flagged as recently renovated, and includes a pool and gym. An example result might be: Hotel name : Belongil Beach Apartment Rating : 4 stars City : Byron Bay, New South Wales Country : Australia Rather than depending solely on a vector space or hybrid search, you can input precise filters, which will make the recall more comprehensive and the precision more accurate. Why this approach works Precision and recall : By structuring the query according to the index schema, you remove ambiguity, ensuring you don’t miss valid results (high recall) and keep out irrelevant ones (high precision). This is often observed when relying solely on a vector space, which doesn’t naturally offer distillation features. Scalability : Elasticsearch is designed for massive data volumes. Once the parameters are extracted, the query itself remains blazing fast, even on huge indexes. Flexibility: If new attributes appear (e.g., “EV charging station”), the LLM functions should capture the attribute as a hotel amenity and inject it into the Elasticsearch search template. Resilience to complexity : No matter how complex a user’s query, the LLM’s semantic parsing ensures every relevant detail is captured: distance constraints, star ratings, location-based conditions, and more. Why search templates Elasticsearch templates enable the creation of parameterized queries, separating the query logic from dynamic values. This is particularly valuable when building dynamic queries based on user input or other variable data. For example, consider the hotel index with fields Description Attractions Rating Facilities Location Latitude Longitude Users might include any combination of these attributes in their search query. As the number of fields increases, manually constructing a query for every potential input combination becomes impractical. Search templates provide a solution by dynamically generating the appropriate query based on the user's input. If the user specifies a rating and attractions, the corresponding query is generated. Similarly, if the user provides a location and rating, the search template generates a query that reflects those inputs. Search templates are defined using a JSON format that includes placeholders for dynamic values. When you execute a search template, Elasticsearch replaces the placeholders with the actual values and then executes the query. 
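As an illustration of that mechanic, the following is a simplified sketch (not the article's full template) of registering a mustache search template and invoking it with parameters from the Python client; the index name, template id, and the single Rating filter are deliberately minimal assumptions.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster for illustration

# Register a deliberately simplified mustache template with one optional filter.
es.put_script(
    id="hotel_search_template",  # illustrative template id
    script={
        "lang": "mustache",
        "source": """
        {
          "query": {
            "bool": {
              "filter": [
                {{#rating}}{ "range": { "Rating": { "gte": {{rating}} } } }{{/rating}}
              ]
            }
          }
        }
        """,
    },
)

# Execute the template: Elasticsearch substitutes the params into the
# placeholders and runs the resulting query.
response = es.search_template(
    index="hotels",  # illustrative index name
    id="hotel_search_template",
    params={"rating": 4},
)
print(response["hits"]["total"])
```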
Search templates can be used to perform a variety of tasks, such as: Filtering results based on user input Boosting the relevance of certain results Adding custom scoring functions Aggregating results Below is an example of the search template that dynamically creates a query based on input parameters. LLM functions Large Language Model functions utilize strong reasoning capabilities to determine the optimal subsequent action, such as parsing data, calling an API, or requesting additional information. When combined with search templates, LLMs can determine if a user query contains an attribute supported by a search template. If a supported attribute is identified, the LLM will execute the corresponding user-defined method call. Within the notebook , there are a few LLM functions. Each function is defined with the tools list. Let’s briefly review each one. The role of the `extract_hotel_search_parameters` LLM function is to extract the parameters from the user query that the search template supports. The `geocode_location` LLM function would be invoked if a location attribute such as \"500 meters from Belongil Beach\" is identified. The LLM function `query_elasticsearch` will be called using the `geocode_location` (if it was found within the user query) and the parameters from the LLM function `extract_hotel_search_parameters`. The completions API registers each LLM function as a tool. This list of tools was detailed earlier in the article. Azure OpenAI The notebook uses an Azure OpenAI completions model and to run it, you will need the Azure OpenAI Key (either Key 1 or Key 2), Endpoint, Deployment Name, and Version number. All this information can be found under Azure OpenAI → Keys and Endpoint. Deploy a completion model. That is the deployment name used within the notebook . Under Chat playground, click on View code to find the api version. Google Maps API The notebook uses the Google Maps API to geocode locations identified within user queries. This functionality requires a Google account and an API key, which can be generated here . Putting LLM functions and search templates into Action The LLM uses reasoning to determine the necessary functions and their order of execution based on the given query. Once a query is executed such as \"recently renovated accommodations 250m from Belongil Beach with at least 4 stars and with a pool and gym\", the LLM reasoning layer is exposed: Extract parameters The initial LLM function call is designed to pull parameters from the query. Geocoding The LLM then determined that the query contained a 'from' location and that a geocoder function should be called next. Intelligent query The reasoning layer of the LLM uses the parameters from previous function calls to execute an Elasticsearch query with search templates. Precise results Using LLM functions along with a search template to execute an intelligent query, a perfect match has been found. Conclusion Combining the power of Large Language Models functions with Elasticsearch search templates ushers in capabilities for query intent and reasoning. Rather than treating a query as an unstructured blob of text, we methodically break it down, match it against a known schema, and let Elasticsearch handle the heavy lifting of search, filtering and scoring. The result is a highly accurate, user-friendly search experience that feels almost magical—users simply speak (or type) their minds, and the system understands precisely what they mean. 
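As a reference for the function-calling setup described above, here is a condensed, hypothetical sketch of how the three functions could be declared as tools for the completions API; the notebook's real definitions carry richer JSON schemas.

```python
# A condensed, hypothetical sketch of declaring the three functions above as
# tools for the chat completions API; the notebook's real schemas are richer.
tools = [
    {
        "type": "function",
        "function": {
            "name": "extract_hotel_search_parameters",
            "description": "Extract structured search parameters from the user's hotel query.",
            "parameters": {
                "type": "object",
                "properties": {
                    "rating": {"type": "integer", "description": "Minimum star rating"},
                    "distance": {"type": "string", "description": "Distance constraint, e.g. 250m"},
                    "facilities": {"type": "array", "items": {"type": "string"}},
                    "location": {"type": "string", "description": "Named place to geocode"},
                },
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "geocode_location",
            "description": "Resolve a place name to latitude/longitude coordinates.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "query_elasticsearch",
            "description": "Run the search template with the extracted and geocoded parameters.",
            "parameters": {"type": "object", "properties": {}},
        },
    },
]
# The list is then passed to the completions call, roughly:
# client.chat.completions.create(model=deployment_name, messages=messages, tools=tools)
```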
Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to Try it yourself The Problem: Capturing complex constraints Example in action Why this approach works Why search templates Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Unifying Elastic vector database and LLM functions for intelligent query - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/llm-functions-elasticsearch-intelligent-query", + "meta_description": " Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Joining two indices in Elasticsearch Explaining how to use the terms query and the enrich processor for joining two indices in Elasticsearch. How To KB By: Kofi Bartlett On May 7, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Elasticsearch, joining two indices is not as straightforward as it is in traditional SQL databases. However, it is possible to achieve similar results using certain techniques and features provided by Elasticsearch. 
This article will delve into the process of joining two indices in Elasticsearch, focusing on the use of the terms query and the enrich processor. Using the terms query for joining two indices The terms query is one of the most effective ways to join two indices in Elasticsearch. This query is used to retrieve documents that contain one or more exact terms in a specific field. Here’s how you can use it to join two indices: First, you need to retrieve the required data from the first index. This can be done using a simple GET request. Once you have the data from the first index, you can use it to query the second index. This is done using the terms query, where you specify the field and the values you want to match. Here is an example: In this example, field_in_second_index is the field in the second index that you want to match with the values from the first index. value1_from_first_index and value2_from_first_index are the values from the first index that you want to match in the second index. The terms query also provides support to perform the two above steps in a single shot using a technique called terms lookup. Elasticsearch will take care of transparently retrieving the values to match from another index. For example, you have a teams index containing a list of players: Now, it is possible to query a people index for all the people playing in team1, as shown below: In the example above, Elasticsearch will transparently retrieve the player names from the team1 document present in the teams index (i.e. “john”, “bill”, “michael”) and find all people documents with a name field that contains any of those values. The equivalent SQL query would be: Using the enrich processor for joining two indices The enrich processor is another powerful tool that can be used to join two indices in Elasticsearch. This processor enriches the data of incoming documents by adding data from a pre-defined enrich index. Here’s how you can use the enrich processor to join two indices: 1. First, you need to create an enrich policy. This policy defines which index to use for enrichment and which field to match on. Here is an example: 2. Once the policy is created, you need to execute it: 3. After executing the policy, you can use the enrich processor in an ingest pipeline to enrich the data of incoming documents: In this example, field_in_second_index is the field in the second index that you want to enrich with data from the first index. enriched_field is the new field that will contain the enriched data. One drawback of this approach is that if the data changes in first_index , the enrich policy needs to be re-executed as the enriched index is not updated or synchronized automatically from the source index it has been built from. However, if first_index is relatively stable, then this approach works great. Conclusion In conclusion, while Elasticsearch does not support traditional join operations, it provides features like the terms query and the enrich processor that can be used to achieve similar results. It’s important to note that these methods have their limitations and should be used judiciously based on the specific requirements and the nature of the data. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
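As a recap of the two joining techniques covered above, here is a minimal sketch using the Python client; the index, field, and policy names mirror the article's placeholders and are illustrative only.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster for illustration

# Terms lookup: find all people playing in team1, letting Elasticsearch fetch
# the player names from the teams index transparently.
people_in_team1 = es.search(
    index="people",
    query={
        "terms": {
            "name": {
                "index": "teams",   # index holding the lookup document
                "id": "team1",      # document whose field supplies the terms
                "path": "players",  # field containing the player names
            }
        }
    },
)

# Enrich processor: a policy that matches on a shared key, executed once, then
# used from an ingest pipeline. The key and copied field names are illustrative.
es.enrich.put_policy(
    name="join-policy",
    match={
        "indices": "first_index",
        "match_field": "join_key",          # must exist in first_index and in incoming docs
        "enrich_fields": ["field_to_copy"],
    },
)
es.enrich.execute_policy(name="join-policy")
es.ingest.put_pipeline(
    id="join-pipeline",
    processors=[
        {
            "enrich": {
                "policy_name": "join-policy",
                "field": "join_key",
                "target_field": "enriched_field",
            }
        }
    ],
)
```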
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Using the terms query for joining two indices Using the enrich processor for joining two indices Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Joining two indices in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-join-two-indexes", + "meta_description": "Explaining how to use the terms query and the enrich processor for joining two indices in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. Python Javascript Developer Experience How To MG By: Miguel Grinberg On October 30, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Have you used the Kibana Dev Console? This is a fantastic prototyping tool that allows you to build and test your Elasticsearch requests interactively. But what do you do after you have a working request in the Console? In this article we'll take a look at the new code generation feature in the Kibana Dev Console, and how it can significantly reduce your development effort by generating ready to use code for you. This feature is available in our Serverless platform and in Elastic Cloud and self-hosted releases 8.16 and up. The Kibana Dev Console This section provides a quick introduction to the Kibana Dev Console , in case you have never used it before. Skip to the next section if you are already familiar with it. 
While you are in any part of the Search section in Kibana, you will notice a \"Console\" link at the bottom of your browser's page: When you click this link, the Console expands to cover the page. Click it again to collapse it. In the left-side panel of the Dev Console, you can enter Elasticsearch requests, with the help of an interactive editor that provides auto-completion and checks your syntax. Some example requests are already pre-populated so that you have something to start experimenting with. When the cursor is on a request, a \"play\" button appears to its right. You can click this button to send the request to your Elasticsearch server. After you execute a request, the response from the server appears in the panel on the right. Code Export feature in Kibana Dev Console The Dev Console makes it easy to prototype your requests or queries until you get exactly what you want. But what happens next? If you need to convert the request to code so that you can incorporate it into your application, then you can save time using the new code export feature. Next to the Play button you will find the three dot or \"kebab\" button, which opens a menu of options. The first option provides access to the code export feature. If you've never used this feature before, it will appear with a \"Copy as curl\" label. If you select this option, your clipboard will be loaded with a curl command that is equivalent to the selected request. Now, things get more interesting when you click the \"Change\" link, which allows you to switch to a different target language. In this initial release, the code export adds support for Python and JavaScript. More languages are expected to be added in future releases. You can now select your desired language and click \"Copy code\" to put the exported code in your clipboard. You can also change the default language that is offered in the menu. The exported code is a complete script in the selected language, using the official Elasticsearch client for that language. Here is an example of how the PUT /my-index request shown above looks when exported to the Python language: To use the exported code follow these steps: Paste the code from the clipboard to a new file with the correct extension ( .py for Python, or .js for JavaScript). In your terminal, add an environment variable called ELASTIC_API_KEY with a valid API Key for your Elasticsearch cluster. You can create an API key right in Kibana if you don't have one yet. Execute the script with the python or node commands depending on your language, making sure the official Elasticsearch client is installed. Now you are ready to adapt the exported code as needed to integrate it into your application! Conclusion In this article you have learned about the new Code Export feature in the Kibana Dev Console. We hope this feature will streamline your development process with Elasticsearch! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to The Kibana Dev Console Code Export feature in Kibana Dev Console Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Export your Kibana Dev Console requests to Python and JavaScript Code - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/kibana-dev-console-code-export", + "meta_description": "Learn how to export your Kibana Dev Console requests to Python and JavaScript using the Code Export feature." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Understanding Elasticsearch scoring and the Explain API Diving into the scoring mechanism of Elasticsearch and exploring the Explain API. How To KB By: Kofi Bartlett On May 5, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch is a powerful search engine that provides fast and relevant search results by calculating a score for each document in the index. This score is a crucial factor in determining the order of the search results. In this article, we will delve into the scoring mechanism of Elasticsearch and explore the Explain API, which helps in understanding the scoring process. Scoring mechanisms in Elasticsearch Elasticsearch uses a scoring model called the Practical Scoring Function (BM25) by default. This model is based on the probabilistic information retrieval theory and takes into account factors such as term frequency, inverse document frequency, and field-length normalization. Let’s briefly discuss these factors: Term Frequency (TF): This represents the number of times a term appears in a document. A higher term frequency indicates a stronger relationship between the term and the document. Inverse Document Frequency (IDF): This factor measures the importance of a term in the entire document collection. 
A term that appears in many documents is considered less important, while a term that appears in fewer documents is considered more important. Field-length Normalization : This factor accounts for the length of the field in which the term appears. Shorter fields are given more weight, as the term is considered more significant in a shorter field. Using the Explain API The Explain API in Elasticsearch is a valuable tool for understanding the scoring process. It provides a detailed explanation of how the score for a specific document was calculated. To use the Explain API, you need to send a GET request to the following endpoint: In the request body, you need to provide the query for which you want to understand the scoring. Here’s an example: The response from the Explain API will include a detailed breakdown of the scoring process, including the individual factors (TF, IDF, and field-length normalization) and their contributions to the final score. Here’s a sample response: In this example, the response shows that the score of 1.2 is a product of the IDF value (2.2) and the tfNorm value (0.5). The detailed explanation helps in understanding the factors contributing to the score and can be useful for fine-tuning the search relevance. Conclusion Elasticsearch scoring is a critical aspect of providing relevant search results. By understanding the scoring mechanisms and using the Explain API, you can gain insights into the factors affecting the search results and optimize your search queries for better relevance and performance. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Scoring mechanisms in Elasticsearch Using the Explain API Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
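Returning to the Explain API walk-through above, here is a minimal sketch of issuing the same kind of request through the Python client; the index name, document id, and query are placeholder assumptions.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster for illustration

# Ask Elasticsearch to explain how the score of one document was computed.
# Index name, document id, and query are placeholders.
explanation = es.explain(
    index="my-index",
    id="1",
    query={"match": {"title": "elasticsearch scoring"}},
)

print(explanation["matched"])                      # whether the document matches at all
print(explanation["explanation"]["value"])         # the final BM25 score
for detail in explanation["explanation"]["details"]:
    # each entry breaks the score into factors such as idf and tf normalization
    print(detail["description"], detail["value"])
```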
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Understanding Elasticsearch scoring and the Explain API - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-scoring-and-explain-api", + "meta_description": "Diving into the scoring mechanism of Elasticsearch and exploring the Explain API." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. Generative AI How To TM By: Tomás Murúa On March 31, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This article provides a practical comparison of Retrieval Augmented Generation (RAG) and fine-tuning by examining their performance in a chat box scenario for a fictional e-commerce store. The article is organized as follows: RAG vs Fine Tuning Chatbot test case: Pear Store Approach 1: Fine tuning Approach 2: RAG RAG vs. Fine-tuning RAG RAG (Retrieval Augmented Generation) combines large language models (LLMs) with information retrieval systems so that the generated answers feed on updated and specific data coming from a knowledge base. Advantages: RAG allows us to use external data without modifying the base model and provides precise, safe, and traceable answers. Implementation: In Elasticsearch, data can be indexed using optimized indexes for semantic search and document-level security. Challenges: RAG relies on external knowledge, making accuracy dependent on retrieved information. Retrieval can be costly in terms of context window size. RAG also faces integration and privacy challenges, especially with sensitive data across different sources. Fine-tuning Fine-tuning involves training a pre-trained model on a specific dataset. This process adjusts the model's internal weights, enabling it to learn patterns and generate customized answers. Fine-tuning can also be used for model distillation , a technique where a smaller model is trained on the outputs of a larger model to improve performance on a specific task. This approach allows leveraging the capabilities of a larger model at a reduced cost. Advantages: It offers a high level of optimization, adapting answers to specific tasks, making it ideal for static contexts or domains where knowledge does not change frequently. Implementation: It requires training the model with structured data using an input-output format. OpenAI fine-tuning makes this flow easier using a UI where you can upload the dataset (JSONL) and then train and test it in a controlled environment. Challenges: The retraining process consumes time and computer resources. Precision depends on the quality and size of the dataset, so small or unbalanced ones can result in generic or out-of-context answers; it requires expertise and effort to get it right. There is no grounding or per-user data segmentation. 
From OpenAI docs : “ We recommend first attempting to get good results with prompt engineering, prompt chaining (breaking complex tasks into multiple prompts), and function calling…” Fine-tuning and RAG comparison Aspect Fine-Tuning RAG Supported data Static Dynamic Setup cost High (training and resources) Low (index configuration) Scalability Low, requires model retraining High, real-time updates Update time Hours/Days Minutes Precision with recent changes Low when not trained with new data High thanks to semantic search Chatbot test case: Pear Store We will use a test case based on a fictional online store called 'Pear Store'. Pear Store needs an assistant to answer specific questions about its policies, promotions, and products. These answers must be truthful and consistent with the store information and useful for both employees and customers. Fine-tuning Dataset We'll use a training dataset with specific questions and their answers regarding products, policies and promotions. For example: Question : What happens if a product is defective? Answer : If a product is defective, we'll send you a free gift of one kilogram of pears along with the replacement. RAG Dataset For the RAG implementation, we will use the same dataset, converted into a PDF and indexed into Elasticsearch using Playground. Here's the PDF file content: Approach 1: Fine-tuning First, we prepare the dataset in JSONL format, as shown below: Make sure each line in the JSONL file is a valid JSON object and there are no trailing commas. Next, using the OpenAI UI, we can go to Dashboard > Fine-tuning and hit Create Then you can upload the JSONL file we just created. Now click Create to start training. After the job is finished, you can hit Playground, and you will have a convenient interface to compare the results with and without the fine-tuned model against a particular question. On the right side, we can see that the model provided the custom answer about defective products: a free kilogram of pears along with the replacement. However, the model's response was inconsistent. A subsequent attempt with the same question yielded an unexpected answer. Although fine-tuning allows us to customize the model's answers, we can see that the model still deviated and provided answers that were just generic and not aligned with our dataset. This is probably because fine-tuning needs more adjustments or a bigger dataset. Now, if we want to change the source data, we will have to repeat the fine-tuning process. Approach 2: RAG To test the dataset using RAG, we will use Playground to create the RAG application and upload the dataset to Kibana. To upload a PDF using the UI and configure the semantic text field, follow the steps from this video: To learn more about uploading PDFs and interacting with them using Playground, you can read this article . Now we're ready to interact with our data using Playground! Using the UI, we can change the AI instructions and check the source of the document used to provide an answer. When we ask the same question in Playground: \" What happens if a product is defective?\" we receive the correct answer: \" If a product is defective, we send you a free gift of one kilogram of pears along with the replacement.\". Additionally, we get a citation to verify the answer´s source and can review the instructions the model followed: If we want to change the data, we just have to update the index with the information about the Q/A . 
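As a minimal sketch of what that update could look like, assuming the Q/A content lives in a field mapped as semantic_text (the index name, document ID, and field name are illustrative, and Playground's own ingestion may structure things differently), refreshing the knowledge base is a single document write:

```
PUT pear-store-docs/_doc/defective-product-policy
{
  "content": "If a product is defective, we send you a free gift of one kilogram of pears along with the replacement."
}
```

The next query that retrieves this document picks up the new wording, with no retraining step involved.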
Final thoughts The choice between fine-tuning and RAG depends on the requirements of each system. A common pattern is using some domain specific fine tuned model, like FinGPT for finance, LEGAL-BERT for legal, or medAlpaca for medical to acquire common terminology. Then, frame the answers context, and build a RAG system on top of it with company specific documents. Fine-tuning is useful when you want to manage the model's behavior , and doing so through prompt engineering is not possible, or it requires so many tokens that it’s better to add that information to the training. Or perhaps the task is so narrow and structured that model distillation is the best option. RAG, on the other hand, excels at integrating knowledge through dynamic data and ensuring accurate, up-to-date responses in real-time. This makes it especially useful for scenarios like the Pear Store, where policies and promotions change frequently. RAG also provides data that is grounded in the answer and offers the ability to segment information delivered to users via document-level security. Combining fine-tuning and RAG can also be an effective strategy to leverage the strengths of both approaches and tailor solutions to specific project needs. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to RAG vs. Fine-tuning RAG Fine-tuning Fine-tuning and RAG comparison Chatbot test case: Pear Store Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "RAG vs. 
Fine Tuning, a practical approach - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/rag-vs-fine-tuning", + "meta_description": "Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. How To KB By: Kofi Bartlett On May 9, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Elasticsearch, it is a common requirement to delete a field from a document. This can be useful when you want to remove unnecessary or outdated information from your index. In this article, we will discuss different methods to delete a field from a document in Elasticsearch, along with examples and step-by-step instructions. Method 1: Using the Update API The Update API allows you to update a document by providing a script that modifies the document’s source. You can use this API to delete a field from a document by setting the field to null. Here’s a step-by-step guide on how to do this: 1. Identify the index, document type (if using Elasticsearch 6.x or earlier), and document ID of the document you want to update. 2. Use the Update API with a script that sets the field to null, or even better, removes it from the source document. The following example demonstrates how to delete the “field_to_delete” field from a document with ID “1” in the “my_index” index: 3. Execute the request. If successful, Elasticsearch will return a response indicating that the document has been updated. Note: This method only removes the field from the specified document. The field will still exist in the mapping and other documents in the index. Method 2: Reindexing with a Modified Source If you want to delete a field from all documents in an index, you can use the Reindex API to create a new index with the modified source. Here’s how to do this: 1. Create a new index with the same settings and mappings as the original index. You can use the Get Index API to retrieve the settings and mappings of the original index. 2. Use the Reindex API to copy documents from the original index to the new index, while removing the field from the source. The following example demonstrates how to delete the “field_to_delete” field from all documents in the “my_index” index: 3. Verify that the new index contains the correct documents with the field removed. 4. If everything looks good, you can delete the original index and, if necessary, add an alias to the new index having the name of the original index name. Method 3: Updating the Mapping and Reindexing If you want to delete a field from the mapping and all documents in an index, you can update the mapping and then reindex the documents. Here’s how to do this: 1. Create a new index with the same settings as the original index. 2. Retrieve the mappings of the original index using the Get Mapping API. 3. Modify the mappings by removing the field you want to delete. 4. Apply the modified mappings to the new index using the Put Mapping API. 5. Use the Reindex API to copy documents from the original index to the new index, as described in Method 2. 6. 
Verify that the new index contains the correct documents with the field removed and that the field is not present in the mapping. 7. If everything looks good, you can delete the original index and, if necessary, add an alias to the new index having the name of the original index name. Conclusion In this article, we discussed three methods to delete a field from a document in Elasticsearch: using the Update API, reindexing with a modified source, and updating the mapping and reindexing. Each method has its own use cases and trade-offs, so choose the one that best fits your requirements. Always remember to test your changes and verify the results before applying them to production environments. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett How To May 14, 2025 Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. KB By: Kofi Bartlett Jump to Method 1: Using the Update API Method 2: Reindexing with a Modified Source Method 3: Updating the Mapping and Reindexing Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Deleting a field from a document in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-delete-field-from-document%E2%80%8B", + "meta_description": "Exploring methods for deleting a field from a document in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. 
Developer Experience MC By: Mike Cote On April 18, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Kibana alerting has been the monitoring solution of choice for many large organizations over the last few years. As adoption has continued to grow, so has the number of alerting rules users have created to monitor their systems. With more organizations relying on Kibana for alerting at scale, we have seen an opportunity to improve efficiency and ensure sufficient performance for future workload needs. Between Kibana 8.16 and 8.18, we tackled these issues head-on, introducing key improvements that shattered previous scalability barriers. Before these enhancements, Kibana Alerting could only support up to 3,200 rules per minute with at least 16 Kibana nodes before hitting significant performance bottlenecks. By Kibana 8.18, we’ve increased the scalability ceiling of rules per minute by 50x, supporting up to 160,000 lightweight alerting rules per minute. This was achieved by making Kibana efficiently scale beyond 16 Kibana nodes and increasing per-node throughput from 200 to up to 3,500 rules per minute. These enhancements make all alerting rules run faster, with fewer delays and more efficiently. In this blog, we’ll explore the scaling challenges we overcame, the key innovations that made it possible, and how you can leverage them to run Kibana Alerting at scale efficiently. How Kibana Alerting scales with Task Manager Kibana Alerting allows users to define rules that trigger alerts based on real-time data. Behind the scenes, the Kibana Task Manager schedules and runs these rules. The Task Manager is Kibana’s built-in job scheduler, designed to handle asynchronous background tasks separately from user interactions. Its key responsibilities include: Running one-time and recurring tasks such as alerting rules, connector actions, and reports. Dynamically distributing workloads as Kibana background nodes join or leave the cluster. Keeping the Kibana UI responsive by offloading tasks to dedicated background processes. Each alerting rule translates into a recurring background task. Each background task is an Elasticsearch document, meaning it is stored, fetched and updated as an Elasticsearch document. As the number of alerting rules increases, so do the background tasks Kibana must manage. However, each Kibana node has a limit on how many tasks it can handle simultaneously. Once capacity is reached, additional tasks must wait, leading to delays and slower task run times. The problem: Why scaling was limited Before these improvements, Task Manager faced several scalability constraints, preventing it from scaling beyond 3,200 tasks per minute and 16 Kibana nodes. At this scale, we observed diminishing returns as contention and resource inefficiencies limited further scale. These numbers were based on internal performance testing using a basic Elasticsearch query alerting rule performing a no-op query. The diminishing returns observed included: Task claiming contention Task Manager uses a distributed polling approach to claim tasks within an Elasticsearch index. Kibana nodes periodically query for tasks and attempt to claim them using Elasticsearch’s optimistic concurrency control , which prevents conflicting document updates. If another node updates the task first, the original node drops it, reducing overall efficiency. 
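To illustrate the mechanism (this is a simplified sketch, not Kibana's actual task document schema — the index, document ID, and fields are placeholders), a conditional update in Elasticsearch only succeeds if the document still has the sequence number the node originally read:

```
POST .kibana_task_manager/_update/task:abc123?if_seq_no=42&if_primary_term=1
{
  "doc": {
    "task": {
      "status": "claiming",
      "ownerId": "kibana-node-1"
    }
  }
}
```

If another node updated the document first, this request fails with a 409 version conflict and the claim is abandoned.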
With too many Kibana nodes competing for tasks, document update conflicts increase drastically, limiting efficiency beyond 16 nodes and reducing system throughput. Inefficient per-node throughput Each Kibana node has a limit on the number of tasks that can run concurrently (default: 10 tasks at a time) to prevent memory and CPU overload. This safeguard often results in underutilized CPU and memory, requiring more nodes than necessary. Additionally, the polling interval (default: 3000ms) defines how often Task Manager claims new tasks. A shorter interval reduces task delays but increases contention as nodes compete more for updates. Resource inefficiencies When running a high volume of alerting rules, Kibana nodes perform repetitive Elasticsearch queries, repeatedly loading the same objects and lists for each alerting rule run, consuming more CPU, memory, and Elasticsearch resources than necessary. Scaling up requires costly infrastructure expansions to support the increasing request loads. Why it’s important Breaking these barriers is crucial for Kibana’s continued evolution. Improved scalability unlocks: Cost optimization : Reducing infrastructure costs for large-scale operations. Faster recovery : Enhancing Kibana’s ability to recover from node or cluster failures. Future expansion : Enabling scalability for additional workloads, such as scheduled reports and event-driven automation. Key innovations in Kibana Task Manager To achieve a 50x scalability boost, we introduced several innovations: Kibana discovery service: smarter scaling Previously, Kibana nodes were unaware of each other’s presence, leading to inefficient task distribution. The new Kibana discovery service dynamically monitors active nodes and assigns task partitions accordingly, ensuring even load distribution and reducing contention. Task partitioning: eliminating contention To prevent nodes from competing for the same tasks, we introduced task partitioning. Tasks are now distributed across 256 partitions, ensuring only a subset of Kibana background nodes attempt to claim the same tasks at any given time. By default, each partition is assigned to a maximum of two Kibana nodes, while a single Kibana node can be responsible for multiple partitions. Task costing: smarter resource allocation Not all background tasks consume the same resources. We implemented a task costing system that assigns task weights based on CPU and memory usage. This allows Task Manager to dynamically adjust the number of tasks to claim, optimize resource allocation, and ensure efficient performance. New task claiming algorithm The old algorithm relied on update-by-query with forced index refresh to identify claimed tasks. This approach was inefficient and introduced unnecessary load on Elasticsearch. The new algorithm avoids this by searching for tasks without requiring an immediate refresh. Instead, it performs the following operations on the task manager index; a _search to find candidate tasks, followed by an _mget which returns documents that may have been updated more recently but are not yet reflected in the refreshed index state. By comparing document versions from _search and _mget results, it discards mismatches before proceeding with bulk updates. This approach increases efficiency in Elasticsearch and offers finer control to support task costing. By factoring in the poll interval, task concurrency and the index refresh rate, we can calculate the upper limit of expected conflicts and adjust the _search page size accordingly. 
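In rough terms, the claim cycle is a pair of requests like the following (the index name, query, and task IDs are illustrative placeholders rather than the real task manager schema), with the _search size being the page size that gets adjusted:

```
// 1) Find candidate tasks in the last refreshed view of the index
GET .kibana_task_manager/_search
{
  "size": 20,
  "seq_no_primary_term": true,
  "query": { "term": { "task.status": "idle" } }
}

// 2) Realtime get of the same documents to spot newer versions
GET .kibana_task_manager/_mget
{
  "ids": ["task:abc123", "task:def456"]
}
```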
This helps ensure enough tasks are retrieved so the _mget doesn’t discard all the search results due to document version mismatches. More frequent polling for tasks By ensuring a fixed number of nodes compete for the same tasks with task partitioning and a new lightweight task claiming algorithm, Task Manager can now poll for tasks more frequently without additional stress on Elasticsearch. This reduces delays between a task completing and the next one starting, increasing overall system throughput. Performance optimizations in Kibana Alerting Before our optimizations using Elastic APM , we analyzed alerting rule performance and found that the alerting framework required at least 20 Elasticsearch queries to run any alerting rule. After the optimizations, we reduced this to just 3 queries - an 85% reduction, significantly improving run times and reducing CPU overhead. Additionally, Elasticsearch previously relied on the resource-intensive pbkdf2 hashing algorithm for API key authentication, introducing excessive overhead at scale. We optimized authentication by switching to the more efficient SHA-256 algorithm, allowing us to eliminate the use of an internal Elasticsearch cache that was severely limited by the number of API keys used concurrently. Impact: How users are benefiting Early adoption has demonstrated: 50% faster rule run times , reducing overall system load. Increased task capacity , enabling more tasks to run on existing infrastructure. Fewer under-provisioned clusters , minimizing the need for scaling infrastructure to meet demand. Drop in average task delay because of increased per-node throughput and making the cluster properly provisioned Drop in rule run duration because of alerting framework optimizations Drop in Elasticsearch requests because of alerting framework optimizations Getting started: How to scale efficiently Upgrading to Kibana 8.18 unlocks most of these benefits automatically. For additional optimization, consider adjusting the xpack.task_manager.capacity setting to maximize per-node throughput while ensuring p999 resource usage remains below 80% for memory, CPU, and event loop utilization and below 500ms for event loop delay. By default, Kibana has a guardrail of 32,000 alerting rules per minute. If you plan to exceed this limit, you can modify the xpack.alerting.rules.maxScheduledPerMinute setting accordingly. The new xpack.task_manager.capacity setting makes Kibana handle workload distributions more effectively, making the following settings unnecessary in most cases and should be removed from your kibana.yml settings: xpack.task_manager.max_workers xpack.task_manager.poll_interval If you’re running Kibana on-prem and want to isolate background tasks into dedicated nodes, you can use the node.roles setting to separate UI-serving nodes from those handling background tasks. If you’re using Kibana on Elastic Cloud Hosted (ECH), scaling to 8GB or higher will automatically enable this isolation. What’s next? We’re not stopping at 50x. Our roadmap aims for 100x+ scalability, further eliminating Elasticsearch bottlenecks. Beyond scaling, we’re also focusing on improving system monitoring at scale. Upcoming integrations will provide system administrators with deeper insights into background task performance, making it easier to decide when and how to scale. Additionally, with task costing, we plan to increase task concurrency for Elastic Cloud Hosted (ECH) customers when configured with more CPU and memory (e.g., Kibana clusters with 2GB, 4GB, or 8GB+ of memory). 
Stay tuned for even more advancements as we continue to push the limits of Kibana scalability! The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Jump to How Kibana Alerting scales with Task Manager The problem: Why scaling was limited Why it’s important Key innovations in Kibana Task Manager Performance optimizations in Kibana Alerting Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Kibana Alerting: Breaking past scalability limits & unlocking 50x scale - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/kibana-alerting-task-manager-scalability", + "meta_description": "Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Search relevance tuning: Balancing keyword and semantic search This blog offers practical strategies for tuning search relevance that can be complementary to semantic search. Vector Database How To KD By: Kathleen DeRusso On May 14, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Introduction Tuning for relevance is an essential part of user search experience. 
Semantic search in particular faces several challenges, many of which are solved through hybrid search and application of relevance tuning practices that have been honed by decades of research in lexical search. We'll go into some of these strategies and how you can effectively use them to tune relevance in a hybrid search world. This blog shares some of the highlights from the Haystack 2024 talk, Retro Relevance: Lessons Learned Balancing Keyword and Semantic Search . Lexical search toolbox for relevance tuning Text search algorithms like BM25 have been around for decades, and in fact BM25 is often used synonymously with text search. This blog post goes into how BM25 works in detail. Analyzers, tokenizers, filters, field weights and boosts are all tools in our lexical search toolbox that give us the power to transform text in very specific ways to support both general and very specialized search use cases. But we also have a lot of other tools at our disposal: Reranking is another powerful tool in this toolbox, whether it pertains to Learn to Rank, semantic reranking, etc. Synonyms are heavily used in keyword search to differentiate slang, domain specific jargon, and so on. General models may not handle very niche synonyms well. These tools are used to impact relevance, but also importantly to accommodate business rules. Business rules are custom rules and their use cases vary widely, but commonly include diversifying result sets or showing sponsored content based on contextual query results or other personalization factors. Challenges with semantic search Semantic search is impressively effective at representing the intent of what you're looking for, returning matching results even if they don't contain the exact keywords you specified. However - if you’re developing a search application and incorporating semantic search into your existing tech stack, semantic search is not without some pitfalls. These pitfalls largely fall under three categories: Cost Features that semantic search inherently doesn't have yet Queries that semantic search by itself doesn't do well with Cost can be money (training or licensing models, compute), or it can be time. Time can be latency (inference latency at ingest or search), or it can be the cost of development time. We don't want to spend valuable engineering time on things that are easily solved with existing tools, and instead, use that time to focus on solving the hard problems that require engineering focus. There are also many features that people have come to expect in their search solutions; for example, highlighting, spelling correction, and typo tolerance. These are all things that semantic search struggles with out of the box today, but many UI/UX folks consider these table stakes in terms of user functionality. As far as queries that semantic search may not do well with, these are typically niche queries. Examples include: Exact matches such as model numbers Domain specific jargon We also have to consider requirements including business rules (for example boosting based on popularity, conversions, or campaigns), which semantic search by itself may not handle natively. Query understanding is another issue. This could be as simple as handling numeric conversions and units of measurement, or it could be very complex such as handling negatives. You may have had a frustrating experience searching for a negative, such as “I want a restaurant that doesn't serve meat”. 
LLMs may be OK at returning vegetarian restaurants here, but most semantic search is going to return you restaurants that serve meat! These are hard problems to solve and they're the ones we want to spend our valuable engineering time on. Benefits of hybrid search Hybrid search is the best of both worlds: it combines the precision and functionality of BM25 text search with the semantic understanding of vector search. This leads to both better recall and better overall relevance. To help put this in perspective, let's look at some examples: Real estate: Modern farmhouse with lots of land and an inground pool in the 12866 zip code. Whether the house has a pool and its ZIP code can be filters, and semantic search can be used over the style description. eCommerce: Comfortable Skechers with memory foam insoles in purple. The color and brand can be filters, and the rest can be covered with semantic search. Job hunting: Remote software engineer jobs using Elasticsearch and cloud native technologies. The job title and preference for remote work can be filters, and the job skills can be handled with semantic search. In all the above examples, the query has something specific to filter on along with more vague text that benefits from semantic understanding. What does a hybrid search look like in Elasticsearch? The phrase \"hybrid search\" is a little buzzwordy right now, and people might think of it differently in various scenarios. In some systems, where you have a separate vector database, this might involve multiple calls to different data stores and combining them with a service. But, one of the superpowers of Elasticsearch is that all of this can be combined in one single index and one search call. In Elasticsearch, a hybrid search may be as simple as a Boolean query . Here's an example of a Boolean query structure in Elasticsearch that combines text search, KNN searches, text expansion queries, and other supported query types. This can of course be combined with rescores, and everything else that makes Elasticsearch so powerful. Boolean queries are a very easy way to combine these text and vector searches into one, single query. One note about this example is that KNN was introduced as a query in addition to the top level search in 8.12, making this query structure even more powerful. Another option is to use retrievers , which starting with Elasticsearch 8.14.0 are an easier way of describing these complex retrieval pipelines. Here is an example that combines a standard query as a retriever, with a kNN query as a retriever, all rolled up to use Reciprocal Rank Fusion (RRF) to rank the results. Combining result sets Now that you have a hybrid search query, how do you combine all this into a single result set? This is a hard problem, especially when the scores are virtually guaranteed to be vastly different depending on how the results were retrieved. The classic way, using the Boolean query example, is with linear combination where you can apply boosts to each individual clause in the larger query. This is tried and true, nice old technology that we all know and love, but it can be finicky. It requires tuning to get right and then you may never get it perfect. If you're using retrievers you can also use RRF . This is easier - you can rely on an algorithm and don't have to do any tuning. There are some trade-offs - you have less fine grained control over your result sets. 
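For orientation, here is a rough sketch of what such a retriever-based hybrid query can look like, using the real estate example from earlier (the index, field names, and embedding model ID are placeholders rather than a drop-in query):

```
GET listings/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        {
          "standard": {
            "query": {
              "bool": {
                "must": { "match": { "description": "modern farmhouse with lots of land" } },
                "filter": [
                  { "term": { "zip_code": "12866" } },
                  { "term": { "has_inground_pool": true } }
                ]
              }
            }
          }
        },
        {
          "knn": {
            "field": "description_embedding",
            "query_vector_builder": {
              "text_embedding": {
                "model_id": "my-embedding-model",
                "model_text": "modern farmhouse with lots of land and an inground pool"
              }
            },
            "k": 10,
            "num_candidates": 100
          }
        }
      ],
      "rank_window_size": 50
    }
  }
}
```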
RRF doesn't take BM25 boosting into account, so if you're boosting on business rules, you might not get the results you want out of the box. Ultimately the method you should choose depends on your data and your use case. Tuning for relevance Once you've created your query, tuning for relevance is a hard problem to solve, but you have several tools at your disposal: Business metrics. These are the most important metrics in a lot of ways: Are users clicking on results, and in eCommerce use cases for example better yet are they completing purchases? Is your conversion rate increasing? Are users spending a decent amount of time reading the content on your site? These are all measures of user experience but they’re gathered through analytics and they’re direct proof of whether your search is providing results that are actually useful. For use cases like RAG, where the results are custom, subjective, and subject to change, this might be the only way to really measure the impact of your search changes. User surveys. Why not ask users if they thought the results were good and bad? You do have to take some things into account such as whether users will provide truthful responses, but it’s a good way of taking a pulse of what users think of your search engine. Quantitative ways of measuring relevance such as MAP and NDCG. These metrics require judgment lists which can then also be used for Learn to Rank. The single biggest trap that people can fall into, though, is tuning for one or a few “pet” queries: the handful of queries that you - or maybe your boss - enters. You can change everything about your algorithm to get that best top result for that one query, but then it can have cascading effects downstream, because now you’ve unknowingly messed up the bulk of your other queries. The good news is that there are some tools available to help! Applying tools for search relevance tuning Query rules Remember that pet query? Well I have good news for you - you can still have great results for that pet query without modifying your relevance algorithm, using the concept of pinned or promoted documents. At Elastic, we call these query rules. Query rules allow you to send in some type of context, such as a user-entered query string, and if it matches some criteria we can configure specific documents that we want to rank first, second, third, etc. One great use case for query rules is the application of business rules. Another use case is “fixing” relevance. Overall relevance shouldn't be nitpicky, and we should rely on methods like ranking, reranking, and/or RRF to get it right. But there are always exceptions. Maybe overall relevance is good enough, but you have a couple of queries that people complain about? OK, just set up a rule. But you can go further if you want: it can potentially be a worthwhile investment to take a quick pass through your head queries to make sure that they're returning the right information and these users are getting a good search experience. It's not cheating to correct some of the common user-entered queries, and then focus on improving your torso and tail queries through the power of semantic search where it really shines. So how does this work? Elastic query rules are defined by creating a query ruleset, or a list of one or more rules. Each rule has criteria that must match the rule in order for a query to be applied, and then actions that we take on the rule if it matches. A rule can have multiple criteria, based on the metadata you send in from the client. 
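For example, a ruleset along those lines can be created with the query rules API, roughly like this (the ruleset name, metadata keys, values, and document IDs below are illustrative):

```
PUT _query_rules/my-ruleset
{
  "rules": [
    {
      "rule_id": "promote-sale-page",
      "type": "pinned",
      "criteria": [
        { "type": "exact", "metadata": "user_query", "values": ["running shoes"] },
        { "type": "exact", "metadata": "user_country", "values": ["us"] }
      ],
      "actions": {
        "ids": ["sale-landing-page-doc"]
      }
    }
  ]
}
```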
In this example, a user's query string and their location was sent in a rule - both of those criteria would have to be met in order for the rule to match. To trigger these rules at search time, you send in a corresponding rule query that specifies the metadata that you want to match on. We'll apply all matching rules in the ruleset, in order, and pin the documents that you want to come back. We're currently working on plans to make this feature generally available and extend the functionality: for example to support tokenizers and analyzers on rule criteria, making it easier for non-technical people to manage query rules, and to potentially provide additional actions on top of just pinning documents. You can read more about query rules and how to use them in this guide and corresponding blog post . Synonyms Next, let's talk about synonyms. Maybe you have some domain specific jargon that is unique to only your business and isn't in any of the current models - and you don't necessarily want to take on the expense to fine tune and train your own model. For example: while ELSER will recognize both pug and beagle as related to dog , it will not recognize puggle (a crossbeed of pug and beagle) as a dog . Synonyms can help here! Synonyms are a great way of translating this domain specific terminology, slang, and alternate ways of saying a word that may just be too specialized for a model to return the matches we want. In Elasticsearch, we used to manage this in a way that required a lot of manual overhead - you had to upload synonyms files and reload analyzers. In Elasticsearch 8.10, we introduced a synonyms API that makes this easier. Similar to query rules, you create synonyms sets with one or more defined synonyms, and then when you use the API to add, update, or remove synonyms it reloads analyzers for you - pretty easy! You can then update your mappings to define a custom analyzer that uses this synonyms set. The nice thing about synonyms being supported with analyzers is that when we do support analyzers in query rules in the future, we'll be able to support synonyms as well out of the box. You can read more about the synonyms API and how to use it in this guide and corresponding blog post . Wrapping up Semantic search doesn't replace BM25 search, it's an enhancement to existing search technologies. Hybrid search solves many problems innate to semantic search and is the best of both worlds in terms of both recall and functionality. Semantic search really shines with long tail and torso queries. Tools like query rules and synonyms can help provide the best search experience possible while freeing up valuable developer time to focus on solving important problems. As the landscape evolves, we're getting better and better at solving some of the hard problems that come with semantic search, and making it easier to use both semantic and hybrid search through simplification and tooling. Our goal as search practitioners is to return the best results possible. Our other goal is to do this as easily as possible, and minimize costs - those costs include money and time, and time can mean latency or engineering overhead. We don't want to waste that valuable engineering time - we want to spend it solving hard problems! You can try the features I've talked about out in Cloud today! Be sure to head over to our discuss forums and let us know what you think. 
Watch the Haystack talk Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Introduction Lexical search toolbox for relevance tuning Challenges with semantic search Benefits of hybrid search What does a hybrid search look like in Elasticsearch? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Search relevance tuning: Balancing keyword and semantic search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/search-relevance-tuning-in-semantic-search", + "meta_description": "This blog offers practical strategies for tuning search relevance that can be complementary to semantic search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ML Research ST By: Shubha Anjur Tupil On December 10, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Take your search experiences up to level 11 with our new state-of-the-art cross-encoder Elastic Rerank model (in Tech Preview). 
Reranking models provide a semantic boost to any search experience, without requiring you to change the schema of your data, giving you room to explore other relevance tools for semantic relevance on your own time and within your budget. Semantic boost your keyword search : Regardless of where or how your data is stored, indexed or searched today, semantic reranking is an easy additional step that allows you to boost your existing search results with semantic understanding. You have the flexibility to apply this as needed– without requiring changes to your existing data or indexing pipelines and you can do this with an Elastic foundational model as your easy first choice. Flexibility of choice for any budget : All search experiences can be improved with the addition of semantic meaning which is typically applied by utilizing a dense or sparse vector model such as ELSER. However, achieving your relevance goals doesn’t require a one-size-fits-all solution, it’s about mixing and matching tools to balance performance and cost. Hybrid search is one such option, improving relevance by combining semantic search with keyword search using reciprocal rank fusion (RRF) in Elasticsearch. The Elastic Rerank model is now an additional lever to enhance search relevance in place of semantic search, giving you the flexibility to optimize for both relevance and budget. First made available on serverless, but now available in tech preview in 8.17 for Elasticsearch, the benefits of our model exceed those of other models in the market today. Performant and efficient : The Elastic Rerank model outperforms other significantly larger reranking models. Built on the DeBERTa v3 architecture, it has been fine-tuned by distillation on a diverse dataset. Our detailed testing shows a 40% uplift on a broad range of retrieval tasks and up to 90% on question answering data sets. As a comparison, the Elastic Rerank model is significantly better or comparable in terms of relevance even with much larger models. In our testing a few models, such as bge-re-ranker-v2-gemma , came closest in relevance but are an order of magnitude larger in terms of parameter count. That being said, we provide integrations in our Open Inference API to enable access to other third-party rerankers, so you can easily test and see for yourself. Using the Elastic Rerank model Not only are the performance and cost characteristics of the Elastic Rerank model great, we have also made it really easy to use to improve the relevance for lexical search. We want to provide easy to use primitives that help you build effective search, quickly, and without having to make lots of decisions; from which models to use, to how to use them in your search pipeline. We make it easy to get started and to scale. You can now use Elastic Rerank using the Inference API with the text_similiarity_reranker retriever. Once downloaded and deployed each search request can handle a full hybrid search query and rerank the resulting set in one simple _search query. It’s really easy to integrate the Elastic Rerank model in your code, to combine different retrievers to combine hybrid search with reranking. Here is an example that uses ELSER for semantic search, RRF for hybrid search and the reranker to rank the results. If you have a fun dataset like mine that combines the love of AI with Cobrai Kai you will get something meaningful. 
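A sketch of the shape of such a request is below — the index, field names, query text, and inference endpoint IDs are placeholders, and the ELSER and rerank endpoints are assumed to already exist via the Inference API:

```
GET my-index/_search
{
  "retriever": {
    "text_similarity_reranker": {
      "retriever": {
        "rrf": {
          "retrievers": [
            {
              "standard": {
                "query": { "match": { "text": "who teaches karate at the dojo" } }
              }
            },
            {
              "standard": {
                "query": {
                  "sparse_vector": {
                    "field": "text_embedding",
                    "inference_id": "my-elser-endpoint",
                    "query": "who teaches karate at the dojo"
                  }
                }
              }
            }
          ]
        }
      },
      "field": "text",
      "inference_id": "my-elastic-rerank-endpoint",
      "inference_text": "who teaches karate at the dojo",
      "rank_window_size": 50
    }
  }
}
```

The first-stage retrievers produce a fused candidate list, and the text_similarity_reranker then rescores the top rank_window_size hits with the cross-encoder before returning results.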
Summary: The Elastic Rerank Model English only cross-encoder model Semantic Boost your Keyword Search with little to no changes how data is indexed and searched already More control and flexibility over the cost of semantic boosting decoupled from indexing and search Reuse the data you already have in Elasticsearch Delivers significant improvements in relevance and performance (40% better on average for a large range of retrieval tasks and up to 90% better on question answering tasks as compared to significantly larger models, tested with over 21 datasets with an average of +13 points nDCG@10 improvement) Easy-to-use, out-of-the-box; built into the Elastic Inference API, easy to load and use in search pipelines Available in technical preview on across our product suite, easiest way to get started is on Elasticsearch Serverless If you want to read all the details of how we built this, head over to our blog on Search Labs . Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 5, 2024 Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros Jump to Using the Elastic Rerank model Summary: The Elastic Rerank Model Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-rerank-model-introduction", + "meta_description": "Learn about the Elastic Rerank model and explore how to use it through practical examples, including how to to integrate the rerank model in your code." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. Lucene Vector Database BT By: Benjamin Trent On February 27, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. For years, Apache Lucene and Elasticsearch have supported filtered search with kNN queries, allowing users to retrieve the nearest neighbors that meet a specified metadata filter. However, performance has always suffered when dealing with semi-restrictive filters. In Apache Lucene, we are introducing a variation of ACORN-1 —a new approach for filtered kNN search that achieves up to 5x faster searches with little drop in recall. This blog goes over the challenges of filtered HNSW search, explaining why performance slows as filtering increases, and how we improved HNSW vector search in Apache Lucene with the ACORN-1 algorithm. Why searching fewer docs is actually slower Counterintuitively, filtering documents—thereby reducing the number of candidates—can actually make kNN searches slower. For traditional lexical search, fewer documents means fewer scoring operations, meaning faster search. However, in an HNSW graph, the primary cost is the number of vector comparisons needed to identify the k nearest neighbors. At certain filter set sizes, the number of vector comparisons can increase significantly, slowing down search performance. Here is an example of unfiltered graph search. Note there are about 6 vector operations. Because the HNSW graph in Apache Lucene has no knowledge of filtering criteria when built, it constructs purely based on vector similarity. When applying a filter to retrieve the k nearest neighbors, the search process traverses more of the graph. This happens because the natural nearest neighbors within a local graph neighborhood may be filtered out , requiring deeper exploration and increasing the number of vector comparisons. Here is an example of the current filtered graph search. The “dashed circles” are vectors that do not match the filter. We even make vector comparisons against the filtered out vectors, resulting in more vector ops, about 9 total. You may ask, why perform vector comparisons against nodes that don’t match the filter at all? Well, HNSW graphs are already sparsely connected. If we were to consider only matching nodes during exploration, the search process could easily get stuck , unable to traverse the graph efficiently. Note the filtered “gulf” between the entry point and the first valid filtered set. In a typical graph, it's possible for such a gap to exist, causing exploration to end prematurely and resulting in poor recall. We gotta make this faster Since the graph doesn’t account for filtering criteria, we have to explore the graph more. Additionally, to avoid getting stuck, we must perform vector comparisons against filtered-out nodes. How can we reduce the number of vector operations without getting stuck? This is the exact problem tackled by Liana Patel et. al. in their ACORN paper. While the paper discusses multiple graph techniques, the specific algorithm we care about with Apache Lucene is their ACORN-1 algorithm. The main idea is that you only explore nodes that satisfy your filter. 
To compensate for the increased sparsity, ACORN-1 extends the exploration beyond the immediate neighborhood. Now instead of exploring just the immediate neighbors, each neighbor’s neighbor is also explored. This means that for a graph with 32 connections, instead of only looking at the nearest 32 neighbors, exploration will attempt to find matching neighbors in 32*32=1024 extended neighborhood. Here you can see the ACORN algorithm in action. Only doing vector comparisons and exploration for valid matching vectors, quickly expanding from the immediate neighborhood. Resulting in much fewer vector ops, about 6 in total. Within Lucene, we have slightly adapted the ACORN-1 algorithm in the following ways. The extended neighborhoods are only explored if more than 10% of the vectors are filtered out in the immediate neighborhood. Additionally, the extended neighborhood isn’t explored if we have already scored at least neighborCount * 1.0/(1.0 - neighborFilterRatio) . This allows the searcher to take advantage of more densely connected neighborhoods where the neighborhood connectedness is highly correlated with the filter. We also have noticed both in inversely correlated filters (e.g. filters that only match vectors that are far away from the query vector) or exceptionally restrictive filters, only exploring the neighborhood of each neighbor isn’t enough. The searcher will also attempt branching further than the neighbors’ neighbors when no valid vectors passing the filter are found. However, to prevent getting lost in the graph, this additional exploration is bounded. Numbers don’t lie Across multiple real-world datasets, this new filtering approach has delivered significant speed improvements . Here is randomly filtering at 0.05% 1M Cohere vectors : Up and to the left is “winning”, which shows that the candidate is significantly better. Though, to achieve the same recall, search parameters (e.g. num_candidates ) need to be adjusted. To further investigate this reduction in improvement as more vectors pass the filter, we did another test over an 8M Cohere Wiki document data set . Generally, no matter the number of vectors filtered, you want higher recall, with fewer visited vectors. A simple way to quantify this is by examining the recall-to-visited ratio . Here we see how the new filtered search methodology achieves much better recall vs. visited ratio. It's clear that near 60%, the improvements level off or disappear. Consequently, in Lucene, this new algorithm will only be utilized when 40% or more of the vectors are filtered out. Even our nightly Lucene benchmarks saw an impressive improvement with this change. Apache Lucene runs over 8M 768 document vectors with a random filter that allows 5% of the vectors to pass. These kinds of graphs make me happy. Gotta go fast Filtering kNN search over metadata is key for real world use-cases. In Lucene 10.2, we have made it as much as 5x faster, using fewer resources, and keeping high recall. I am so excited about getting this in the hands of our users in a future Elasticsearch v9 release. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. 
AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Why searching fewer docs is actually slower We gotta make this faster Numbers don’t lie Gotta go fast Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Filtered HNSW search, fast mode - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/filtered-hnsw-knn-search", + "meta_description": "Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans This series gives you an inside look at how we’re using generative AI in customer support. Join us as we share our journey in designing a GenAI chatbot interface. Inside Elastic IM By: Ian Moersen On August 9, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out part one , part two , part four , the launch blog , and part five . The idea of chatting via a web app has been around for a very long time. So you might think that means a GenAI chatbot would be a standard, boring interface to build. But it turns out an AI chatbot presents a few interesting and novel challenges. 
I’ll mention a few of them here, and hopefully, if you’re looking to build your own chat interface, you can use some of these tips and tricks to help you out. In my role as a UI designer, I like to make big deals about tiny things. Is the hex color for an avatar a shade too dark? I’m definitely complaining. Is the animation on this tooltip not eased properly? Let’s spend the time to track down the right bezier curves. No no, trust me, it’s definitely worth it. Is the font rendering slightly different on the new page? Oh yeah, you’re definitely going to be hearing about it from Ian. So when my team began work on a new automated Support Assistant, we had to decide: Do we pull a library off the shelf to handle the chat interface? Do we develop our own from scratch? For me, I hardly wanted to consider the former. Getting the small things right for our chatbot is a designer’s dream. Let’s do this. 1. Choosing the library So when I said “develop our own from scratch” earlier, I didn’t mean from scratch scratch. Sorry folks, this is 2024 AD, most people don’t develop UI components from scratch anymore. Many developers rely on component libraries to build new things, and at Elastic we’re no exception. Although we are pretty exceptional in one respect: We have our very own Elastic UI component library, and it’s free for anyone to use. EUI currently has no “ChatBot” component, but it does provide the avatars, “panels”, text areas, etc., one might need to create a nice little chat window. If you want to follow along with the rest of this post, feel free to open this sample EUI chat interface I made in another tab and you can give it a spin yourself. Have fun! 2. Animations... With some unlikely help After designing & assembling the major building blocks of the chat interface (which you can check out in the sandbox link above), one of our next challenges was how to keep users engaged during the sometimes-lengthy period of time the chatbot took to respond. To make matters worse, the first LLM endpoint we were using (for internal alpha-testing) wasn’t streaming its responses; it simply generated and sent the entire answer back to us in a single HTTP response body. This took forever. Not great.
Action | From -> to | Approx. observed latency
Initial request | Client -> server | 100 - 500ms
RAG search | Server -> cluster | 1 - 2.5s
Call to LLM | Server -> LLM | 1 - 2.5s
First streamed byte | LLM -> server -> client | 3 - 6s
Total | | 5.1 - 11.5 seconds
Our first line of defense here was a compelling “loading” animation. I wanted something custom, interesting to look at, but also one that stuck very closely to Elastic’s overall brand guidelines. To that end, I decided to use Elastic’s existing EuiIcon component to display three dots, then use Elastic brand colors and EUI’s default animation bezier curves —those mathematical descriptions of how animations appear to accelerate and decelerate—to keep things feeling “Elastic” as it pulsed, blinked and changed colors. Choreographing bouncing, color-changing and opacity fading in CSS was a bit outside my comfort zone. So instead of spending a whole day guessing at values to use, it occurred to me I had someone I could ask sitting right in front of me. That’s right, I asked (an early version of) the chatbot to program its own loading animation. It came up with something very nearly perfect on its first try. 
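To give a sense of what a loader along these lines can look like, here is a stripped-down sketch built from EuiIcon and a CSS pulse animation. The component name, timings, and easing curve are assumptions for illustration, not the actual Support Assistant code.

```tsx
import React from 'react';
import { EuiIcon } from '@elastic/eui';

// Hypothetical pulse: each dot fades and scales with a staggered delay.
const loaderCss = `
  @keyframes assistant-pulse {
    0%, 100% { opacity: 0.25; transform: scale(0.85); }
    50%      { opacity: 1;    transform: scale(1.1); }
  }
  .assistant-dot {
    display: inline-block;
    /* easing curve picked to "feel" like EUI's defaults; exact values are an assumption */
    animation: assistant-pulse 1.2s cubic-bezier(0.34, 0, 0.36, 1) infinite;
  }
`;

export const LoadingDots: React.FC = () => (
  <span aria-label="The assistant is thinking">
    <style>{loaderCss}</style>
    {[0, 0.2, 0.4].map((delay) => (
      <span key={delay} className="assistant-dot" style={{ animationDelay: `${delay}s` }}>
        <EuiIcon type="dot" color="accent" />
      </span>
    ))}
  </span>
);
```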
After a bit of fine-tuning and code refactoring, this was the result: (Bonus points if you can figure out which props to edit in the sandbox link above to see these loading dots yourself. All the code’s there!) This resulted in a pleasing little loading animation that I still enjoy looking at for a few seconds at a time; just what we needed! Now, whether a chatbot programming itself is existentially worrying?... That’s a question I’ll leave to the philosophers. But in my capacity as a web developer, I needed to focus on more practical matters. Like what we should do if the response from the LLM takes too long or drops entirely. 3. Killswitch engage Handling network timeouts and failures is pretty straightforward in most traditional web apps. Just check the error code of the response and handle those appropriately. Any additional handling for timeouts can be caught in a try/catch block or something similar. Generally, a typical HTTP fetch will know how to handle timeouts, which are usually configured to happen after a reasonably short amount of time and happen relatively rarely. The current state of generative AI API endpoints is not quite like that. Yes, occasionally, you’ll get a quick failure response with an error code, but remember that we’re streaming the LLM’s response here. Much more often than not, we receive a 200 OK from the API endpoint quickly, which tells us that the large language model is ready to begin streaming its response... But then it can take an exceedingly long time to receive any data at all. Or, partway through the stream, the trail goes cold, and the connection simply hangs. In either case, we didn’t want to rely on traditional network timeouts to give the user an option to retry their question. It is a much better user experience to have a short timeout on a failed attempt and then a quick, successful retry than a successful response that took way, way too long. So, after we found most failed streams will take over one minute to resolve, we went to work finding the shortest amount of time that would guarantee the stream was likely going to fail (or take an excessive amount of time to resolve). We kept cutting it shorter and shorter until we found that after only 10 seconds of radio silence, we could be nearly certain that the stream would either eventually fail or take longer than one minute to pick back up. Here’s some pseudocode illustrating the concept. It’s an example of the kind of code you might find in the primary function that calls a streaming LLM API after a user asks a question. By just being a little clever with AbortController signals and a setTimeout , you can implement a “killswitch” on the fetch() function to quickly return an error to the user if the stream goes dead for more than 10 seconds: So after solving these problems, and probably about a hundred others, it was time to focus on another challenge unique to site-wide generative AI interfaces: Context. 4. Chat history context While chatting with an AI assistant, you expect it to have the context of your previous messages. If you ask it to clarify its answer, for example, it needs to “remember” the question you asked it, as well as its own response. You can’t just send “Can you clarify that?” all by itself to the LLM and expect a useful response. Context, within a conversation, is straightforward to find and send. Just turn all previous chat messages into a JSON object and send it along with the latest question to the LLM endpoint. 
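Before moving on to context, here is a minimal sketch of the killswitch idea described in section 3. It is not the post's original pseudocode; the endpoint, payload shape, and handler names are assumptions used purely to show the AbortController-plus-setTimeout pattern.

```typescript
// Abort a streaming LLM fetch if no bytes arrive for 10 seconds.
async function streamAnswer(
  question: string,
  onChunk: (text: string) => void,
): Promise<void> {
  const controller = new AbortController();
  const KILLSWITCH_MS = 10_000;

  // (Re)arm the killswitch; if it fires, the in-flight request is aborted.
  let killswitch = setTimeout(() => controller.abort(), KILLSWITCH_MS);
  const rearm = () => {
    clearTimeout(killswitch);
    killswitch = setTimeout(() => controller.abort(), KILLSWITCH_MS);
  };

  try {
    const response = await fetch('/api/assistant/chat', { // hypothetical endpoint
      method: 'POST',
      body: JSON.stringify({ question }),
      signal: controller.signal,
    });
    if (!response.ok || !response.body) throw new Error(`HTTP ${response.status}`);

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { done, value } = await reader.read(); // an abort makes this read reject
      if (done) break;
      rearm(); // data arrived, reset the 10-second window
      if (value) onChunk(decoder.decode(value, { stream: true }));
    }
  } finally {
    clearTimeout(killswitch);
  }
}
```

When the controller aborts, the pending read() rejects with an AbortError, which the UI can translate into a quick retry prompt instead of leaving the user staring at a stalled stream.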
Although there may be a few smaller considerations to make—like how to serialize and store metadata or RAG results—it is comparatively uncomplicated. Here’s a bit of pseudocode illustrating how to enrich a default prompt with conversational context. But what about other types of context? For example: When you’re reading a support case and see a chat widget on the page, wouldn’t it make sense to ask the AI assistant “how long has this case been open?” Well, to provide that answer, we’ll need to pass the support case itself as context to the LLM. But what if you’re reading that support case and one of the replies contains a term you don’t understand? It would make sense to ask the assistant to explain this highly technical term. Well, for that, we’ll need to send a different context to the LLM (in our case, the results from a search of our knowledge base for that technical term). How do we convey to the user something as complex and unique as context, in order to orient them in the conversation? How can we also let users choose which context to send? And, maybe hardest of all, how do we do all of this with such a limited number of pixels? After designing and evaluating quite a few options (breadcrumbs? a sticky alert bar within the chat window? tiny little badges??), we settled on a “prepended” element to the text input area. This keeps context right next to the “action item” it describes; context is attached only to your next question, not your last answer!
UI Element | Pros | Cons
Breadcrumbs | Small footprint, easy to interact with | Better for representing URLs and paths
Banner at top | Out of the way, allows for long description | Not easy to interact with, can get lost
Micro-badges | Easy to display multiple contexts | Difficult to edit context
Prepended menu w/ number badge | Close to input field, easy to interact with | Tight squeeze in the space available
Additionally, an EUI context menu can be used to allow power users to edit their context. Let’s say you want to ask the Assistant something that would require both the case history and a thorough search of the Elastic knowledge base; those are two very different contexts. “How do I implement the changes the Elastic engineer is asking me to make?” for example. You could then use the context menu to ensure both sources of information are being used for the Assistant’s response. This also gives us more flexibility. If we want the LLM itself to determine context after each question, for example, we’d be able to display that to the user easily, and the little pink notification badge could alert users if there are any updates. These were just a handful of the many smaller problems we needed to solve while developing our GenAI Support Assistant interface. Even though it seems like everyone’s releasing a chatbot these days, I hadn’t seen many breakdowns of real-life problems one might encounter while engineering the interfaces and experiences. Building a frictionless interface with emphasis on making streaming experiences feel snappy, making affordances for unexpected timeouts and designing for complex concepts like chat context with only a few pixels to spare are just a few of the problems we needed to solve. Implementing an AI chatbot naturally puts the bulk of the engineering focus on the LLM and backend services. However, it’s important to keep in mind that the UX/UI components of a new tool will require adequate time and attention as well. 
Even though we’re building a generation of products that use AI technology, it’s always going to be important to design for humans. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Jump to 1. Choosing the library 2. Animations... With some unlikely help 3. Killswitch engage 4. Chat history context Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/genai-elastic-elser-chat-interface", + "meta_description": "Discover how we designed a GenAI chatbot interface for customer support. Learn about AI chat interfaces​, animations, chat history context, handling timeouts, and more." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. AutoOps How To SF By: Sachin Frayne On November 20, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. There are a number of ways that hotspotting can occur in an Elasticsearch cluster. 
Some we can control, like noisy neighbors, and some we have less control over, like the shard allocation algorithm in Elasticsearch. The good news is that the new desired_balance cluster.routing.allocation.type algorithm (see shards-rebalancing-heuristics) is much better at determining which nodes in the cluster should get the new shards. If there is an imbalance present, it will figure out the optimal balance for us. The bad news is that older Elasticsearch clusters are still using the balanced allocation algorithm, which has a more limited calculation that is prone to making mistakes when choosing nodes, and that can lead to imbalanced or hotspotted clusters. In this blog we will explore this old algorithm, how it is supposed to work and when it does not, and what we can do to address it. We will then go through the new algorithm and how it solves this problem, and finally we will look at how we used AutoOps to highlight this issue for a customer use case. We will, however, not go into all the causes of hotspotting, nor into all the specific solutions, as they are quite numerous. Balanced allocation In Elasticsearch 8.5 and earlier, we used the following method to determine on which node to place a shard; it mostly came down to choosing the node with the least number of shards: https://github.com/elastic/elasticsearch/blob/8.5/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java#L242
The weight that code computes for each candidate node boils down to:
weight(node, index) = theta0 * (node.numShards() - balancer.avgShardsPerNode()) + theta1 * (node.numShards(index) - balancer.avgShardsPerNode(index))
where:
node.numShards(): the number of shards allocated to a specific node in the cluster
balancer.avgShardsPerNode(): the mean of the shards across all the nodes in the cluster
node.numShards(index): the number of shards for a specific index allocated to a specific node in the cluster
balancer.avgShardsPerNode(index): the mean of the shards for a specific index across all the nodes in the cluster
theta0 (cluster.routing.allocation.balance.shard): weight factor for the total number of shards, defaults to 0.45f; increasing this raises the tendency to equalise the number of shards per node (see Shard balancing heuristics settings)
theta1 (cluster.routing.allocation.balance.index): weight factor for the total number of shards per index, defaults to 0.55f; increasing this raises the tendency to equalise the number of shards per index per node (see Shard balancing heuristics settings)
The target for this algorithm is to pick a node in such a way that the weight across all the nodes in the cluster gets back to 0, or as close to 0 as possible. Example Let's explore a situation where we have 2 nodes and 1 index made up of 3 primary shards, and let's assume we have 1 shard on node 1 and 2 shards on node 2. What should happen when we add a new index to the cluster with 1 shard?
weightNode1 = 0.45f(1 - 1.5) + 0.55f(0 - 0) = -0.225
weightNode2 = 0.45f(2 - 1.5) + 0.55f(0 - 0) = 0.225
Since the new index has no shards anywhere else in the cluster, the weightIndex term reduces to 0. As we can see in the next calculation, adding the shard to node 1 brings the balance back to 0, so we choose node 1.
weightNode1 = 0.45f(2 - 2) + 0.55f(0 - 0) = 0
weightNode2 = 0.45f(2 - 2) + 0.55f(0 - 0) = 0
Now let's add another index with 2 shards. The first shard will go randomly to one of the nodes, since we are now balanced. Assuming node 1 was chosen for the first shard, the second shard will go to node 2:
weightNode1 = 0.45f(3 - 2.5) + 0.55f(1 - 0.5) = 0.5
weightNode2 = 0.45f(2 - 2.5) + 0.55f(0 - 0.5) = -0.5
The new balance will finally be:
weightNode1 = 0.45f(3 - 3) + 0.55f(0 - 0) = 0
weightNode2 = 0.45f(3 - 3) + 0.55f(0 - 0) = 0
This algorithm works well if all indices/shards in the cluster are doing approximately the same amount of work in terms of ingest, search and storage requirements. In reality, most Elasticsearch use cases are not this simple and the load across the shards is not always the same. Imagine the following scenario. Image 1: Elasticsearch cluster (the exaggerated size of the shards represents how “busy” the shards actually are)
Index 1: small search use case with a few thousand documents, incorrect number of shards
Index 2: very large index, but not being actively written to and only occasionally searched
Index 3: light indexing and searching
Index 4: heavy ingest of application logs
Let’s suppose we have 3 nodes and 4 indices with only primary shards, deliberately in an unbalanced state. To visually understand what is going on, I have exaggerated the size of the shards according to how busy they are and what busy could mean (write, read, CPU, RAM or storage). Even though node 3 already has the busiest index, new shards will route to that node. Index lifecycle management (ILM) won’t solve this situation for us: when the index is rolled over, the new shards will be placed on node 3. We could manually ease this problem by forcing Elasticsearch to spread the shards evenly using cluster reroute, but this does not scale, as our distributed system should take care of this. Still, without any rebalance or other kind of intervention, this situation will remain and potentially get worse. What’s more, while this example is fabricated, this kind of distribution is inevitable in older Elasticsearch clusters with mixed use cases (i.e., search, logging, security), especially when one or more of the use cases involves heavy ingest, and determining when it will occur is not trivial. While the timeframe to predict this issue is complicated, a good solution that works well in some use cases is to keep the shard density across all indices the same; this is achieved by rolling all indices when their shards get to a predetermined size in gigabytes (see size your shards). This does not work in all use cases, as we will see in the cluster caught by AutoOps below. 
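To make the arithmetic above easy to replay, here is a tiny sketch of the same weight function using the definitions listed earlier. It is a simplification for illustration, not the Lucene code itself.

```typescript
const THETA0 = 0.45; // cluster.routing.allocation.balance.shard
const THETA1 = 0.55; // cluster.routing.allocation.balance.index

// weight(node, index) as described above, for one candidate node.
function weight(
  nodeShards: number,        // node.numShards()
  avgShardsPerNode: number,  // balancer.avgShardsPerNode()
  nodeIndexShards: number,   // node.numShards(index)
  avgIndexShards: number,    // balancer.avgShardsPerNode(index)
): number {
  return THETA0 * (nodeShards - avgShardsPerNode) + THETA1 * (nodeIndexShards - avgIndexShards);
}

// First example: 2 nodes, an existing 3-shard index (1 shard on node 1, 2 on node 2),
// and a brand-new 1-shard index being placed, so the per-index term is (0 - 0).
console.log(weight(1, 1.5, 0, 0)); // node 1: -0.225 -> lowest weight, gets the shard
console.log(weight(2, 1.5, 0, 0)); // node 2:  0.225
```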
Desired balance allocation To address this issue and a few others, a new algorithm that can take into account both write load and disk usage was initially released in 8.6 and underwent some minor, yet meaningful, changes in versions 8.7 and 8.8: https://github.com/elastic/elasticsearch/blob/8.8/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java#L305 node.writeLoad() : the write or indexing load of a specific node balancer.avgWriteLoadPerNode() : the mean write load across the cluster node.diskUsageInBytes() : the disk usage for a specific node balancer.avgDiskUsageInBytesPerNode() : the mean disk usage across the cluster theta2 : ( cluster.routing.allocation.balance.write_load ) weight factor for the write load, defaults to 10.0f, increasing this raises the tendency to equalise the write load per node, (see Shard balancing heuristics settings ) theta3 : ( cluster.routing.allocation.balance.disk_usage ) weight factor for the disk usage, defaults to 2e-11f, increasing this raises the tendency to equalise the disk usage per node, (see Shard balancing heuristics settings ) I will not go into detail in this blog on the calculations that this algorithm is doing, however the data that is used by Elasticsearch to decide where the shards should live is available via an API: Get desired balance . It is still a best practice to follow our guidance when you size your shards and there are still good reasons to separate out use cases into dedicated Elasticsearch clusters. Yet this algorithm is much better at balancing Elasticsearch, so much so that it resolved the balancing issues for our customer below. (If you are facing the issue described in this blog, I recommend that you upgrade to 8.8). One final thing to note, this algorithm does not take into account search load, this is not trivial to measure and even harder to predict. Adaptive replica selection , introduced in 6.1, goes a long way to addressing search load. In a future blog we will dive deeper into the topic of search performance and specifically how we can use AutoOps to catch our search performance issues before they occur. Detecting hotspotting in AutoOps Not only is the situation described above difficult to predict, but when it occurs it used to be difficult to detect, it takes a lot of internal understanding of Elasticsearch and very specific conditions for our clusters to end up in this state. Now with AutoOps detecting this issue is a cinch. Let’s see a real world example; In this setup there is a queueing mechanism in front of Elasticsearch for spikes in the data, however the use case is near real time logs - sustained lag is not acceptable. We had a situation with sustained lag that we had to troubleshoot. Starting in the cluster view we pick up some useful information, in the image below we learn that there are 3 master nodes, 8 data nodes (and 3 other nodes that are not interesting to the case). We also learn that the cluster is red, (this could be networking or performance issues), the version is 8.5.1 and there are 6355 shards; these last 2 will become important later. Image 2: Cluster Info There is a lot going on in this cluster, it is going red a lot, these are related to the nodes leaving the cluster. The nodes are leaving the cluster around the time that we observe indexing rejections and the rejections are happening shortly after the indexing queues are getting filled up too frequently, the darker the yellow, the more high indexing events in the time block. 
Image 3: Timeline of events in the cluster, (importantly let’s highlight Data Node Disconnected) Moving to the node view and focusing in on the timeframe around the last node disconnect we can see that another node, node 9, has a much higher indexing rate than the rest of the nodes, followed by the second highest indexing rate observed on node 4, which has had some disconnects earlier in the month. You will also notice that there is a fairly large drop in indexing rate around the same timeframe, this was in fact also related to intermittent latency in this particular cluster between the compute resources and the storage. Image 4: Data node 9, high indexing rate. AutoOps by default will only report nodes that disconnect for more than 300 seconds, but we know that other nodes including node 9 are frequently leaving the cluster, as can be seen in the image below, the number of shards on the node are growing too fast to be moving shards, therefore they must be re-initialising after a node disconnect/restart. With these pieces of information we can safely conclude that the cluster is experiencing a performance issue, but not only that it is a hotspotting performance issue. Since Elasticsearch works as a cluster, it can only work as fast as its slowest node and since node 9 is being asked to do more work than the other nodes and it can’t keep up, the other nodes are always waiting for it and are occasionally getting disconnected themselves. Image 5: Data node 9, number of shards increasing. We do not need more information at this point, but to further illustrate the power of AutoOps below is another image which shows how much more work node 9 is doing than the other nodes, specifically how much data it is writing to disk. Image 6: Disk write and IOPS. We decided to move all the shards off of node 9, by randomly sending them to the rest of the nodes in the cluster; this was achieved with the following command. After this the indexing performance of the whole cluster improved and the lag disappeared. Now that we have observed, confirmed and circumvented the issue, we need to find a long term solution to the problem, which brings us back to the technical analysis at the beginning of the blog. The best practices were being followed, the shards rolled at a predetermined size and we were even limiting the number of shards for a specific index per node. We hit an edge case that the algorithm could not deal with, heavy index and frequently rolled indices. We thought about whether we could rebalance the cluster manually, but with around 2000 indices made up of 6355 shards, this was not going to be trivial, not to mention, with this level of indexing we would be racing against ILM to rebalance. This is exactly what the new algorithm was designed for and so our final recommendation is to upgrade the cluster. Final thoughts This blog is a summary of a fairly specific but complicated set of circumstances that can cause a problem with Elasticsearch performance. You may even see some of these issues in your cluster today but may never get into a position where your cluster is affected as badly as this user was. This case underscores the importance of keeping up with the latest versions of Elasticsearch to consistently take advantage of the latest innovations in managing data better and it helps to showcase the power of AutoOps in finding/diagnosing and alerting us to issues, before they become full production incidents. 
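As a side note to the remediation step above: the exact command used is not shown in the post, but a common way to drain a node like node 9 is the allocation exclude setting, which makes the allocator move every shard the node holds onto the remaining nodes. The node name and client setup below are placeholders, and this assumes the 8.x JavaScript client.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200' }); // placeholder connection details

// Equivalent to:
// PUT _cluster/settings
// { "persistent": { "cluster.routing.allocation.exclude._name": "node-9" } }
await client.cluster.putSettings({
  persistent: { 'cluster.routing.allocation.exclude._name': 'node-9' },
});

// Once the cluster has recovered (or after upgrading to the desired_balance allocator),
// clear the exclusion by setting the value back to null.
```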
Thinking about migrating to at least version 8.8 https://www.elastic.co/guide/en/elasticsearch/reference/8.8/migrating-8.8.html Report an issue Related content AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Balanced allocation Example Desired balance allocation Detecting hotspotting in AutoOps Final thoughts Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Hotspotting in Elasticsearch and how to resolve them with AutoOps - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/hotspot-elasticsearch-autoops", + "meta_description": "Explore hotspotting in Elasticsearch and how to resolve it using AutoOps.\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. Search Relevance IT By: Ioana Tagirta On April 16, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
Search with ES|QL With Elasticsearch 8.18 and 9.0, ES|QL adds a host of new functionalities, including: support for scoring semantic search more configuration options for the match function a new KQL function In this blog, we will review these 8.18 features and other exciting new features that we plan to add to ES|QL, reinforcing our investment in making ES|QL a modern search language ready to fit your needs, whether you are building a search application powered by ES|QL or analyzing your data in Kibana Discover. Introducing scoring In 8.17 we added the ability to filter documents using full text functions. If you are unfamiliar with full text filtering in ES|QL, we suggest reading our original blog post about it. With 8.18 and 9.0 we introduce support for scoring , making it possible to return documents in order of their relevance. To access the score for each document, simply add the metadata _score field to your ES|QL query: We retrieve the same scores we get from the equivalent search API query: Full text search functions such as match , qstr and kql can only be used in the context of a WHERE condition and are the only ones that contribute to the score. The _score column can not only be used to sort documents by relevance, but also in custom scoring formulas. In the next example, we keep only the most relevant results using a score threshold and then add a score boost based on the reader rating: Improving the match function In ES|QL, the match function simply translates to a Query DSL match query . In 8.18 and 9.0, we expanded the match function's capabilities to include all options that are currently available in Query DSL. It is now possible to set well-known match options such as boost, fuzziness and operator in ES|QL too: Enter semantic search The 8.18 release comes with the exciting announcement that semantic search is now generally available. We've expanded the match function to support querying over semantic_text field types. In ES|QL, executing a semantic query is now as simple as performing a full-text query, as shown in this example: In this example, we set semantic_title to use the semantic_text field type. Mapping your index fields as semantic_text is all it takes to set up your index for semantic search. Check our search with semantic text tutorial for more details. Hybrid search with ES|QL ES|QL makes it straightforward to do both semantic and lexical search at the same time. It is also possible to set different boosts, prioritizing results from semantic search or lexical search, depending on your use case: Transitioning from KQL If you are a long-term user of Kibana Discover and use KQL ( Kibana Query Language ) to query and visualize your data and you'd like to try ES|QL but don't know where to start, don't worry, we got you! In 8.18 and 9.0, ES|QL adds a new function which allows you to use KQL inside ES|QL . This is as simple as: ES|QL is already available in Kibana Discover. This way, you get the best of both worlds: you can continue to use KQL and start getting more familiar with ES|QL at your own pace. Check out our getting started with ES|QL guide for more information. Beyond 8.18 and 9.0 In future releases, we'll be adding more and more search capabilities to ES|QL, including vector search, semantic reranking, enhanced score customization options, and additional methods for combining hybrid search results, such as Reciprocal Rank Fusion (RRF). 
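To tie the pieces above together, here is roughly what a scored full-text ES|QL request looks like when sent to the _query endpoint. The index, field, and credentials are placeholders; the ES|QL string follows the 8.18 syntax described in this post.

```typescript
// ES|QL with scoring: ask for the _score metadata column, filter with match(),
// then sort by relevance. A kql("...") call could be used in the WHERE clause instead.
const esql = `
  FROM books METADATA _score
  | WHERE MATCH(title, "lucene")
  | SORT _score DESC
  | LIMIT 10
`;

const response = await fetch('https://localhost:9200/_query', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `ApiKey ${process.env.ES_API_KEY}`, // placeholder auth
  },
  body: JSON.stringify({ query: esql }),
});
console.log(await response.json());
```

Pointing the same match() call at a semantic_text field, as described above, turns this into a semantic query with no other changes to the pipeline.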
Try it out yourself These changes are available starting with Elasticsearch 8.18, but they are already available in Elasticsearch Serverless. For Elasticsearch Serverless, start a free trial cloud today or try Elastic on your local machine now! Follow the Search and filter in ES|QL tutorial for a hands-on introduction to the features described in this blog post! Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Search with ES|QL Introducing scoring Improving the match function Enter semantic search Hybrid search with ES|QL Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "ES|QL, you know, for Search - Introducing scoring and semantic search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-introducing-scoring-semantic-search", + "meta_description": "With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Elasticsearch in JavaScript the proper way Learn how to create a production-ready Elasticsearch backend in JavaScript, follow best practices, and run the Elasticsearch Node.js client in Serverless environments. Part1 Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Part2 Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch in JavaScript the proper way - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/elasticsearch-in-javascript", + "meta_description": "Learn how to create a production-ready Elasticsearch backend in JavaScript, follow best practices, and run the Elasticsearch Node.js client in Serverless environments.\n\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Semantic Text: Simpler, better, leaner, stronger Our latest semantic_text iteration brings a host of improvements. In addition to streamlining representation in _source, benefits include reduced verbosity, more efficient disk utilization, and better integration with other Elasticsearch features. You can now use highlighting to retrieve the chunks most relevant to your query. And perhaps best of all, it is now a generally available (GA) feature! Vector Database MP By: Mike Pellegrini On March 13, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. We’ve gone on quite a journey with the semantic_text field type, and this latest iteration promises to make semantic search simpler than ever. In addition to streamlining semantic_text representation in _source , benefits include reduced verbosity, more efficient disk utilization, and better integration with other Elasticsearch features. You can now use highlighting to retrieve the chunks most relevant to your query. And perhaps best of all, it is now a generally available (GA) feature! Semantic search evolution Our approach to semantic search has evolved over the years, with the goal of making it as simple as possible. Prior to the semantic_text field type, performing semantic search required manually: Configuring your mappings to be compatible with your embeddings. Configuring an ingest pipeline to use a ML model to generate embeddings. Using the pipeline to ingest your docs. Using the same ML model at query time to generate embeddings for your query text. We called this setup “easy” at the time , but we knew we could make it far simpler than this. Enter semantic_text . In the beginning We introduced semantic_text in Elasticsearch 8.15 with the goal of simplifying semantic search. If semantic_text is new to you, we suggest reading our original blog post about it first for background about our approach. 
We released semantic_text as a beta feature first for a good reason. It’s a well-known truism in software development that making something simple can be quite difficult, and semantic_text is no exception. There are a lot of moving pieces behind the scenes that enable the magical semantic_text experience. We wanted to take the time to make sure we had those pieces right before moving the feature out of beta. That time was well spent: We iterated on our original approach, adding features and streamlining storage to create a simpler, leaner version of semantic_text that is more supportable in the long-term. Our original implementation relied on modifying _source to store inference results. This meant that semantic_text fields had a relatively complex subfield structure: This structure created a few problems: It was needlessly verbose. In addition to the original value, it contained metadata and chunk information, which made API responses hard to read and larger than necessary. It increased index sizes on disk. Embeddings, which can be quite large, were effectively being stored twice: once in the Lucene index for retrieval purposes and again in _source . This significantly impacted the ability to use semantic_text at scale for larger datasets. It was unintuitive to manage. The original value provided was under the text subfield, which meant special handling was required to get this value for follow-up actions. This meant that semantic_text field values didn’t act like other field values in the text family, which had numerous knock-on effects that complicated our efforts to integrate it into higher-level workflows. Semantic text as text Our revised implementation elegantly improves on those friction points with a focused simplification in approach to how we represent semantic_text in _source . Instead of using a complex subfield structure to store metadata and chunk information directly within the semantic_text field, we use a hidden metafield for this purpose. This means we no longer need to modify _source to store inference results. In practical terms, it means that the document _source that you provide to us for indexing is the same _source that you will get back upon document retrieval. Notice that there are no longer subfields like text or inference in the _source representation. Instead, the _source is as you provided it. So much simpler! 🚨 Note that if you parse semantic_text field values returned in search results or Get APIs , this is a breaking change. That is to say, if you parse the infer_field.text subfield value, you will need to update your code to instead parse the infer_field value. We try our best to avoid breaking changes, but this was an unavoidable side-effect of removing the subfield structure from _source . There are numerous benefits to this _source representation simplification: Simpler to work with. You no longer need to parse a complex subfield structure to get the original text value, you can just take the field value as the original value. Less verbose. Metadata and chunk information does not clutter up API responses. More efficient disk utilization. Embeddings are no longer stored in _source . Better integration. It allows semantic_text to integrate better with other Elasticsearch features, such as multi-fields, partial document updates, and reindexing. Let’s expand on that last point a bit because it covers a few areas. 
With this simplification, semantic_text fields can now be used as the source and target of multi-fields : Semantic_text fields now also support partial document updates through the Bulk API : And you can now reindex into a semantic_text field that uses a different inference_id : Semantic highlighting One of the most requested semantic_text features is the ability to retrieve the most relevant chunks within a semantic_text field. This functionality is critical for RAG use cases. Up until now, we have (unofficially) accommodated this with some hacky workarounds involving inner_hits . However, we are retiring inner_hits in favor of a more streamlined solution: highlighting. Highlighting is a well-known lexical search technique one can apply to text fields. As a member of the text field family, it only makes sense to adapt the technique for semantic_text . To this end, we have added a semantic highlighter that you can use to retrieve the chunks that are most relevant to your query: See the semantic_text documentation for more information about how to use highlighting. Ready for primetime With the _source representation change in place, we are now officially announcing that semantic_text is a generally available (GA) feature 🎉! This means that we are committed to not making any more breaking changes to the feature and supporting it in production environments. As a customer, you should feel comfortable integrating semantic_text into your production workflows knowing that Elastic is committed to supporting you and providing long-term continuity. Migrating from beta To enable an orderly migration from the beta implementation, all indices with semantic_text fields created in Elasticsearch 8.15 to 8.17 or created in Serverless prior to January 30th will continue to operate as they do today. That is to say, they will continue to use the beta _source representation . We recommend migrating to the GA _source representation at your earliest convenience. You can do so by reindexing into a new index: Note the use of the script param to account for the _source representation change. The script is taking the value from the text subfield and assigning it directly to the semantic_text field value. Try it out yourself These changes will be available in stack hosted Elasticsearch 8.18+, but if you want to try them today, they are already available in Serverless. They also pair well with semantic search simplifications we are rolling out at the same time. Use both to take semantic search to the next level! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Semantic search evolution In the beginning Semantic text as text Semantic highlighting Ready for primetime Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Semantic Text: Simpler, better, leaner, stronger - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/semantic-text-ga", + "meta_description": "he Elasticsearch semantic_text field type is now GA. Explore the latest improvements: semantic highlighting, simpler representation in _source and more." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. Vector Database AL By: Andre Luiz On May 13, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. The use of embeddings to improve the relevance and accuracy of information retrieval has grown significantly over the years. Tools like Elasticsearch have evolved to support this type of data through specialized field types such as dense vectors, sparse vectors, and semantic text. However, to achieve good results, it is essential to understand how to properly map embeddings to the available Elasticsearch field types: semantic_text, dense_vector, and sparse_vector. In this article, we will discuss these field types, when to use each one, and how they relate to embedding generation and usage strategies, both during indexing and querying. Dense vector type The dense_vector field type in Elasticsearch is used to store dense vectors, which are numerical representations of text where almost all dimensions are relevant. These vectors are generated by language models, such as OpenAI, Cohere, and Hugging Face, and are designed to capture the overall semantic meaning of a text, even when it does not share exact terms with other documents. In Elasticsearch, dense vectors can have up to 4096 dimensions depending on the model used. For example, the all-MiniLM-L6-v2 model generates vectors with 384 dimensions, while OpenAI’s text-embedding-ada-002 produces vectors with 1536 dimensions. 
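As a point of reference, a minimal mapping for a dense_vector field sized for a model such as all-MiniLM-L6-v2 might look like the sketch below. The index and field names are made up, and this assumes the 8.x JavaScript client.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200' }); // placeholder connection details

await client.indices.create({
  index: 'articles',
  mappings: {
    properties: {
      content: { type: 'text' },
      content_embedding: {
        type: 'dense_vector',
        dims: 384,            // matches the all-MiniLM-L6-v2 output size
        similarity: 'cosine',
      },
    },
  },
});
```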
The dense_vector field is commonly adopted as the default type for storing this kind of embedding when greater control is needed, such as using pre-generated vectors, applying custom similarity functions, or integrating with external models. When and why to use dense_vector type? Dense vectors are excellent for capturing semantic similarity between sentences, paragraphs, or entire documents. They work very well when the goal is to compare the overall meaning of texts, even if they do not share the same terms. The dense vector field is ideal when you already have an external embedding generation pipeline using models such as OpenAI, Cohere, or Hugging Face and only want to store and query these vectors manually. This type of field offers high compatibility with embedding models and full flexibility in generation and querying, allowing you to control how the vectors are produced, indexed, and used during search. In addition, it supports different forms of semantic search, with queries such as KNN or script_score for cases where it is necessary to adjust the ranking logic. These possibilities make the dense vector ideal for applications such as RAG (Retrieval-Augmented Generation), recommendation systems, and personalized searches based on similarity. Finally, the field allows you to customize the relevance logic, using functions such as cosineSimilarity, dotProduct or l2norm to adapt the ranking according to the needs of your use case. Dense vectors remain the best option for those who need flexibility, customization, and compatibility with advanced use cases like the ones mentioned above. How to use the query for dense vector type? Searches on fields defined as dense_vector use the k-nearest neighbor query. This query is responsible for finding documents whose dense vector is closest to the query vector. Below is an example of how to apply a Knn query to a dense vector field: In addition to the Knn query, if there is a need to customize the document scoring, it is also possible to use the script_score query, combining it with vector comparison functions such as cosineSimilarity, dotProduct, or l2norm to calculate relevance in a more controlled way. See the example: If you want to dive deeper, I recommend exploring the article How to set up vector search in Elasticsearch. Sparse vector type The sparse_vector field type is used to store sparse vectors, which are numerical representations where most values are zero and only a few terms have significant weights. This type of vector is common in term-based models such as SPLADE or ELSER (Elastic Learned Sparse EncodeR). When and why to use sparse vector type? Sparse vectors are ideal when you need a more precise search in lexical terms, without sacrificing semantic intelligence. They represent the text as token/value pairs, highlighting only the most relevant terms with associated weights, which provides clarity, control and efficiency. This type of field is especially useful when you generate vectors based on terms, such as in the ELSER or SPLADE models, which assign different weights to each token based on its relative importance in the text. For the occasions when you want to control the influence of specific words in the query, sparse vector types allow you to manually adjust the weight of the terms to optimize the ranking of the results. 
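Picking up the kNN and script_score queries mentioned just above, hedged sketches of both are shown below. They assume the 'articles' index and 384-dimension 'content_embedding' field from the previous snippet, plus a query embedding produced outside Elasticsearch.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200' }); // placeholder connection details
const queryEmbedding: number[] = [/* 384 floats from your embedding model */];

// kNN search against the dense_vector field.
const knnResults = await client.search({
  index: 'articles',
  knn: {
    field: 'content_embedding',
    query_vector: queryEmbedding,
    k: 10,
    num_candidates: 100,
  },
});

// script_score variant for custom relevance logic (cosineSimilarity shifted to stay positive).
const scripted = await client.search({
  index: 'articles',
  query: {
    script_score: {
      query: { match_all: {} },
      script: {
        source: "cosineSimilarity(params.query_vector, 'content_embedding') + 1.0",
        params: { query_vector: queryEmbedding },
      },
    },
  },
});
console.log(knnResults.hits.hits, scripted.hits.hits);
```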
Among the main benefits are transparency in the search, since it is possible to clearly understand why a document was considered relevant, and storage efficiency, since only tokens with a non-zero value are saved, unlike dense vectors that store all dimensions. Furthermore, sparse vectors are the ideal complement in hybrid search strategies, and can even be combined with dense vectors to pair lexical precision with semantic understanding. How to use the query for sparse vector type? The sparse_vector query allows you to search for documents based on a query vector in token/value format. See an example of the query below: If you prefer to use a trained model, it is possible to use an inference endpoint that automatically transforms the query text into a sparse vector: To explore this topic further, I suggest reading Understanding sparse vector embeddings with trained ML models. Semantic text type The semantic_text field type is the simplest and most straightforward way to use semantic search in Elasticsearch. It automatically handles embedding generation, both at indexing and query time, through an inference endpoint. This means you don’t have to worry about generating or storing vectors manually. When and why to use semantic text? The semantic_text field is ideal for those who want to get started with minimal technical effort and without having to handle vectors manually. This field automates steps like embedding generation and vector search mapping, making the setup faster and more convenient. You should consider using semantic_text when you value simplicity and abstraction, as it eliminates the complexity of manually configuring mappings, embedding generation, and ingestion pipelines. Just select the inference model, and Elasticsearch takes care of the rest. Key advantages include automatic embedding generation, performed during both indexing and querying, and ready-to-use mapping, which comes preconfigured to support the selected inference model. In addition, the field offers native support for automatic splitting of long texts (text chunking), allowing large texts to be divided into smaller passages, each with its own embedding, which improves search precision. This greatly boosts productivity, especially for teams that want to deliver value quickly without dealing with the underlying engineering of semantic search. However, while semantic_text provides speed and simplicity, this approach has some limitations. It allows the use of market-standard models, as long as they are available as inference endpoints in Elasticsearch. But it does not support externally generated embeddings, as is possible with the dense_vector field. If you need more control over how vectors are generated, want to use your own embeddings, or need to combine multiple fields for advanced strategies, the dense_vector and sparse_vector fields provide the flexibility required for more customized or domain-specific scenarios. How to use the query for semantic text type Before semantic_text, it was necessary to use a different query depending on the type of embedding (dense or sparse). A sparse_vector query was used for sparse fields, while dense_vector fields required KNN queries. With the semantic text type, the search is performed using the semantic query, which automatically generates the query vector and compares it with the embeddings of the indexed documents. 
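Since the query examples referenced above were not captured in this extract, here are hedged sketches of a sparse_vector query (first with explicit token weights, then letting an inference endpoint expand the query text) and of the semantic query. The index, field, and endpoint names are illustrative assumptions.

POST my-sparse-index/_search
{
  "query": {
    "sparse_vector": {
      "field": "content_tokens",
      "query_vector": { "wireless": 1.4, "headphones": 1.1, "bluetooth": 0.6 }
    }
  }
}

POST my-sparse-index/_search
{
  "query": {
    "sparse_vector": {
      "field": "content_tokens",
      "inference_id": "my-elser-endpoint",
      "query": "best wireless headphones"
    }
  }
}

POST my-semantic-index/_search
{
  "query": {
    "semantic": {
      "field": "content_semantic",
      "query": "best wireless headphones"
    }
  }
}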
The semantic_text type allows you to define an inference endpoint to embed the query, but if none is specified, the same endpoint used during indexing will be applied to the query. To learn more, I suggest reading the article Elasticsearch new semantic_text mapping: Simplifying semantic search . Conclusion When choosing how to map embeddings in Elasticsearch, it is essential to understand how you want to generate the vectors and what level of control you need over them. If you are looking for simplicity, the semantic text field enables automatic and scalable semantic search, making it ideal for many initial use cases. When more control, fine-tuned performance, or integration with custom models is required, the dense vector and sparse vector fields provide the necessary flexibility. The ideal field type depends on your use case, available infrastructure, and the maturity of your machine learning stack. Most importantly, Elastic offers the tools to build modern and highly adaptable search systems. References Semantic text field type Sparse vector field type Dense vector field type Semantic query Sparse vector query kNN search Elasticsearch new semantic_text mapping: Simplifying semantic search Understanding sparse vector embeddings with trained ML models Report an issue Related content Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Jump to Dense vector type When and why to use dense_vector type? How to use the query for dense vector type? Sparse vector type When and why to use sparse vector type? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/mapping-embeddings-to-elasticsearch-field-types", + "meta_description": "Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Improving information retrieval in the Elastic Stack This series explores steps to improve search relevance, benchmarking passage retrieval, ELSER, and hybrid retrieval. Part1 Generative AI July 13, 2023 Improving information retrieval in the Elastic Stack: Steps to improve search relevance In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval. GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey Part2 Generative AI July 13, 2023 Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models. GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey Part3 ML Research June 21, 2023 Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model Learn about the Elastic Learned Sparse Encoder (ELSER), its retrieval performance, architecture, and training process. TV QH By: Thomas Veasey and Quentin Herreros Part4 Generative AI July 20, 2023 Improving information retrieval in the Elastic Stack: Hybrid retrieval In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores. QH TV By: Quentin Herreros and Thomas Veasey Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/improving-information-retrieval-in-the-elastic-stack", + "meta_description": "This series explores steps to improve search relevance, benchmarking passage retrieval, ELSER, and hybrid retrieval.\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. Vector Database How To SF JG By: Sachin Frayne and Jessica Garson On April 23, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search provides the foundation when implementing semantic search for text or similarity search for images, videos, or audio. With vector search, the vectors are mathematical representations of data that can be huge and sometimes sluggish. Better Binary Quantization (hereafter referred to as BBQ) works as a compression method for vectors. It allows you to find the right matches while shrinking the vectors to make them faster to search and process. This article will cover BBQ and rescore_vector, a field only available for quantized indices that automatically rescores vectors. All complete queries and outputs mentioned in this article can be found in our Elasticsearch Labs code repository . Why would you implement this in your use case? Note: for an in-depth understanding of how the mathematics behind BBQ works, please check out the “Further learning” section below. For the purposes of this blog, the focus is on the implementation. While the mathematics is intriguing and it is crucial if you want to fully grasp why your vector searches remain precise. Ultimately, this is all about compression since it turns out that with the current vector search algorithms you are limited by the read speed of data. Therefore if you can fit all of that data into memory, you get a significant speed boost when compared to reading from storage ( memory is approximately 200x faster than SSDs ). There are a few things to keep in mind: Graph-based indices like HNSW (Hierarchical Navigable Small World) are the fastest for vector retrieval. HNSW: An approximate nearest neighbor search algorithm that builds a multi-layer graph structure to enable efficient high-dimensional similarity searches. HNSW is fundamentally limited in speed by data read speed from memory, or in the worst case, from storage. Ideally, you want to be able to load all of your stored vectors into memory. Embedding models generally produce vectors with float32 precision, 4 bytes per floating point number. And finally, depending on how many vectors and/or dimensions you have, you can very quickly run out of memory to keep all of your vectors in. Taking this for granted, you see that a problem arises quickly once you start ingesting millions or even billions of vectors, each with potentially hundreds or even thousands of dimensions. The section entitled “ Approximate numbers on the compression ratios ” provides some rough numbers. What do you need to get started? To get started, you will need the following: If you are using Elastic Cloud or on-prem, you will need a version of Elasticsearch higher than 8.18. 
While BBQ was introduced in 8.16, in this article, you will use vector_rescore , which was introduced in 8.18. Additionally, you will also need to ensure there is a machine learning (ML) node in your cluster. (Note: an ML node with a minimum of 4GB is needed to load the model, but you will likely need much larger nodes for full production workloads.) If you are using Serverless, you will need to select an instance that is optimized for vectors. You will also need a base level of knowledge regarding vector databases. If you aren’t already familiar with vector search concepts in Elastic, you may want to first check out the following resources: Navigating an Elastic Vector Database The big ideas behind retrieval augmented generation Implementation To keep this blog simple, you will use built-in functions when they are available. In this case, you have the .multilingual-e5-small vector embedding model that will run directly inside Elasticsearch on a machine learning node. Note that you can replace the text_embedding model with the embedder of your choosing ( OpenAI , Google AI Studio , Cohere and plenty more. If your preferred model is not yet integrated, you can also bring your own dense vector embeddings .) First, you will need to create an inference endpoint to generate vectors for a given piece of text. You will run all of these commands from the Kibana Dev Tools Console . This command will download the .multilingual-e5-small . If it does not already exist, then it will set up your endpoint; this may take a minute to run. You can see the expected output in the file 01-create-an-inference-endpoint-output.json in the Outputs folder. Once this has returned, your model will be set up and you can test that the model works as expected with the following command. You can see the expected output in the file 02-embed-text-output.json in the Outputs folder. If you run into issues around your trained model not being allocated to any nodes, you may need to start your model manually. Now let's create a new mapping with 2 properties, a standard text field ( my_field ) and a dense vector field ( my_vector ) with 384 dimensions to match the output from the embedding model. You will also override the index_options.type to bbq_hnsw . You can see the expected output in the file 03-create-byte-qauntized-index-output.json in the Outputs folder. To ensure Elasticsearch generates your vectors, you can make use of an Ingest Pipeline . This pipeline will require 3 things: the endpoint, ( model_id ), the input_field that you want to create vectors for and the output_field to store those vectors in. The first command below will create an inference ingest pipeline, which uses the inference service under the hood, and the second will test that the pipeline is working correctly. You can see the expected output in the file 04-create-and-simulate-ingest-pipeline-output.json in the Outputs folder. You are now ready to add some documents with the first 2 commands below and to test that your searches work with the 3rd command. You can check out the expected output in the file 05-bbq-index-output.json in the Outputs folder. As recommended in this post , rescoring and oversampling are advised when you scale to non-trivial amounts of data because they help maintain high recall accuracy while benefiting from the compression advantages. From Elasticsearch version 8.18, you can do it this way using rescore_vector . The expected output is in the file 06-bbq-search-8-18-output.json in the Outputs folder. 
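The step-by-step commands referenced above live in the article's companion repository; as an approximate sketch of the flow (inference endpoint, mapping with bbq_hnsw, ingest pipeline, and a search with rescore_vector), with illustrative names such as my-e5-endpoint, bbq-demo, and e5-embeddings standing in for whatever the repository actually uses:

PUT _inference/text_embedding/my-e5-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".multilingual-e5-small",
    "num_allocations": 1,
    "num_threads": 1
  }
}

PUT bbq-demo
{
  "mappings": {
    "properties": {
      "my_field": { "type": "text" },
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine",
        "index_options": { "type": "bbq_hnsw" }
      }
    }
  }
}

The pipeline generates the embedding from my_field at ingest time and writes it to my_vector:

PUT _ingest/pipeline/e5-embeddings
{
  "processors": [
    {
      "inference": {
        "model_id": "my-e5-endpoint",
        "input_output": {
          "input_field": "my_field",
          "output_field": "my_vector"
        }
      }
    }
  ]
}

A search that embeds the query text and oversamples before rescoring on the full-precision vectors might look like this:

POST bbq-demo/_search
{
  "knn": {
    "field": "my_vector",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "my-e5-endpoint",
        "model_text": "which articles discuss vector compression?"
      }
    },
    "k": 10,
    "num_candidates": 100,
    "rescore_vector": { "oversample": 3.0 }
  }
}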
How do these scores compare to those you would get for raw data? If you do everything above again but with index_options.type: hnsw, you will see that the scores are very comparable. You can see the expected output in the file 07-raw-vector-output.json in the Outputs folder. Approximate numbers on the compression ratios Storage and memory requirements can quickly become a significant challenge when working with vector search. The following breakdown illustrates how different quantization techniques dramatically reduce the memory footprint of vector data.
Vectors (V) | Dimensions (D) | raw (V x D x 4) | int8 (V x (D x 1 + 4)) | int4 (V x (D x 0.5 + 4)) | bbq (V x (D x 0.125 + 4))
10,000,000 | 384 | 14.31GB | 3.61GB | 1.83GB | 0.58GB
50,000,000 | 384 | 71.53GB | 18.07GB | 9.13GB | 2.89GB
100,000,000 | 384 | 143.05GB | 36.14GB | 18.25GB | 5.77GB
Conclusion BBQ is an optimization you can apply to your vector data for compression without sacrificing accuracy. It works by converting vectors into bits, allowing you to search the data effectively and empowering you to scale your AI workflows to accelerate searches and optimize data storage. Further learning If you are interested in learning more about BBQ, be sure to check out the following resources: Binary Quantization (BBQ) in Lucene and Elasticsearch Better Binary Quantization (BBQ) vs. Product Quantization Optimized Scalar Quantization: Even Better Binary Quantization Better Binary Quantization (BBQ): From Bytes to BBQ, The Secret to Better Vector Search by Ben Trent 
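Circling back to the compression table above, a quick worked check of the first row using the column formulas (assuming the GB figures are really GiB, i.e. 2^30 bytes; the byte counts below are my arithmetic, not from the original article):

raw:  10,000,000 x (384 x 4)     = 15,360,000,000 bytes ≈ 14.31 GiB
int8: 10,000,000 x (384 x 1 + 4) =  3,880,000,000 bytes ≈  3.61 GiB
int4: 10,000,000 x (384 x 0.5 + 4) = 1,960,000,000 bytes ≈ 1.83 GiB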
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to implement Better Binary Quantization (BBQ) into your use case and why you should - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/bbq-implementation-into-use-case", + "meta_description": "Learn how to implement Better Binary Quantization (BBQ) into your use case and why you should." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model Learn about the Elastic Learned Sparse Encoder (ELSER), its retrieval performance, architecture, and training process. ML Research TV QH By: Thomas Veasey and Quentin Herreros On June 21, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this blog, we discuss the work we've been doing to augment Elastic's out-of-the-box retrieval with a pre-trained language model: the Elastic Learned Sparse Encoder (ELSER). In our previous blog post in this series, we discussed some of the challenges applying dense models to retrieval in a zero-shot setting. This is well known and was highlighted by the BEIR benchmark, which assembled diverse retrieval tasks as a proxy to the performance one might expect from a model applied to an unseen data set. Good retrieval in a zero-shot setting is exactly what we want to achieve, namely a one-click experience that enables textual fields to be searched using a pre-trained model. This new capability fits into the Elasticsearch _search endpoint as just another query clause, a text_expansion query. This is attractive because it allows search engineers to continue to tune queries with all the tools Elasticsearch already provides. Furthermore, to truly achieve a one-click experience, we've integrated it with the new Elasticsearch Relevance Engine . However, rather than focus on the integration, this blog digs a little into ELSER's model architecture and the work we did to train it. We had another goal at the outset of this project. The natural language processing (NLP) field is fast moving, and new architectures and training methodologies are being introduced rapidly. While some of our users will keep on top of the latest developments and want full control over the models they deploy, others simply want to consume a high quality search product. By developing our own training pipeline, we have a playground for implementing and evaluating the latest ideas, such as new retrieval relevant pre-training tasks or more effective distillation tasks , and making the best ones available to our users. Finally, it is worth mentioning that we view this feature as complementary to the existing model deployment and vector search capabilities in the Elastic Stack, which are needed for those more custom use cases like cross-modal retrieval. ELSER performance results Before looking at some of the details of the architecture and how we trained our model, the Elastic Learned Sparse Encoder (ELSER) , it's interesting to review the results we get, as ultimately the proof of the pudding is in the eating. 
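Before the results, a brief sketch of the text_expansion query clause mentioned above, in its 8.8-era form; the index name, the rank-features field ml.tokens, and the query text are illustrative assumptions rather than the article's own example.

GET my-index/_search
{
  "query": {
    "text_expansion": {
      "ml.tokens": {
        "model_id": ".elser_model_1",
        "model_text": "how do avalanches form?"
      }
    }
  }
}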
As we discussed before , we use a subset of BEIR to evaluate our performance. While this is by no means perfect, and won't necessarily represent how the model behaves on your own data, we at least found it challenging to make significant improvements on this benchmark. So we feel confident that improvements we get on this translate to real improvements in the model. Since absolute performance numbers on benchmarks by themselves aren't particularly informative, it is nice to be able to compare with other strong baselines, which we do below. The table below shows the performance of Elastic Learned Sparse Encoder compared to Elasticsearch's BM25 with an English analyzer broken down by the 12 data sets we evaluated. We have 10 wins, 1 draw, and 1 loss and an average improvement in NDCG@10 of 17%. NDCG@10 for BEIR data sets for BM25 and Elastic Learned Sparse Encoder (referred to as “ELSER” above, note higher is better) In the following table, we compare our average performance to some other strong baselines. The Vespa results are based on a linear combination of BM25 and their implementation of ColBERT as reported here , the Instructor results are from this paper, the SPLADEv2 results are taken from this paper and the OpenAI results are reported here . Note that we've separated out the OpenAI results because they use a different subset of the BEIR suite. Specifically, they average over ArguAna, Climate FEVER, DBPedia, FEVER, FiQA, HotpotQA, NFCorpus, QuoraRetrieval, SciFact, TREC COVID and Touche. If you follow that link, you will notice they also report NDCG@10 expressed as a percentage. We refer the reader to the links above for more information on these approaches. Average NDCG@10 for BEIR data sets vs. various high quality baselines (higher is better). Note: OpenAI chose a different subset, and we report our results on this set separately. Finally, we note it has been widely observed that an ensemble of statistical (a la BM25) and model based retrieval, or hybrid search, tends to outperform either in a zero-shot setting. Already in 8.8, Elastic allows one to do this for text_expansion with linear boosting and this works well if you calibrate to your data set. We are also working on Reciprocal Rank Fusion (RRF), which performs well without calibration. Stay tuned for our next blog in this series, which will discuss hybrid search. Having seen how ELSER performs, we next discuss its architecture and some aspects of how it is trained. What are learned sparse models and why are they attractive? We showed in our previous blog post that, while very effective if fine-tuned, dense retrieval tends not to perform well in a zero-shot setting. By contrast cross-encoder architectures , which don't scale well for retrieval, tend to learn robust query and document representations and work well on most text. It has been suggested that part of the reason for this difference is the bottleneck of the query and document interacting only via a relatively low dimensional vector “dot product.” Based on this observation, a couple of model architectures have been recently proposed that try to reduce this bottleneck — these are ColBERT and SPLADE. From our perspective, SPLADE has some additional advantages: Compared to ColBERT, it is extremely storage efficient. Indeed, we find that document passages expand to about 100 tokens on average and we've seen approximate size parity with normal text indices. 
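As a sketch of the linear-boosting combination mentioned above — a bool query mixing BM25 and text_expansion, with boosts that would need calibrating to your own data set — the field names, model id, and boost values here are illustrative assumptions:

GET my-index/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "body": { "query": "how do avalanches form?", "boost": 1.0 }
          }
        },
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_1",
              "model_text": "how do avalanches form?",
              "boost": 4.0
            }
          }
        }
      ]
    }
  }
}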
With some caveats , retrieval can make use of inverted indices for which we already have very mature implementations in Lucene. Compared to ANN, these use memory extremely efficiently. It provides natural controls as it is being trained that allow us to trade retrieval quality for retrieval latency. In particular, the FLOPS regularizer, which we discuss below, allows one to add a term to the loss for the expected retrieval cost. We plan to take advantage of this as we move toward GA. One last clear advantage compared to dense retrieval is that SPLADE allows one a simple and compute efficient route to highlight words generating a match. This simplifies surfacing relevant passages in long documents and helps users better understand how the retriever is working. Taken together, we felt that these provided a compelling case for adopting the SPLADE architecture for our initial release of this feature. There are multiple good detailed descriptions of this architecture — if you are interested in diving in, this , for example, is a nice write up by the team that created the model. In very brief outline, the idea is rather than use a distributed representation, say averaging BERT token output embeddings, instead use the token logits, or log-odds the tokens are predicted, for masked word prediction. When language models are used to predict masked words, they achieve this by predicting a probability distribution over the tokens of their vocabulary. The BERT vocabulary, for WordPiece, contains many common real words such as cat, house, and so on. It also contains common word endings — things like ##ing (with the ## simply denoting it is a continuation). Since words can't be arbitrarily exchanged, relatively few tokens will be predicted for any given mask position. SPLADE takes as a starting point for its representation of a piece of text the tokens most strongly predicted by masking each word of that text. As noted, this is a naturally disentangled or sparse representation of that text. It is reasonable to think of these token probabilities for word prediction as roughly capturing contextual synonyms. This has led people to view learned sparse representations, such as SPLADE, as something close to automatic synonym expansion of text, and we see this in multiple online explanations of the model. In our view, this is at best an oversimplification and at worst misleading. SPLADE takes as the starting point for fine-tuning the maximum token logits for a piece of text, but it then trains on a relevance prediction task, which crucially accounts for the interaction between all shared tokens in a query and document. This process somewhat re-entangles the tokens, which start to behave more like components of a vector representation (albeit in a very high dimensional vector space). We explored this a little as we worked on this project. We saw as we tried removing low score and apparently unrelated tokens in the expansion post hoc that it reduced all quality metrics, including precision(!), in our benchmark suite. This would be explained if they were behaving more like a distributed vector representation, where zeroing individual components is clearly nonsensical. We also observed that we can simply remove large parts of BERT's vocabulary at random and still train highly effective models as the figure below illustrates. In this context, parts of the vocabulary must be being repurposed to account for the missing words. 
Margin MSE validation loss for student models with different vocabulary sizes Finally, we note that unlike say generative tasks where size really does matter a great deal, retrieval doesn't as clearly benefit from having huge models. We saw in the result section that this approach is able to achieve near state-of-the-art performance with only 100M parameters, as compared to hundreds of millions or even billions of parameters in some of the larger generative models. Typical search applications have fairly stringent requirements on query latency and throughput, so this is a real advantage. Exploring the training design space for ELSER In our first blog , we introduced some of the ideas around training dense retrieval models. In practice, this is a multi stage process and one typically picks up a model that has already been pre-trained. This pre-training task can be rather important for achieving the best possible results on specific downstream tasks. We don't discuss this further because to date this hasn't been our focus, but note in passing that like many current effective retrieval models, we start from a co-condenser pre-trained model . There are many potential avenues to explore when designing training pipelines. We explored quite a few, and suffice to say, we found making consistent improvements on our benchmark was challenging. Multiple ideas that looked promising on paper didn't provide compelling improvements. To avoid this blog becoming too long, we first give a quick overview of the key ingredients of the training task and focus on one novelty we introduced, which provided the most significant improvements. Independent of specific ingredients, we also made some qualitative and quantitative observations regarding the role of the FLOPS regularization, which we will discuss at the end. When training models for retrieval, there are two common paradigms: contrastive approaches and distillation approaches. We adopted the distillation approach because this was shown to be very effective for training SPLADE in this paper. The distillation approach is slightly different from the common paradigm, which informs the name, of shrinking a large model to a small, but almost as accurate, “copy.” Instead the idea is to distill the ranking information present in a cross-encoder architecture. This poses a small technical challenge: since the representation is different, it isn't immediately clear how one should mimic the behavior of the cross-encoder with the model being trained. The standard idea we used is to present both models with triplets of the form (query, relevant document, irrelevant document). The teacher model computes a score margin, namely s c o r e ( q u e r y , r e l e v a n t d o c u m e n t ) − s c o r e ( q u e r y i r r e l e v a n t d o c u m e n t ) score(query, relevant\\;document) - score(query\\;irrelevant\\;document) score ( q u ery , re l e v an t d oc u m e n t ) − score ( q u ery i rre l e v an t d oc u m e n t ) , and we train the student model to reproduce this score margin using MSE to penalize the errors it makes. Let's think a little about what this process does since it motivates the training detail we wish to discuss. If we recall that the interaction between a query and document using the SPLADE architecture is computed using the dot product between two sparse vectors, of non-negative weights for each token, then we can think about this operation as wanting to increase the similarity between the query and the higher scored document weight vectors. 
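Written out, the distillation objective described above can be sketched as a margin MSE over triplets (our notation; the blog does not spell out the exact loss): \\mathcal{L}(q, d^+, d^-) = \\left(\\left[s_S(q, d^+) - s_S(q, d^-)\\right] - \\left[s_T(q, d^+) - s_T(q, d^-)\\right]\\right)^2, where s_S is the student (SPLADE) score, s_T is the teacher (cross-encoder) score, and the loss is averaged over training triplets. The geometric intuition for what minimizing it does to the weight vectors follows.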
It is not 100% accurate, but not misleading, to think of this as something like “rotating” the query in the plane spanned by the two documents' weight vectors toward the more relevant one. Over many batches, this process gradually adjusts the weight vectors starting positions so the distances between queries and documents captures the relevance score provided by the teacher model. This leads to an observation regarding the feasibility of reproducing the teacher scores. In normal distillation, one knows that given enough capacity the student would be able to reduce the training loss to zero. This is not the case for cross-encoder distillation because the student scores are constrained by the properties of a metric space induced by the dot product on their weight vectors. The cross-encoder has no such constraint. It is quite possible that for particular training queries q 1 q_1 q 1 ​ and q 2 q_2 q 2 ​ and documents d 1 d_1 d 1 ​ and d 2 d_2 d 2 ​ we have to simultaneously arrange for q 1 q_1 q 1 ​ to be close to d 1 d_1 d 1 ​ and d 2 d_2 d 2 ​ , and q 2 q_2 q 2 ​ to be close to d 1 d_1 d 1 ​ but far from d 2 d_2 d 2 ​ . This is not necessarily possible, and since we penalize the MSE in the scores, one effect is an arbitrary reweighting of the training triplets associated with these queries and documents by the minimum margin we can achieve. One of the observations we had while working on training ELSER was the teacher was far from infallible. We initially observed this by manually investigating query-relevant document pairs that were assigned unusually low scores. In the process, we found objectively misscored query-document pairs. Aside from manual intervention in the scoring process, we also decided to explore introducing a better teacher. Following the literature, we were using MiniLM L-6 from the SBERT family for our initial teacher. While this shows strong performance in multiple settings, there are better teachers, based on their ranking quality. One example is a ranker based on a large generative model: monot5 3b . In the figure below, we compare the query-document score pair distribution of these two models. The monot5 3b distribution is clearly much less uniform, and we found when we tried to train our student model using its raw scores the performance saturated significantly below using MiniLM L-6 as our teacher. As before, we postulated that this was down to many important score differences in the peak around zero getting lost with training worried instead about unfixable problems related to the long lower tail. Monot5 3b and MiniLM L-6 score distributions on a matched scale for a random sample of query-document pairs from the NQ data set. Note: the X-axis does not show the actual scores returned by either of the models. It is clear that all rankers are of equivalent quality up to monotonic transforms of their scores. Specifically, it doesn't matter if we use s c o r e ( q u e r y , d o c u m e n t ) score(query, document) score ( q u ery , d oc u m e n t ) or f ( s c o r e ( q u e r y , d o c u m e n t ) ) f(score(query, document)) f ( score ( q u ery , d oc u m e n t )) provided f ( ⋅ ) f(\\cdot) f ( ⋅ ) is a monotonic increasing function; any ranking quality measure will be the same. However, not all such functions are equivalently effective teachers. We used this fact to smooth out the distribution of monot5 3b scores, and suddenly our student model trained and started to beat the previous best model. In the end, we used a weighted ensemble of our two teachers. 
Before closing out this section, we want to briefly mention the FLOPS regularizer. This is a key ingredient of the improved SPLADE v2 training process. It was proposed in this paper as a means of penalizing a metric directly related to the compute cost for retrieval from an inverted index. In particular, it encourages tokens that provide little information for ranking to be dropped from the query and document representations based on their impact on the cost for retrieving from an inverted index. We had three observations: Our first observation was that the great majority of tokens are actually dropped while the regularizer is still warming up. In our training recipe, the regularizer uses quadratic warm up for the first 50,000 batches. This means that in the first 10,000 batches, it is no more than 1/25th its terminal value, and indeed we see that the contribution to the loss from MSE in the score margin is orders of magnitude larger than the regularization loss at this point. However, during this period the number of query and document tokens per batch activated by our training data drops from around 4k and 14k on average to around 50 and 300, respectively. In fact, 99% of all token pruning happens in this phase and seems largely driven by removing tokens which actually hurt ranking performance. Our second observation was that we found it contributes to ELSER's generalization performance for retrieval. Both turning down the amount of regularization and substituting regularizers that induce more sparseness, such as the sum of absolute weight values, reduced average ranking performance across our benchmark. Our final observation was that larger batches and diverse batches both positively impacted retrieval quality; we tried by contrast query clustering with in-batch negatives. So why could this be, since it is primarily aimed at optimizing retrieval cost? The FLOPS regularizer is defined as follows: it first averages the weights for each token in the batch across all the queries and separately the documents it contains, it then sums the squares of these average weights. If we consider that the batch typically contains a diverse set of queries and documents, this acts like a penalty that encourages something analogous to stop word removal. Tokens that appear for many distinct queries and documents will dominate the loss, since the contribution from rarely activated tokens is divided by the square of the batch size. We postulate that this is actually helping the model to find better representations for retrieval. From this perspective, the fact that the regularizer term only gets to observe the token weights of queries and documents in the batch is undesirable. This is an area we'd like to revisit. Conclusion We have given a brief overview of the model, the Elastic Learned Sparse Encoder (ELSER), its rationale, and some aspects of the training process behind the feature we're releasing in a technical preview for the new text_expansion query and integrating with the new Elasticsearch Relevance Engine . To date, we have focused on retrieval quality in a zero-shot setting and demonstrated good results against a variety of strong baselines. As we move toward GA, we plan to do more work on operationalizing this model and in particular around improving inference and retrieval performance. Stay tuned for the next blog post in this series, where we'll look at combining various retrieval methods using hybrid retrieval as we continue to explore exciting new retrieval methods using Elasticsearch. 
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to ELSER performance results What are learned sparse models and why are they attractive? Exploring the training design space for ELSER Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-learned-sparse-encoder-elser-retrieval-performance", + "meta_description": "using sparse vectors in Elasticsearch, with a foundation model based on SPLADE concepts" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. ML Research TV By: Thomas Veasey On December 19, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. At Elastic we continue to work on innovations that improve the cost and performance of our vector indices. In this post we’re going to discuss in detail the implementation and intuition for a new version of scalar quantization we’ve been working on which we’re calling optimized scalar quantization or OSQ. This provides a further enhancement to our BBQ index format . Indeed, we set new state-of-the-art performance for 32 × \\times × compression. It also allows us to get very significant improvements in accuracy compared to naive scalar quantization approaches while retaining similar index compression and query performance. We plan to roll out 2, 4 and 7 bit quantization schemes and indeed unify all our index compression options to use this same underlying technique. Furthermore, we anticipate that with 7 bit quantization we should be able to discard the raw floating point vectors and plan to evaluate this thoroughly. Measuring success For any compression scheme we need to worry about its impact on the query behavior. For nearest neighbor queries there is a metric that much prior art focuses on. This is the recall at k k k . Specifically, one measures the fraction of true nearest neighbor vectors that a query returns. In fact, we already exploit the relaxation that we don't find the true nearest neighbors, even when we search with uncompressed vectors in Lucene. Linear complexity is unavoidable if one wants to find the true nearest neighbors in a high dimensional vector space. The data structure that Lucene uses to index vector data is an HNSW graph , which allows it to run approximate nearest neighbor queries significantly faster than a linear scan. In this blog we study a metric that we’ll abbreviate as recall@ k ∣ n k|n k ∣ n , which is the recall at k k k retrieving n ≥ k n\\geq k n ≥ k candidates and reranking them using the uncompressed vector distance calculation. Note that if n = k n=k n = k then it's simply equal to the recall at k k k . We also look at the quality of the approximation of the uncompressed vector distances. We’ll discuss all this in more detail when we discuss our results. Motivation Scalar quantization, as we’ve discussed before , is an approach which allows one to represent floating point vectors by integer vectors. There are two reasons to do this: It allows you to reduce the amount of memory needed to represent a vector, by using fewer than 32 bits per integer, and It allows you to accelerate the distance calculation between vectors, which is the bottleneck for performing nearest neighbor queries in vector databases. 
Recently, a paper, which we’ve discussed before, proposed an approach called RaBitQ that is able to achieve good recall using only 1 bit per component. This is exciting because 32x compression is competitive with Product Quantization (PQ), which was the previous state-of-the-art approach when compression was paramount. A key advantage of RaBitQ is that it’s relatively straightforward to accelerate distance calculations with SIMD operations, certainly when it is compared to PQ, which uses lookups to compute distance and for which it is much harder to exploit hardware parallelism. The authors performed extensive experiments and showed they were able to achieve consistently higher recall as a function of query latency than PQ using the same compression ratio with an IVF index. Although the RaBitQ approach is conceptually rather different to scalar quantization, we were inspired to re-evaluate whether similar performance could be unlocked for scalar quantization. In our companion piece we will discuss the result of integrating OSQ with HNSW and specifically how it compares to the baseline BBQ quantization scheme, which is inspired by RaBitQ. As an incentive to keep reading this blog we note that we were able to achieve systematically higher recall in this setting, sometimes by as much as 30%. Another advantage of OSQ is it immediately generalizes to using n bits per component. For example, we show below that we’re able to achieve excellent recall in all cases we tested using minimal or even no reranking with only a few bits per component. This post is somewhat involved. We will take you through step-by-step the innovations we’ve introduced and explain some of the intuition at each step. Dot products are enough In the following we discuss only the dot product, which would translate to a MIP query in Elasticsearch. In fact, this is sufficient because well known reductions exist to convert the other metrics Elasticsearch supports, Euclidean and cosine, to computing a dot product. For the Euclidean distance we note that \\|y-x\\|^2 = \\|x\\|^2 + \\|y\\|^2 - 2x^ty and also that \\|x\\| and \\|y\\| can be precomputed and cached, so we need only compute the dot product y^tx in order to compare two vectors. For cosine we simply need to normalize vectors and then \\frac{y^t x}{\\|y\\|\\|x\\|} = y^t x. Scalar quantization refresher Scalar quantization is typically defined by the following componentwise equation \\hat{x}_i = \\text{round}\\left(\\frac{2^n-1}{b-a} \\left(\\text{clamp}(x_i, a, b) - a\\right)\\right). Here, \\text{clamp}(\\cdot, a, b) = \\max(\\min(\\cdot, b), a). The quantized vector \\hat{x} is integer valued with values less than or equal to 2^n-1, that is we’re using n bits to represent each component. The interval [a,b] is called the quantization interval. We also define the quantized approximation of the original floating point vector, which is our best estimate of the floating point vector after we’ve applied quantization. We never directly work with this quantity, but it is convenient to describe the overall approach. 
This is defined as follows \\bar{x} = a 1 + \\frac{b-a}{2^n-1} \\hat{x}. Here, 1 denotes a vector whose components are all equal to one. Note that we abuse notation here and below by using 1 to denote both a vector and a scalar and we rely on the context to make the meaning clear. The need for speed A key requirement we identified for scalar quantization is that we can perform distance comparisons directly on integer vectors. Integer arithmetic operations are more compute efficient than floating point ones. Furthermore, higher throughput specialized hardware instructions exist for performing them in parallel. We’ve discussed how to achieve this in the context of scalar quantization before. Below we show that as long as we cache a couple of corrective factors we can use a different quantization interval and number of bits for each vector we quantize and still compute the dot product using integer vectors. This is the initial step towards achieving OSQ. First of all observe that our best estimate of the dot product between two quantized vectors is y^t x = \\left(a_y 1 + \\frac{b_y-a_y}{2^{n_y} - 1}\\hat{y}\\right)^t \\left(a_x 1 + \\frac{b_x-a_x}{2^{n_x} - 1}\\hat{x}\\right). Expanding we obtain y^t x = a_y a_x 1^t 1 + \\frac{b_y-a_y}{2^{n_y} - 1} a_x 1^t \\hat{y} + \\frac{b_x-a_x}{2^{n_x} - 1} a_y 1^t \\hat{x} + \\frac{b_y-a_y}{2^{n_y} - 1} \\frac{b_x-a_x}{2^{n_x} - 1} \\hat{y}^t \\hat{x}. Focusing on the vector dot products, since these dominate the compute cost, we observe that 1^t 1 is just equal to the vector dimension, and 1^t\\hat{y} and 1^t\\hat{x} are the sums of the quantized query and document vector components, respectively. For the query this can be computed once upfront and for the documents these can be computed at index time and stored with the quantized vector. Therefore, we need only compute the integer vector dot product \\hat{y}^t\\hat{x} per comparison. The geometry of scalar quantization To build a bit more intuition about OSQ we digress to understand more about how scalar quantization represents a vector. Observe that the set of all possible quantized vectors lie inside a cube centered on the point \\frac{a+b}{2}1 with side length b-a. If only 1 bit is being used then the possible vectors lie at the 2^d corners of the cube. Otherwise, the possible vectors lie on a regular grid with 2^{nd} points. By changing a and b, we are only able to expand or contract the cube or slide it along the line spanned by the 1 vector. In particular, suppose the vectors in the corpus are centered around some point m. 
We can decompose this into a component parallel to the 1 1 1 vector and a component perpendicular to it as follows m = 1 1 t d m + ( I − 1 1 t d ) m m = \\frac{1 1^t}{d} m + \\left(I - \\frac{1 1^t}{d}\\right) m m = d 1 1 t ​ m + ( I − d 1 1 t ​ ) m If we center the cube on the point at 1 t m d 1 \\frac{1^t m}{d}1 d 1 t m ​ 1 then it must still expand it to encompass the offset m − 1 t m d 1 m-\\frac{1^t m}{d}1 m − d 1 t m ​ 1 before it covers even the center of the data distribution. This is illustrated in the figure below for 2 dimensions. Since the quantization errors will be proportional on average to the side length of the cube, this suggests that we want to minimize ∥ m − 1 t m d 1 ∥ \\| m-\\frac{1^t m}{d}1 \\| ∥ m − d 1 t m ​ 1∥ . An easy way to do this is to center the query and document vectors before quantizing. We show below that if we do this we can still recover the dot product between the original vectors. Note that y t x = ( y − m + m ) t ( x − m + m ) = ( y − m ) t ( x − m ) + m t y + m t x − m t m \\begin{align*} y^t x &= (y - m + m)^t (x - m + m) \\\\ &= (y - m)^t (x - m) + m^t y + m^t x - m^t m \\end{align*} y t x ​ = ( y − m + m ) t ( x − m + m ) = ( y − m ) t ( x − m ) + m t y + m t x − m t m ​ The values m t y m^t y m t y and m t x m^t x m t x , which are the inner product between the query and document vector and the centroid of the vectors in the corpus, can be precomputed and in the case of the document vector cached with its quantized representation. The quantity m t m m^t m m t m is a global constant. This means we only need to estimate ( y − m ) t ( x − m ) (y−m)^t(x−m) ( y − m ) t ( x − m ) when comparing a query and document vector. In other words, we can quantize the centered vectors and recover an estimate of the actual dot product. The distribution of centered vectors So far we’ve shown that we can use a different bit count and a different quantization interval per vector. We next show how to exploit this to significantly improve the accuracy of scalar quantization. We propose an effective criterion and procedure for optimizing the choice of the constants a a a and b b b . However, before discussing this it is useful to see some examples of the component distribution in real centered embedding vectors. We observe that the values are all fairly normally distributed and will use this observation to choose our initial quantization interval. Looking at these plots one might guess it would be beneficial to scale components. Specifically, it seems natural to standardize the component distributions of the Cohere v2 and the gte-base-en-v1.5 embeddings. In general, this would amount to applying a scaling matrix to the document vectors before quantization as follows Diag ( σ ) − 1 ( x − m ) \\text{Diag}(\\sigma)^{-1} (x - m) Diag ( σ ) − 1 ( x − m ) Here, Diag ( σ ) \\text{Diag}(\\sigma) Diag ( σ ) is a diagonal matrix whose diagonal entries are the standard deviations of the components for the corpus as a whole. We can apply this operation and still compute the dot product efficiently because we simply need to apply the inverse transformation to the query vector before quantizing it ( y − m ) t ( x − m ) = ( Diag ( σ ) ( y − m ) ) t ( Diag ( σ ) − 1 ( x − m ) ) (y - m)^t (x - m) = \\left(\\text{Diag}(\\sigma) (y - m)\\right)^t \\left(\\text{Diag}(\\sigma)^{-1} (x - m)\\right) ( y − m ) t ( x − m ) = ( Diag ( σ ) ( y − m ) ) t ( Diag ( σ ) − 1 ( x − m ) ) The effect is not symmetric because, as we’ll discuss below, we use a higher bit count for the query. 
Referring back to our geometric picture, standardizing in this way would amount to stretching the different edges of the cube. We tried this but found it didn’t measurably improve effectiveness once we optimize the interval directly for each vector, so we avoid the extra complexity. Initializing the quantization interval Let’s consider a natural criterion for setting a global quantization interval for normally distributed data. If we pick a vector, X, at random from the corpus then the quantization error is given by \\| X - x(a,b)\\|^2 = \\sum_{i=1}^d \\left(X_i - \\left(a+\\frac{b-a}{2^n-1}\\hat{x}_i\\right)\\right)^2. By assumption, and as we showed empirically is often the case, each component of X is normally distributed, that is X_i \\sim N(0,\\sigma_i). In such a scenario it is reasonable to minimize the expected square error. Specifically, we seek a^*,b^* = \\arg\\min_{a,b} \\mathbb{E}_{X} \\left[ \\sum_{i=1}^d \\left(X_i - \\left(a+\\frac{b-a}{2^n-1}\\hat{x}_i\\right)\\right)^2 \\right]. Since the expectation is a linear operator it distributes over the summation and we can focus on a single term. Without loss of generality we can assume that X_i is a unit normal since we can always rescale the interval by the data standard deviation. To compute this expectation we make use of the following quantity I(x,c) = \\frac{1}{\\sqrt{2\\pi}} \\int^x (t-c)^2 e^{-t^2 / 2} dt = \\frac{1}{2}\\left(c^2+1\\right) \\text{erf}\\left(\\frac{x}{\\sqrt{2}}\\right) + \\frac{1}{\\sqrt{2\\pi}} e^{-x^2/2} (2c-x) + \\text{constant}. This is the expectation of the square error when rounding a normally distributed value to a fixed point c, expressed as an indefinite integral, that is, before we’ve determined the range the value can take. In order to minimize the expected quantization error we should snap floating point values to their nearest grid point. This means we can express the expected square quantization error as follows Error_e\\left(a,b \\mid n\\right) = I(a+\\frac{s}{2}, a) - I(-\\infty, a) + \\sum_{i=1}^{2^n-2} \\left[ I\\left(a+\\frac{2i+1}{2}s, a+is\\right) - I\\left(a+\\frac{2i-1}{2}s, a+is\\right) \\right] + I(\\infty, b) - I\\left(a+\\frac{2^{n+1}-3}{2}s, b\\right), where we defined s=\\frac{b-a}{2^n-1}. The integration limits are determined by the condition that we snap to the nearest grid point.
We now have a function, in terms of the interval endpoints a and b, for the expected square quantization error using a reasonable assumption about the vector components’ distribution. It is relatively straightforward to show that this quantity is minimized by an interval centered on the origin. This means that we need to search for the value of a single variable z which minimizes Error_e\\left(-z,z \\mid n\\right). The figure below shows the error as a function of z for various bit counts. Optimizing this function numerically for various choices of n gives the following quantization intervals: [a_{(1)}, b_{(1)}] = [-0.798, 0.798], [a_{(2)}, b_{(2)}] = [-1.493, 1.493], [a_{(3)}, b_{(3)}] = [-2.051, 2.051], [a_{(4)}, b_{(4)}] = [-2.514, 2.514], [a_{(7)}, b_{(7)}] = [-3.611, 3.611]. We’re denoting the interval for n bits [a_{(n)}, b_{(n)}]. Finally, we need to map these fixed intervals to the specific interval to use for a vector x. To do this we shift by the mean of its components m_x and scale by their standard deviation \\sigma_x. It’s clear that we should always choose a \\geq x_{min} and b \\leq x_{max}. Therefore, our initial estimate for the quantization interval for a vector x using n bits per component is \\left[\\max\\left(m_x+a_{(n)}\\sigma_x, x_{min}\\right), \\min\\left(m_x+b_{(n)}\\sigma_x, x_{max}\\right)\\right].
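As a sanity check, the per-bit intervals quoted above can be reproduced numerically from the stated criterion. Below is a small sketch (assuming NumPy and SciPy are available; this is not the code used in Lucene) that evaluates the expected square error for a unit normal and a symmetric interval [-z, z], then minimizes over z:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

def expected_sq_error(z, n_bits):
    # E[(X - q(X))^2] for X ~ N(0, 1), where q snaps X to the nearest of the
    # 2^n_bits uniformly spaced grid points covering [-z, z].
    grid = np.linspace(-z, z, 2 ** n_bits)

    def integrand(x):
        q = grid[np.argmin(np.abs(grid - x))]
        return (x - q) ** 2 * np.exp(-x * x / 2) / np.sqrt(2 * np.pi)

    # The normal tails beyond +/-8 contribute nothing numerically.
    return quad(integrand, -8.0, 8.0, limit=500)[0]

for n in (1, 2, 3, 4, 7):
    res = minimize_scalar(expected_sq_error, bounds=(0.1, 6.0), args=(n,), method='bounded')
    # Should land close to the intervals quoted in the text, e.g. ~0.798 for 1 bit.
    print(f'n={n}: optimal interval ~ [-{res.x:.3f}, {res.x:.3f}]')
```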
Refining the quantization interval The initialization scheme actually works surprisingly well. We present the results of quantizing using this approach alone as one of the ablation studies when we examine the performance of OSQ. However, we can do better. It has been noted in the context of PQ that targeting minimum square quantization error is not actually the best criterion when you care about recall. In particular, you know you’re going to be running nearest neighbor queries on the corpus and the nearest neighbors of a query are very likely to be fairly parallel to the query vector. Let’s consider what this means. Suppose we have a query vector y_{||} for which the document is relevant. Ignoring the quantization of the query vector, we can decompose the square of the error in the dot product into a component that is parallel to and a component that is perpendicular to the document vector as follows \\left(y_{||}^t (x - \\bar{x})\\right)^2 = \\left(\\left(\\frac{x x^t}{\\|x\\|^2}y_{||}\\right)^t \\frac{x x^t}{\\|x\\|^2}(x - \\bar{x}) + \\left(\\left(I - \\frac{x x^t}{\\|x\\|^2}\\right) y_{||}\\right)^t \\left(I - \\frac{x x^t}{\\|x\\|^2}\\right) (x - \\bar{x}) \\right)^2. Now we expect \\left\\|\\left(I-\\frac{x x^t}{\\|x\\|^2}\\right) y_{||}\\right\\| \\ll \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\|. Furthermore, we would like to minimize the error in the dot product in this specific case, since this is when the document is relevant and our query should return it. Let \\left\\|\\left(I-\\frac{x x^t}{\\|x\\|^2}\\right) y_{||}\\right\\| = \\sqrt{\\lambda} \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\| for some \\lambda \\ll 1. We can bound our quantization error in the dot product as follows \\left(y_{||}^t (x - \\bar{x})\\right)^2 \\leq \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\|^2 \\left\\|\\frac{x x^t}{\\|x\\|^2}(x - \\bar{x})+\\sqrt{\\lambda} \\left( I - \\frac{x x^t}{\\|x\\|^2}\\right)(x - \\bar{x}) \\right\\|^2. Whatever way we quantize we can’t affect the quantity \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\|, so all we care about is minimizing the second factor. A little linear algebra shows that this is equal to Error\\left(a, b \\mid \\lambda \\right) = \\left(\\bar{x}(a,b) - x\\right)^t \\left( \\frac{x x^t}{\\|x\\|^2} + \\lambda \\left(I-\\frac{x x^t}{\\|x\\|^2}\\right)\\right) \\left(\\bar{x}(a,b) - x\\right). Here, we’ve made the dependence of this expression on the quantization interval explicit. So far we’ve proposed a natural quantity to minimize in order to reduce the impact of quantization on MIP recall. We now turn our attention to how to efficiently minimize this quantity with respect to the interval [a,b]. There is a complication: the assignment of components of the vector to grid points depends on a and b, while the optimal choice for a and b depends on this assignment.
We use a coordinate descent approach, which alternates between computing the quantized vector \\hat{x} while holding a and b fixed, and optimizing the quantization interval while holding \\hat{x} fixed. This is described schematically as follows.
1. a_0, b_0 \\leftarrow \\max\\left(m_x+a_{(n)}\\sigma_x, x_{min}\\right), \\min\\left(m_x+b_{(n)}\\sigma_x, x_{max}\\right)
2. for k \\in \\{1,2,...,rounds\\} do
3.     compute \\hat{x}_{(k)} from a_{k-1} and b_{k-1}
4.     a_k, b_k \\leftarrow \\arg\\min_{a,b} Error\\left(a,b \\mid \\lambda, \\hat{x}_{(k)} \\right)
5.     if Error\\left(a_k,b_k \\mid \\lambda, \\hat{x}_{(k)} \\right) > Error\\left(a_{k-1},b_{k-1} \\mid \\lambda, \\hat{x}_{(k-1)} \\right) then break
First we’ll focus on line 3. The simplest approach to compute \\hat{x} uses standard scalar quantization \\hat{x}_{(k), i} = \\text{round}\\left(\\frac{2^n-1}{b_{k-1}-a_{k-1}} \\left(\\text{clamp}(x_i, a_{k-1}, b_{k-1}) - a_{k-1}\\right)\\right). Specifically, this would amount to snapping each component of x to the nearest grid point. In practice this does not minimize Error\\left(a,b \\mid \\lambda\\right), as we illustrate below. Unfortunately, we can’t just enumerate grid points and find the minimum error one since there are 2^{nd} candidates; therefore, we tried the following heuristic: snap to the nearest grid point, then coordinate-wise check if rounding in the other direction reduces the error. This isn’t guaranteed to find the global optimum, which in fact isn’t guaranteed to be one of the corners of the grid square containing the floating point vector. However, in practice we found it meant the error almost never increased in an iteration of the loop over k. By contrast, when snapping to the nearest grid point the loop frequently exits due to this condition. The heuristic yields a small but systematic improvement in the brute force recall: on average it amounted to +0.3% compared to just snapping to the nearest grid point. Given the impact is so small, we decided it wasn’t worth the extra complexity and increased runtime. We now turn our attention to line 4.
This expression decomposes as follows Error\\left(a,b \\mid \\lambda, \\hat{x} \\right) = \\frac{1-\\lambda}{\\|x\\|^2} \\left(x^t(\\bar{x}(a,b)-x)\\right)^2 + \\lambda \\left(\\bar{x}(a,b)-x\\right)^t \\left(\\bar{x}(a,b)-x\\right). It’s fairly easy to see that this is a convex quadratic form in a and b. This means it has a unique minimum where the partial derivatives w.r.t. a and b vanish. We won’t show the full calculation but give a flavor. For example, we can use the chain rule to help evaluate the first term \\frac{\\partial}{\\partial a} \\left(x^t(\\bar{x}(a,b)-x)\\right)^2 = 2 x^t(\\bar{x}(a,b)-x) \\frac{\\partial x^t\\bar{x}(a,b)}{\\partial a}, then \\frac{\\partial x^t\\bar{x}(a,b)}{\\partial a} = \\frac{\\partial}{\\partial a} \\sum_{i=1}^d x_i \\left(a + \\frac{b-a}{2^n-1}\\hat{x}_i\\right) = \\sum_{i=1}^d x_i \\left(1-\\frac{1}{2^n-1}\\hat{x}_i\\right) = \\sum_{i=1}^d x_i (1-s_i), where we’ve defined s_i = \\frac{1}{2^n-1}\\hat{x}_i. The final result is that the optimal interval satisfies \\left[ \\begin{matrix} \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i(1-s_i)\\right)^2+\\lambda \\sum_i (1-s_i)^2 & \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i(1-s_i)\\right)\\sum_i x_i s_i+\\lambda \\sum_i(1-s_i)s_i \\\\ \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i(1-s_i)\\right)\\sum_i x_i s_i+\\lambda \\sum_i(1-s_i)s_i & \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i s_i\\right)^2 + \\lambda \\sum_i s_i^2 \\end{matrix} \\right] \\left[\\begin{matrix}a \\\\ b\\end{matrix}\\right]= \\left[\\begin{matrix}\\sum_i x_i(1-s_i) \\\\ \\sum_i x_i s_i \\end{matrix}\\right], which is trivial to solve for a and b. Taken together with the preprocessing and interval initialization steps this defines OSQ.
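The refinement loop and the closed-form solve on line 4 fit in a few lines of NumPy. The sketch below illustrates the algorithm as described; it is not the Lucene implementation. The initialization constants and the default λ = 0.1 are taken from the text, and the input vector is assumed to be centered already.

```python
import numpy as np

# Per-bit symmetric intervals for a unit normal, from the initialization section.
INIT_INTERVAL = {1: 0.798, 2: 1.493, 3: 2.051, 4: 2.514, 7: 3.611}

def refine_interval(x, n_bits=1, lam=0.1, rounds=5):
    # Coordinate descent on the per-vector interval [a, b].
    levels = 2 ** n_bits - 1
    norm2 = float(x @ x)
    z = INIT_INTERVAL[n_bits]
    a = max(x.mean() - z * x.std(), x.min())
    b = min(x.mean() + z * x.std(), x.max())

    def error(a, b, s):
        e = a * (1 - s) + b * s - x  # bar{x}(a, b) - x, since bar{x}_i = a(1 - s_i) + b s_i
        return (1 - lam) / norm2 * float(x @ e) ** 2 + lam * float(e @ e)

    prev = np.inf
    for _ in range(rounds):
        # Line 3: snap each component to the nearest grid point of the current interval.
        codes = np.round((np.clip(x, a, b) - a) / (b - a) * levels)
        s = codes / levels
        # Line 4: closed-form minimizer of Error(a, b | lambda, codes) via the 2x2 system.
        p, q = float(x @ (1 - s)), float(x @ s)
        m11 = (1 - lam) / norm2 * p * p + lam * float((1 - s) @ (1 - s))
        m12 = (1 - lam) / norm2 * p * q + lam * float((1 - s) @ s)
        m22 = (1 - lam) / norm2 * q * q + lam * float(s @ s)
        a, b = np.linalg.solve([[m11, m12], [m12, m22]], [p, q])
        # Line 5: stop once the refined interval no longer reduces the error.
        cur = error(a, b, s)
        if cur > prev:
            break
        prev = cur
    return float(a), float(b), codes

a, b, codes = refine_interval(np.random.default_rng(0).normal(size=768), n_bits=1)
print(a, b)
```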
The query and document vectors are different We noted already that each vector could choose its bit count. Whilst there are certain technical disadvantages to using different bit counts for different document vectors, the query and the document vectors are different. In particular, the compression factor the quantization scheme yields only depends on the number of bits used to represent the document vectors. There are side effects from using more bits to represent the query, principally that it can affect the dot product performance. Also, there’s a limit to what one can gain given the document quantization error. However, we get large recall gains by using asymmetric quantization for high compression factors, and this translates to significant net wins in terms of recall as a function of query latency. Therefore, we always quantize the query using at least 4 bits. How does it perform? In this section we compare the brute force recall@k|n and the correlation between the estimated and actual dot products for OSQ and our baseline BBQ quantization scheme, which was an adaptation of RaBitQ to use with an HNSW index. We have previously evaluated this baseline scheme and its accuracy is commensurate with RaBitQ. The authors of RaBitQ did extensive comparisons with alternative methods showing its superiority; we therefore consider it sufficient to simply compare to this very strong baseline. We also perform a couple of ablation studies: against a global interval and against the per vector intervals calculated by our initialization scheme. To compute a global interval we use the OSQ initialization strategy, but compute the mean and variance of the corpus as a whole. For low bit counts, this is significantly better than the usual fixed interval quantization scheme, which tends to use something like the 99th percentile centered confidence interval. High confidence intervals in such cases often completely fail because the grid points are far from the majority of the data. We evaluate against a variety of embeddings (e5-small-v2, arctic-embed-m, gte-base-en-v1.5, Cohere v2, and GIST descriptors) and datasets (Quora, FiQA, a sample of around 1M passages from the English portion of wikipedia-22-12, and GIST1M). The embedding dimensions vary from 384 to 960. E5, Arctic and GTE use cosine similarity; Cohere and GIST use MIP. The datasets vary from around 100k to 1M vectors. In all our experiments we set \\lambda=0.1, which we found to be an effective choice. First off we study 1 bit quantization. We report brute force recall in these experiments to take any effects of the indexing choice out of the picture. As such we do not compare recall vs latency curves, which are very strongly affected by both the indexing data structure and the dot product implementation. Instead we focus on the recall at 10 reranking the top n hits. In our next blog we’ll study how OSQ behaves when it is integrated with Lucene's HNSW index implementation and turn our attention to query latency. The figures below show our brute force recall@10|n for n \\in \\{10,20,30,40,50\\}. Rolling these up into the average recall@10|n for the 8 retrieval experiments we tested, we get the table below (average recall@10|n by method).
| n | Baseline | OSQ | Global interval | Initialization only |
|---|---|---|---|---|
| 10 | 0.71 | 0.74 | 0.54 | 0.65 |
| 20 | 0.88 | 0.90 | 0.69 | 0.81 |
| 30 | 0.93 | 0.94 | 0.76 | 0.86 |
| 40 | 0.95 | 0.96 | 0.80 | 0.89 |
| 50 | 0.96 | 0.97 | 0.82 | 0.90 |
Compared to the baseline we gain 2% on average in recall. As for the ablation study, compared to using a global quantization interval we gain 26%, and compared to our initial per vector quantization intervals we gain 10%.
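For readers who want to reproduce this kind of number, recall@k|n can be computed in a few lines: retrieve the top n candidates by the quantized scores, rerank them with the exact float scores, and keep track of how many of the true top k survive. This is a generic sketch of the metric with synthetic scores, not the evaluation harness used for these experiments.

```python
import numpy as np

def recall_at_k_given_n(exact_scores, approx_scores, k=10, n=50):
    # Fraction of the true top-k documents recovered after reranking the top-n approximate hits.
    true_top_k = np.argsort(-exact_scores)[:k]
    candidates = np.argsort(-approx_scores)[:n]                        # first pass: quantized scores
    reranked = candidates[np.argsort(-exact_scores[candidates])][:k]   # rerank candidates exactly
    return len(set(true_top_k) & set(reranked)) / k

# Toy example: exact float scores vs. noisy quantized scores for 10,000 documents.
rng = np.random.default_rng(0)
exact = rng.normal(size=10_000)
approx = exact + rng.normal(scale=0.1, size=10_000)
print(recall_at_k_given_n(exact, approx, k=10, n=50))
```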
The figure below shows the floating point dot product values versus the corresponding 1 bit quantized dot product values for a sample of 2000 gte-base-en-v1.5 embeddings. Visually the correlation is high. We can quantify this by computing the R^2 between the floating point and quantized dot products. For each dataset and model combination we computed the average R^2 for every query against the full corpus. We see a small but systematic improvement in R^2 comparing OSQ to the baseline. The table below shows the R^2 values broken down by the dataset and model combinations we tested.
| Dataset | Model | Baseline R^2 | OSQ R^2 |
|---|---|---|---|
| FiQA | e5-small | 0.849 | 0.865 |
| FiQA | arctic | 0.850 | 0.863 |
| FiQA | gte | 0.925 | 0.930 |
| Quora | e5-small | 0.868 | 0.881 |
| Quora | arctic | 0.817 | 0.838 |
| Quora | gte | 0.887 | 0.897 |
| Wiki | Cohere v2 | 0.884 | 0.898 |
| GIST1M | - | 0.953 | 0.974 |
Interestingly, when we integrated OSQ with HNSW we got substantially larger improvements in recall than we see for brute force search, as we’ll show in our next blog. One hypothesis we have is that the improvements we see in correlation with the true floating point dot products are more beneficial for graph search than brute force search. For many queries the tail of high scoring documents is well separated from the bulk of the score distribution and is less prone to being reordered by quantization errors. By contrast, we have to navigate through regions of low scoring documents as we traverse the HNSW graph. Here any gains in accuracy can be important. Finally, the table below compares the average recall for 1 and 2 bit OSQ for the same 8 retrieval experiments. With 2 bits we reach 95% recall reranking between 10 and 20 candidates. The average R^2 rises from 0.893 for 1 bit to 0.968 for 2 bit.
| n | 1 bit OSQ average recall@10 | 2 bit OSQ average recall@10 |
|---|---|---|
| 10 | 0.74 | 0.84 |
| 20 | 0.90 | 0.97 |
| 30 | 0.94 | 0.99 |
| 40 | 0.96 | 0.995 |
| 50 | 0.97 | 0.997 |
Conclusion We’ve presented an improved automatic scalar quantization scheme which allows us to achieve high recall with relatively modest reranking depth. Avoiding deep reranking has significant system advantages. For 1 bit quantization we compared it to a very strong baseline and showed it was able to achieve systematic improvements in both recall and the accuracy with which it approximates the floating point vector distance calculation. Therefore, we feel comfortable saying that it sets new state-of-the-art performance at 32\\times compression of the raw vectors. It also allows one to simply trade compression for retrieval quality using the same underlying approach and achieves significant performance improvements compared to standard scalar quantization techniques. We are working on integrating this new approach into Elasticsearch. In our next blog we will discuss how it is able to enhance the performance of our existing BBQ scalar quantization index formats.
", + "title": "Understanding optimized scalar quantization - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/scalar-quantization-optimization", + "meta_description": "Get a refresher of scalar quantization and learn about optimized scalar quantization, a new form of scalar quantization we've developed at Elastic." + }, + { + "text": "Accessing machine learning models in Elastic Explore the machine learning (ML) models supported in Elastic, the Eland library for loading models and how to apply transformers & NLP in Elastic. Integrations BS JD By: Bernhard Suhm and Josh Devins On June 21, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics, or building prod-ready apps with Elastic Vector Database. To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This blog explores the machine learning (ML) models supported in Elastic, including built-in, third-party, and custom models. It also discusses the Eland library for loading models and explains how to apply transformers and NLP in Elastic, which is a common use case in the context of search applications. Elastic supports the machine learning models you need Elastic lets you apply the machine learning (ML) that’s appropriate for your use case and level of ML expertise. You have multiple options: Leverage the models that come built-in. Aside from models targeting specific security threats and types of system issues in our observability and security solutions, you can use our proprietary Elastic Learned Sparse Encoder model out of the box, as well as language identification — useful if you’re working with non-English text data. Access third-party PyTorch models from anywhere, including the Hugging Face model hub. Load a model you trained yourself — primarily NLP transformers at this point.
Using built-in models gets you value out of the box, without requiring any ML expertise from you, yet you have the flexibility to try out different models and determine what performs best on your data. We designed our model management to be scalable across multiple nodes in a cluster, while also ensuring good inference performance for both high throughput and low latency workloads. That’s achieved in part by empowering ingest pipelines to run inference and by using dedicated nodes for the computationally demanding model inference — during the ingestion phase, as well as during data analysis and search. Read on to learn more about the Eland library that lets you load models into Elastic and how that plays out for the various types of machine learning you might use within Elasticsearch — from the latest transformer and natural language processing (NLP) models to boosted tree models for regression. Eland lets you load ML models into Elastic Our Eland library provides an easy interface to load ML models into Elasticsearch — provided they were trained using PyTorch. Using the native library libtorch, and expecting models that have been exported or saved as a TorchScript representation, Elasticsearch avoids running a Python interpreter while performing model inference. By integrating with one of the most popular formats for building NLP models in PyTorch, Elasticsearch can provide a platform that works with a large variety of NLP tasks and use cases. We’ll get more into that in the section on transformers that follows. You have three options for using Eland to upload a model: command-line, Docker, and from within your own Python code. Docker is less complex because it does not require a local installation of Eland and all of its dependencies. Once you have access to Eland, the command shown in the next section uploads a DistilBERT NER model, as an example. Further below we’ll walk through each of the arguments of eland_import_hub_model. And you can issue the same command from a Docker container. Once uploaded, Kibana’s ML Model Management user interface lets you manage the models on an Elasticsearch cluster, including increasing allocations for additional throughput, and stopping/resuming models while (re)configuring your system. Which models does Elastic support? Elastic supports a variety of transformer models, as well as the most popular supervised learning libraries: NLP and embedding models: All transformers that conform to the standard BERT model interface and use the WordPiece tokenization algorithm. View a complete list of supported model architectures. Supervised learning: Trained models from the scikit-learn, XGBoost, and LightGBM libraries can be serialized and used as an inference model in Elasticsearch. Our documentation provides an example of training an XGBoost classifier on data in Elastic. You can also export and import supervised models trained in Elastic with our data frame analytics. Generative AI: You can use the API provided for the LLM to pass queries — potentially enriched with context retrieved from Elastic — and process the results returned. For further instructions, refer to this blog, which links to a GitHub repository with example code for communicating via ChatGPT’s API. Below we provide more information for the type of model you’re most likely to use in the context of search applications: NLP transformers. How to apply transformers and NLP in Elastic, with ease!
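A typical upload command looks roughly like the following sketch. The cloud id, credentials and model id are placeholders to adapt to your own cluster, and the flags mirror the arguments walked through next.

```bash
# Upload a DistilBERT NER model from the Hugging Face hub into Elasticsearch.
# --cloud-id (or --url) identifies the cluster, --es-username/--es-password
# authenticate, --hub-model-id names the model on the hub, --task-type selects
# the NLP task, and --start deploys the model once the upload finishes.
eland_import_hub_model \
  --cloud-id $ELASTIC_CLOUD_ID \
  --es-username elastic \
  --es-password $ELASTIC_PASSWORD \
  --hub-model-id elastic/distilbert-base-cased-finetuned-conll03-english \
  --task-type ner \
  --start
```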
Let us walk you through the steps to load and use an NLP model, for example a popular NER model from Hugging Face, going over the arguments identified in the code snippet above. Specify the Elastic Cloud identifier. Alternatively, use --url. Provide authentication details to access your cluster. You can look up available authentication methods. Specify the identifier for the model in the Hugging Face model hub. Specify the type of NLP task. Supported values are fill_mask, ner, text_classification, text_embedding, and zero_shot_classification. Once you’ve loaded the model, next you need to deploy it. You accomplish this on the Model Management screen of the Machine Learning tab in Kibana. Then you’d typically test the model to ensure it’s working properly. Now you’re ready to use the deployed model for inference. For example, to extract named entities you call the _infer endpoint on the loaded NER model; see the sketch at the end of this section. The model identifies two entities: the person \"Josh\" and the location \"Berlin.\" For additional steps, like using this model in an inference pipeline and tuning the deployment, read the blog that describes this example. Want to see how to apply semantic search — for example, how to create embeddings for text and then apply vector search to find related documents? This blog lays that out step-by-step, including validating model performance. Don’t know which type of task for which model? This table should help you get started.
| Hugging Face | Model task-type |
|---|---|
| Named entity recognition | ner |
| Text embedding | text_embedding |
| Text classification | text_classification |
| Zero shot classification | zero_shot_classification |
| Question answering | question_answering |
Elastic also supports comparing how similar two pieces of text are to each other as the text_similarity task-type — this is useful for ranking document text when comparing it to another provided text input, and it’s sometimes referred to as cross-encoding.
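To make the NER example above concrete, here is a minimal sketch using the official Python client. The endpoint, API key and model id are placeholders; the model id shown follows the naming Eland typically derives from the Hugging Face id, and the sentence mirrors the Josh/Berlin example from the text.

```python
from elasticsearch import Elasticsearch

# Connect to the cluster (endpoint and API key are placeholders).
client = Elasticsearch('https://localhost:9200', api_key='<api-key>')

# Call the _infer endpoint of the deployed NER model. Check Kibana's Model
# Management screen for the exact model id in your cluster.
response = client.ml.infer_trained_model(
    model_id='elastic__distilbert-base-cased-finetuned-conll03-english',
    docs=[{'text_field': 'Hi, my name is Josh and I live in Berlin'}],
)

# The response lists the recognized entities, e.g. Josh (PER) and Berlin (LOC).
for entity in response['inference_results'][0]['entities']:
    print(entity['entity'], entity['class_name'])
```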
Check these resources for more details: Support for PyTorch transformers, including design considerations for Eland; Steps for loading transformers into Elastic and using them in inference; Blog describing how to query your proprietary data using ChatGPT; Adapt a pre-trained transformer to a text classification task, and load the custom model into Elastic; Built-in language identification that lets you identify non-English text before passing it into models that support only English. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.", + "title": "Accessing machine learning models in Elastic - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-machine-learning-models", + "meta_description": "Explore the machine learning (ML) models supported in Elastic, the Eland library for loading models and how to apply transformers & NLP in Elastic." + }, + { + "text": "Improving the ES|QL editor experience in Kibana With the new ES|QL language becoming GA, a new editor experience has been developed in Kibana to help users write faster and better queries. Features like live validation, improved autocomplete and quick fixes will streamline the ES|QL experience. ES|QL ML By: Marco Liberati On December 31, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now. We’re going to cover the guiding principles behind improving the ES|QL editor experience in Kibana and what we did to achieve that goal. We’ll cover features like live validation, improved autocomplete and quick fixes, which all streamline the ES|QL experience. ES|QL background Since Elastic 8.11, a technical preview has been available of Elastic’s new piped query language, ES|QL (Elasticsearch Query Language), which transforms, enriches, and simplifies data investigations. Powered by a new query engine, ES|QL delivers advanced search capabilities with concurrent processing, improving speed and efficiency, irrespective of data source and structure.
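For readers who have not seen the syntax yet, an ES|QL query is a source command followed by a chain of piped processing commands. A representative query, with made-up index and field names, looks like this:

```esql
FROM logs-*
| WHERE response_time > 500
| STATS slow_requests = COUNT(*), p95 = PERCENTILE(response_time, 95) BY host
| SORT slow_requests DESC
| LIMIT 10
```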
Accelerate resolution by creating aggregations and visualizations from one screen, delivering an iterative, uninterrupted workflow. As a developer, learning a new language can be both an interesting challenge and a frustrating experience. For a query language, nice syntax and lots of documentation and examples make it click, but moving from the walled garden of the documentation examples to real-world queries can be challenging. When adopting a new language as a developer, I’m interested in iterating quickly and jumping from a trial-and-error environment to the documentation to check more in-depth topics about syntax, limits and caveats. Writing a correct ES|QL query should be easy With ES|QL, we want to provide the best possible experience for developers and push on all of the possibilities that modern web editors can provide. As such, the ES|QL editor in Kibana has a critical role as it is one of the main mediums for users to approach the new language. Improving its user experience is of high importance to us. In order to improve the user experience in the editor, these four principles have been identified: The user should not need to memorize all of the knowledge regarding indices/fields/policies/functions etc. It should take seconds, not minutes, to understand what’s wrong with a query. Autocomplete should make it easy for users to build the right queries. The user should not be blamed for errors; rather, the editor should help to fix them. Catching ES|QL errors early on (and fixing them) with the Kibana editor In 8.13 ES|QL in Discover offers a complete client-side validation engine, making it easy to catch potential errors before submitting a query to Elasticsearch. The validation runs while typing and offers immediate feedback for the incorrect parts of the query. (When expanded it is possible to inspect specific errors in the ES|QL editor with cursor hovering.) The validation has some resiliency to syntax errors and can still provide useful information to the user for incomplete queries. (The ES|QL editor can validate the entire query at multiple points: errors are collected and fully reported on demand.) As a developer who is comfortable using IDEs in my daily coding environment, I’m used to the quick fix menu that provides suggestions on how to address common problems like spelling errors or using the wrong quotes. Kibana uses the Monaco editor under the hood, which is a smaller version of the VSCode editor, and that provides an interface to deliver a similar feature on the web. An initial quick fix feature has been developed and some basic suggestions are already supported. (The new ES|QL editor leverages internal knowledge to propose a quick fix with existing indices.) The current list of supported quick fixes includes: wrong field quoting, wrong literal quoting, and typos in index, field (and metafield), function, and policy names; more will be added in subsequent versions. The quick fix feature is still in its initial development and we are looking for feedback and enhancement requests. Better ES|QL auto-complete in the Kibana editor Since its release, ES|QL has shipped with a basic auto-complete feature in the Kibana editor, which already provides some useful suggestions for first-time users. In 8.13 the autocomplete logic has been refactored with improved feedback for the user, leveraging all field and function types, together with a deep knowledge of the ES|QL implementation in Elasticsearch.
In simple terms this means that from 8.13 on, autocomplete will only suggest the right “thing” in many scenarios that were previously not covered. A list (not necessarily complete) of covered features is: Propose the right function, even when used within another function. (The ES|QL autocomplete knows which functions are compatible with each other, even when nested.) Propose the right argument for a function, either filtering out fields by type or proposing the right constants. (Autocomplete can help with special enums for particular functions, listing all of them directly.) Know when a field or index name needs to be quoted. The new autocomplete attempts to reduce the amount of information the user has to keep in mind in order to build a query, both by applying many contextual type filters and by leveraging deep knowledge of the new language syntax. Provide more contextual help in the ES|QL Kibana editor The new autocomplete contains the hidden gem of providing full contextual help for any proposed suggestion, in particular for functions or commands with examples. (Autocomplete can provide full inline documentation with examples on demand for commands and functions.) Another useful way to get more information within the editor is to hover over specific parts of the query, like the policy name, to gather more metadata about it. (Contextual tooltips help with quick summaries of enrich policies and some basic information.) Make the most of your data with the ES|QL Kibana editor In this post, we showcased some of the new ES|QL Kibana editor features. In summary, the list of features is as follows: Users get immediate feedback about syntax errors and invalid query statements while typing a query. Users can quickly get fix suggestions for some specific errors. Indices, fields and policies are automatically suggested to users in the right place. Help is provided inline with full documentation and examples. Elastic invites SREs and developers to experience this editor feature firsthand and unlock new horizons in their data tasks. Try it today at https://ela.st/free-trial now. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.
", + "title": "Improving the ES|QL editor experience in Kibana - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/improving-esql-editor-experience-in-kibana", + "meta_description": "Learn about the ES|QL editor in Kibana and its features, such as live validation, improved autocomplete, and quick fixes." + }, + { + "text": "Elasticsearch vs. OpenSearch: Vector Search Performance Comparison Elasticsearch is out-of-the-box 2x–12x faster than OpenSearch for vector search Vector Database Lucene US By: Ugo Sangiorgi On June 26, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial, or try Elastic on your local machine now. TLDR: Elasticsearch is up to 12x faster. We at Elastic have received numerous requests from our community to clarify the performance differences between Elasticsearch and OpenSearch, particularly in the realm of semantic search / vector search, so we have undertaken this performance testing to provide a clear, data-driven comparison — no ambiguity, just straightforward facts to inform our users. The results show that Elasticsearch is up to 12x faster than OpenSearch for vector search and therefore requires fewer computational resources. This reflects Elastic's focus on consolidating Lucene as the best vector database for search and retrieval use cases. Vector search is revolutionizing the way we conduct similarity searches, particularly in fields like AI and machine learning. With the increasing adoption of vector embedding models, the ability to efficiently search through millions of high-dimensional vectors becomes critical. When it comes to powering vector databases, Elastic and OpenSearch have taken notably different approaches. Elastic has invested heavily in optimizing Apache Lucene together with Elasticsearch to elevate them as the top-tier choice for vector search applications. In contrast, OpenSearch has broadened its focus, integrating other vector search implementations and exploring beyond Lucene's scope. Our focus on Lucene is strategic, enabling us to provide highly integrated support in our version of Elasticsearch, resulting in an enhanced feature set where each component complements and amplifies the capabilities of the other.
This blog presents a detailed comparison between Elasticsearch 8.14 and OpenSearch 2.14, accounting for different configurations and vector engines. In this performance analysis, Elasticsearch proved to be the superior platform for vector search operations, and upcoming features will widen the differences even more significantly. When pitted against OpenSearch, it excelled in every benchmark track — offering 2x to 12x faster performance on average. This was across scenarios using varying vector amounts and dimensions including so_vector (2M vectors, 768D), openai_vector (2.5M vectors, 1536D), and dense_vector (10M vectors, 96D), all available in this repository alongside the Terraform scripts for provisioning all the required infrastructure on Google Cloud and Kubernetes manifests for running the tests. The results detailed in this blog complement the results from a previously published and third-party validated study that shows Elasticsearch is 40%–140% faster than OpenSearch for the most common search analytics operations: Text Querying, Sort, Range, Date Histogram and Terms filtering. Now we can add another differentiator: Vector Search. Up to 12x faster out-of-the-box Our focused benchmarks across the four vector data sets involved both Approximate KNN and Exact KNN searches, considering different sizes, dimensions and configurations, totaling 40,189,820 uncached search requests. The results: Elasticsearch is up to 12x faster than OpenSearch for vector search and therefore requires fewer computational resources. Figure 1: Grouped tasks for ANN and Exact KNN across different combinations in Elasticsearch and OpenSearch. Groups like knn-10-100 mean a KNN search with k:10 and n:100. In HNSW vector search, k determines the number of nearest neighbors to retrieve for a query vector: it specifies how many similar vectors to find as a result. n sets the number of candidate vectors to retrieve at each segment. More candidates can enhance accuracy but require greater computational resources. We also tested with different quantization techniques and leveraged engine-specific optimizations; the detailed results for each track, task and vector engine are available below. Exact KNN and Approximate KNN When dealing with varying data sets and use cases, the right approach for vector search will differ. In this blog all tasks named knn-*, like knn-10-100, use Approximate KNN and script-score-* tasks refer to Exact KNN, but what is the difference between them, and why are they important? In essence, if you're handling more substantial data sets, the preferred method is Approximate K-Nearest Neighbor (ANN) due to its superior scalability. For more modest data sets that may require a filtration process, the Exact KNN method is ideal. Exact KNN uses a brute-force method, calculating the distance between one vector and every other vector in the data set. It then ranks these distances to find the k nearest neighbors. While this method ensures an exact match, it suffers from scalability challenges for large, high-dimensional data sets. However, there are many cases in which Exact KNN is needed: Rescoring: In scenarios involving lexical or semantic searches followed by vector-based rescoring, Exact KNN is essential. For example, in a product search engine, initial search results can be filtered based on textual queries (e.g., keywords, categories), and then vectors associated with the filtered items are used for a more accurate similarity assessment.
Personalization: When dealing with a large number of users, each represented by a relatively small number (like 1 million) of distinct vectors, sorting the index by user-specific metadata (e.g., user_id) and brute-force scoring with vectors becomes efficient. This approach allows for personalized recommendations or content delivery based on precise vector comparisons tailored to individual user preferences. Exact KNN therefore ensures that the final ranking and recommendations based on vector similarity are precise and tailored to user preferences. Approximate KNN (or ANN), on the other hand, employs methods to make data searching faster and more efficient than Exact KNN, especially in large, high-dimensional data sets. Instead of a brute-force approach, which measures the exact distance between a query and all points, leading to computation and scaling challenges, ANN uses certain techniques to efficiently restructure the indexes and dimensions of searchable vectors in the data set. While this may cause a slight inaccuracy, it significantly boosts the speed of the search process, making it an effective alternative for dealing with large data sets. In this blog all tasks named knn-*, like knn-10-100, use Approximate KNN and script-score-* tasks refer to Exact KNN. Testing methodology While Elasticsearch and OpenSearch are similar in terms of API for BM25 search operations, since the latter is a fork of the former, that is not the case for vector search, which was introduced after the fork. OpenSearch took a different approach than Elasticsearch when it comes to algorithms, by introducing two other engines — nmslib and faiss — apart from lucene, each with their specific configurations and limitations (e.g., nmslib in OpenSearch does not allow for filters, an essential feature for many use cases). All three engines use the Hierarchical Navigable Small World (HNSW) algorithm, which is efficient for approximate nearest neighbor search, and especially powerful when dealing with high-dimensional data. It's important to note that faiss also supports a second algorithm, ivf, but since it requires pre-training on the data set, we are going to focus solely on HNSW. The core idea of HNSW is to organize the data into multiple layers of connected graphs, with each layer representing a different granularity of the data set. The search begins at the top layer with the coarsest view and progresses down to finer and finer layers until reaching the base level. Both search engines were tested under identical conditions in a controlled environment to ensure fair testing grounds. The method applied is similar to this previously published performance comparison, with dedicated node pools for Elasticsearch, OpenSearch, and Rally. The Terraform script is available (alongside all sources) to provision a Kubernetes cluster with: 1 node pool for Elasticsearch with 3 e2-standard-32 machines (128GB RAM and 32 CPUs), 1 node pool for OpenSearch with 3 e2-standard-32 machines (128GB RAM and 32 CPUs), and 1 node pool for Rally with 2 t2a-standard-16 machines (64GB RAM and 16 CPUs). Each \"track\" (or test) ran 10 times for each configuration, which included different engines, different configurations and different vector types. The tracks have tasks that repeat between 1000 and 10000 times, depending on the track. If one of the tasks in a track failed, for instance due to a network timeout, then all tasks were discarded, so all results represent tracks that started and finished without problems.
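To ground the knn-* and script-score-* task names used below, this is roughly what the two query styles look like with the Elasticsearch Python client. The index name, field name and query vector are illustrative, and the OpenSearch equivalents differ slightly.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch('https://localhost:9200', api_key='<api-key>')
query_vector = [0.12, -0.37, 0.05]  # in practice, the embedding of the query text

# Approximate KNN (the knn-10-100 style tasks): the k nearest neighbors out of
# num_candidates considered per shard while traversing the HNSW graph.
approximate = client.search(
    index='my-vectors',
    knn={'field': 'vector', 'query_vector': query_vector, 'k': 10, 'num_candidates': 100},
)

# Exact KNN (the script-score tasks): brute-force scoring of every matching document.
exact = client.search(
    index='my-vectors',
    query={
        'script_score': {
            'query': {'match_all': {}},
            'script': {
                'source': "cosineSimilarity(params.query_vector, 'vector') + 1.0",
                'params': {'query_vector': query_vector},
            },
        }
    },
)
```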
All test results are statistically validated, ensuring that improvements aren’t coincidental. Detailed findings Why compare using the 99th percentile and not the average latency? Consider a hypothetical example of average house prices in a certain neighborhood. The average price may indicate an expensive area, but on closer inspection, it may turn out that most homes are valued much lower, with only a few luxury properties inflating the average figure. This illustrates how the average price can fail to accurately represent the full spectrum of house values in the area. This is akin to examining response times, where the average can conceal critical issues. Tasks: Approximate KNN with k:10 n:50; Approximate KNN with k:10 n:100; Approximate KNN with k:100 n:1000; Approximate KNN with k:10 n:50 and keyword filters; Approximate KNN with k:10 n:100 and keyword filters; Approximate KNN with k:100 n:1000 and keyword filters; Approximate KNN with k:10 n:100 in conjunction with indexing; Exact KNN (script score). Vector engines: lucene in Elasticsearch and OpenSearch, both on version 9.10; faiss in OpenSearch; nmslib in OpenSearch. Vector types: hnsw in Elasticsearch and OpenSearch; int8_hnsw in Elasticsearch (HNSW with automatic 8 bit quantization: link); sq_fp16 hnsw in OpenSearch (HNSW with automatic 16 bit quantization: link). Out-of-the-box and Concurrent Segment Search As you probably know, Lucene is a highly performant text search engine library written in Java that serves as the backbone for many search platforms like Elasticsearch, OpenSearch, and Solr. At its core, Lucene organizes data into segments, which are essentially self-contained indices that allow Lucene to execute searches more efficiently. So when you issue a search to any Lucene-based search engine, your search will end up being executed in those segments, either sequentially or in parallel. OpenSearch introduced concurrent segment search as an optional flag and does not use it by default; you must enable it using a special index setting, index.search.concurrent_segment_search.enabled, as detailed here, with some limitations. Elasticsearch, on the other hand, searches on segments concurrently out-of-the-box; therefore the comparisons we make in this blog will take into consideration, on top of the different vector engines and vector types, also the different configurations: Elasticsearch ootb: Elasticsearch out-of-the-box, with concurrent segment search; OpenSearch ootb: without concurrent segment search enabled; OpenSearch css: with concurrent segment search enabled. Now, let’s dive into some detailed results for each vector data set tested: 2.5 million vectors, 1536 dimensions (openai_vector) Starting with the simplest track, but also the largest in terms of dimensions, openai_vector, which uses the NQ data set enriched with embeddings generated using OpenAI's text-embedding-ada-002 model. It is the simplest since it tests only Approximate KNN and has only 5 tasks. It tests searches standalone (without indexing) as well as alongside indexing, and using a single client and 8 simultaneous clients.
Tasks standalone-search-knn-10-100-multiple-clients : searching on 2.5 million vectors with 8 clients simultaneously, k: 10 and n:100 standalone-search-knn-100-1000-multiple-clients : searching on 2.5 million vectors with 8 clients simultaneously, k: 100 and n:1000 standalone-search-knn-10-100-single-client : searching on 2.5 million vectors with a single client, k: 10 and n:100 standalone-search-knn-100-1000-single-client : searching on 2.5 million vectors with a single client, k: 100 and n:1000 parallel-documents-indexing-search-knn-10-100 : searching on 2.5 million vectors while also indexing additional 100000 documents, k:10 and n:100 The averaged p99 performance is outlined below: Here we observed that Elasticsearch is between 3x-8x faster than OpenSearch when performing vector search alongside indexing (i.e. read+write) with k k k :10 and n n n :100 and 2x-3x faster without indexing for the same k and n. For k k k :100 and n n n :1000 ( standalone-search-knn-100-1000-single-client and standalone-search-knn-100-1000-multiple-clients Elasticsearch is 2x to 7x faster than OpenSearch, on average. The detailed results show the exact cases and vector engines compared: Recall knn-recall-10-100 knn-recall-100-1000 Elasticsearch-8.14.0@lucene-hnsw 0.969485 0.995138 Elasticsearch-8.14.0@lucene-int8_hnsw 0.781445 0.784817 OpenSearch-2.14.0@lucene-hnsw 0.96519 0.995422 OpenSearch-2.14.0@faiss 0.984154 0.98049 OpenSearch-2.14.0@faiss-sq_fp16 0.980012 0.97721 OpenSearch-2.14.0@nmslib 0.982532 0.99832 10 million vectors, 96 dimensions (dense_vector) In dense_vector with 10M vectors and 96 dimensions. It is based on the Yandex DEEP1B image data set. The data set is created from the first 10 million vectors of the \"sample data\" file called learn.350M.fbin . The search operations use vectors from the \"query data\" file query. public.10K.fbin . Both Elasticsearch and OpenSearch perform very well on this data set, especially after a force merge which is usually done on read-only indices and it’s similar to defragmenting the index to have a single \"table\" to search on. Tasks Each task warms up for 100 requests and then 1000 requests are measured knn-search-10-100 : searching on 10 million vectors, k: 10 and n:100 knn-search-100-1000 : searching on 10 million vectors, k: 100 and n:1000 knn-search-10-100-force-merge : searching on 10 million vectors after a force merge, k: 10 and n:100 knn-search-100-1000-force-merge : searching on 10 million vectors after a force merge, k: 100 and n:1000 knn-search-100-1000-concurrent-with-indexing : searching on 10 million vectors while also updating 5% of the data set , k: 100 and n:1000 script-score-query : Exact KNN search of 2000 specific vectors . Both Elasticsearch and OpenSearch performed well for Approximate KNN. When the index is merged (i.e. has just a single segment) in knn-search-100-1000-force-merge and knn-search-10-100-force-merge , OpenSearch performs better than the others when using nmslib and faiss , even though they are all around 15ms and all very close. However, when the index has multiple segments (a typical situation where an index receives updates to its documents) in knn-search-10-100 and knn-search-100-1000 , Elasticsearch keeps the latency in about ~7ms and ~16ms, while all other OpenSearch engines are slower. 
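The force merge mentioned above, collapsing a read-only index down to a single segment before the *-force-merge tasks run, can be triggered through the Elasticsearch client roughly as follows; the index name is a placeholder.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: 'REDACTED' } });

// Merge all segments into one; typically only done on indices that no longer receive writes.
await client.indices.forcemerge({
  index: 'my-vector-index', // assumed index name
  max_num_segments: 1,
});
```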
Also when the index is being searched and written to at the same time ( knn-search-100-1000-concurrent-with-indexing ), Elasticsearch maintains the latency below 15ms (at 13.8ms), being almost 4x faster than OpenSearch out-of-the-box (49.3ms) and still faster when concurrent segment search is enabled (17.9ms), but too close to be significative. As for Exact KNN, the difference is much larger: Elasticsearch is 6x faster than OpenSearch (~260ms vs ~1600ms). Recall knn-recall-10-100 knn-recall-100-1000 Elasticsearch-8.14.0@lucene-hnsw 0.969843 0.996577 Elasticsearch-8.14.0@lucene-int8_hnsw 0.775458 0.840254 OpenSearch-2.14.0@lucene-hnsw 0.971333 0.996747 OpenSearch-2.14.0@faiss 0.9704 0.914755 OpenSearch-2.14.0@faiss-sq_fp16 0.968025 0.913862 OpenSearch-2.14.0@nmslib 0.9674 0.910303 2 million vectors, 768 dimensions (so_vector) This track , so_vector , is derived from a dump of StackOverflow posts downloaded on April, 21st 2022. It only contains question documents — all documents representing answers have been removed. Each question title was encoded into a vector using the sentence transformer model multi-qa-mpnet-base-cos-v1 . This data set contains the first 2 million questions. Unlike the previous track, each document here contains other fields besides vectors to support testing features like Approximate KNN with filtering and hybrid search. nmslib for OpenSearch is notably absent in this test since it does not support filters . Tasks Each task warms up for 100 requests and then 100 requests are measured. Note the tasks were grouped for sake of simplicity, since the test contains 16 search types * 2 different k values * 3 different n values. knn-10-50 : searching on 2 million vectors without filters, k:10 and n:50 knn-10-50-filtered : searching on 2 million vectors with filters , k:10 and n:50 knn-10-50-after-force-merge : searching on 2 million vectors with filters and after a force merge, k:10 and n:50 knn-10-100 : searching on 2 million vectors without filters, k:10 and n:100 knn-10-100-filtered : searching on 2 million vectors with filters , k:10 and n:100 knn-10-100-after-force-merge : searching on 2 million vectors with filters and after a force merge, k:10 and n:100 knn-100-1000 : searching on 2 million vectors without filters, k:100 and n:1000 knn-100-1000-filtered : searching on 2 million vectors with filters , k:100 and n:1000 knn-100-1000-after-force-merge : searching on 2 million vectors with filters and after a force merge, k:100 and n:1000 exact-knn : Exact KNN search with and without filters . Elasticsearch is consistently faster than OpenSearch out-of-the-box on this test, only in two cases OpenSearch is faster, and not by much ( knn-10-100 and knn-100-1000 ). Tasks involving knn-10-50 , knn-10-100 and knn-100-1000 in combination with filters show a difference of up to 7x (112ms vs 803ms). The performance of both solutions seems to even out after a \"force merge\", understandably, as evidenced by knn-10-50-after-force-merge , knn-10-100-after-force-merge and knn-100-1000-after-force-merge. On those tasks faiss is faster. The performance for Exact KNN once again is very different, Elasticsearch being 13 times faster than OpenSearch this time (~385ms vs ~5262ms). 
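For the *-filtered tasks, the filter is supplied inside the kNN clause so it constrains the graph search itself rather than trimming results afterwards. Below is a sketch of what such a query can look like; the field names and filter value are invented.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: 'REDACTED' } });

const response = await client.search({
  index: 'so-vector-questions', // assumed index name
  knn: {
    field: 'title_vector', // assumed vector field
    query_vector: [0.1, 0.2, 0.3], // truncated example vector
    k: 10,
    num_candidates: 50,
    filter: { term: { tags: 'elasticsearch' } }, // keyword filter with an invented value
  },
});
console.log(response.hits.hits.length);
```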
Recall knn-recall-10-100 knn-recall-100-1000 knn-recall-10-50 Elasticsearch-8.14.0@lucene-hnsw 1 1 1 Elasticsearch-8.14.0@lucene-int8_hnsw 1 0.986667 1 OpenSearch-2.14.0@lucene-hnsw 1 1 1 OpenSearch-2.14.0@faiss 1 1 1 OpenSearch-2.14.0@faiss-sq_fp16 1 1 1 OpenSearch-2.14.0@nmslib 0.9674 0.910303 0.976394 Elasticsearch and Lucene as clear victors At Elastic, we are relentlessly innovating Apache Lucene and Elasticsearch to ensure we are able to provide the premier vector database for search and retrieval use cases, including RAG (Retrieval Augmented Generation). Our recent advancements have dramatically increased performance, making vector search faster and more space efficient than before, building upon the gains from Lucene 9.10. This blog presented a study that shows when comparing up-to-date versions Elasticsearch is up to 12 times faster than OpenSearch. It's worth noting both products use the same version of Lucene ( Elasticsearch 8.14 Release Notes and OpenSearch 2.14 Release Notes ). The pace of innovation at Elastic will deliver even more not only for our on-premises and Elastic Cloud customers but those using our stateless platform . Features like support for scalar quantization to int4 will be offered with rigorous testing to ensure customers can utilize these techniques without a significant drop in recall, similar to our testing for int8 . Vector search efficiency is becoming a non-negotiable feature in modern search engines due to the proliferation of AI and machine learning applications. For organizations looking for a powerful search engine capable of keeping up with the demands of high-volume, high-complexity vector data, Elasticsearch is the definitive answer. Whether expanding an established platform or initiating new projects, integrating Elasticsearch for vector search needs is a strategic move that will yield tangible, long-term benefits. With its proven performance advantage, Elasticsearch is poised to underpin the next wave of innovations in search. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Up to 12x faster out-of-the-box Exact KNN and Approximate KNN Testing methodology Detailed findings Tasks Show more Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch vs. OpenSearch: Vector Search Performance Comparison - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-opensearch-vector-search-performance-comparison", + "meta_description": "Elasticsearch is out-of-the-box 2x–12x faster than OpenSearch for vector search" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Making Elasticsearch and Lucene the best vector database: up to 8x faster and 32x efficient Discover the recent enhancements and optimizations that notably improve vector search performance in Elasticsearch & Lucene vector database. Vector Database Generative AI MS BT JF By: Mayya Sharipova , Benjamin Trent and Jim Ferenczi On April 26, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch and Lucene report card: noteworthy speed and efficiency investments Our mission at Elastic is to make Apache Lucene the best vector database out there, and to continue to make Elasticsearch the best retrieval platform out there for search and RAG. Our investments into Lucene are key to ensure that every release of Elasticsearch brings increasing faster performance and scale. Customers are already building the next generation of AI enabled search applications with Elastic’s vector database and vector search technology. Roboflow is used by over 500,000 engineers to create datasets, train models, and deploy computer vision models to production. Roboflow uses Elastic vector database to store and search billions of vector embeddings. In this blog we summarize recent enhancements and optimisations that significantly improve vector search performance in Elasticsearch and Apache Lucene, over and above performance gains delivered with Lucene 9.9 and Elasticsearch 8.12.x. The integration of vector search into Elasticsearch relies on Apache Lucene, the layer that orchestrates data storage and retrieval. Lucene's architecture organizes data into segments, immutable units that undergo periodic merging. This structure allows for efficient management of inverted indices, essential for text search. With vector search, Lucene extends its capabilities to handle multi-dimensional points, employing the hierarchical navigable small world (HNSW) algorithm to index vectors. This approach facilitates scalability, enabling data sets to exceed available RAM size while maintaining performance. 
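As a concrete, hypothetical example of the HNSW-based vector indexing described above, a dense_vector field in Elasticsearch can be mapped with explicit HNSW parameters; the index name, dimensions, and parameter values below are illustrative only.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: 'REDACTED' } });

await client.indices.create({
  index: 'docs-with-vectors', // assumed index name
  mappings: {
    properties: {
      title: { type: 'text' },
      title_vector: {
        type: 'dense_vector',
        dims: 96, // match the dimensions of your embedding model
        index: true,
        similarity: 'dot_product',
        index_options: { type: 'hnsw', m: 16, ef_construction: 100 },
      },
    },
  },
});
```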
Additionally, Lucene's segment-based approach offers lock-free search operations, supporting incremental changes and ensuring visibility consistency across various data structures. The integration however comes with its own engineering challenges. Merging segments requires recomputing HNSW graphs, incurring index-time overhead. Searches must cover multiple segments, leading to possible latency overhead. Moreover, optimal performance requires scaling RAM as data grows, which may raise resource management concerns. Lucene's integration into Elasticsearch comes with the benefit of robust vector search capabilities. This includes aggregations, document level security, geo-spatial queries, pre-filtering, to full compatibility with various Elasticsearch features. Imagine running vector searches using a geo bounding box, this is an example usecase enabled by Elasticsearch and Lucene. Lucene's architecture lays a solid foundation for efficient and versatile vector search within Elasticsearch. Let’s explore optimization strategies and enhancements we have implemented to integrate vector search into Lucene, which delivers a high performance and comprehensive feature-set for developers. Harnessing Lucene's architecture for multi-threaded search Lucene's segmented architecture enables the implementation of multi-threaded search capabilities. Elasticsearch’s performance gains come from efficiently searching multiple segments simultaneously. Latency of individual searches is significantly reduced by using the processing power of all available CPU cores. While it may not directly improve overall throughput, this enhancement prioritizes minimizing response times, ensuring that users receive their search results as swiftly as possible. Furthermore, this optimization is particularly beneficial for Hierarchical Navigable Small World (HNSW) searches, as each graph is independent of the others and can be searched in parallel, maximizing efficiency and speeding up retrieval times even further. The advantage of having multiple independent segments extends to the architectural level, especially in serverless environments. In this new architecture, the indexing tier is responsible for creating new segments, each containing its own HSNW graph. The search tier can simply replicate these segments without incurring the CPU cost of indexation. This separation allows a significant portion of compute resources to be dedicated to searches, optimizing overall system performance and responsiveness. Accelerating multi-graph vector search In spite of gains achieved with parallelization, each segment's searches would remain independent, unaware of progress made by other segment searches. So our focus shifted towards optimizing the efficiency of concurrent searches across multiple segments. The graph shows that the number queries per second increased from 104 queries/sec to 219 queries/sec. Recognizing the potential for further speedups, we leveraged our insights from optimizing lexical search, to enable information exchange among segment searches allowing for better coordination and efficiency in vector search. Our strategy for accelerating multi-graph vector search revolves around balancing exploration and exploitation within the proximity graph. By adjusting the size of the expanded match set, we control the trade-off between runtime and recall, crucial for achieving optimal search performance across multiple graphs. 
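A toy sketch of the idea of sharing state between per-segment searches: each segment produces candidates, and a shared "global k-th best" score lets a segment stop considering candidates that can no longer compete. This is a deliberate simplification meant only to illustrate the concept, not Lucene's actual implementation.

```typescript
type Hit = { docId: number; score: number };

// Toy per-segment candidate lists, already sorted best-first (higher score is better).
const segments: Hit[][] = [
  [{ docId: 1, score: 0.97 }, { docId: 4, score: 0.71 }, { docId: 9, score: 0.40 }],
  [{ docId: 2, score: 0.88 }, { docId: 7, score: 0.35 }],
  [{ docId: 5, score: 0.93 }, { docId: 8, score: 0.90 }, { docId: 3, score: 0.40 }],
];

function topKWithSharedThreshold(segmentHits: Hit[][], k: number): Hit[] {
  const global: Hit[] = []; // global top-k so far, kept sorted best-first
  for (const hits of segmentHits) {
    for (const hit of hits) {
      const kthBest = global.length === k ? global[k - 1].score : -Infinity;
      // Shared threshold: once this segment's (sorted) candidates fall below the
      // global k-th best score, the rest of the segment cannot compete.
      if (hit.score <= kthBest) break;
      global.push(hit);
      global.sort((a, b) => b.score - a.score);
      if (global.length > k) global.pop();
    }
  }
  return global;
}

console.log(topKWithSharedThreshold(segments, 2)); // doc 1 (0.97) and doc 5 (0.93)
```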
In multi-graph search scenarios, the challenge lies in efficiently navigating individual graphs, while ensuring comprehensive exploration to avoid local minima. While searching multiple graphs independently yields higher recall, it incurs increased runtime due to redundant exploration efforts. To mitigate this, we devised a strategy to intelligently share state between searches, enabling informed traversal decisions based on global and local competitive thresholds. This approach involves maintaining shared global and local queues of distances to closest vectors, dynamically adapting search parameters based on the competitiveness of each graph's local search. By synchronizing information exchange and adjusting search strategies accordingly, we achieve significant improvements in search latency while preserving recall rates comparable to single-graph searches. The impact of these optimizations is evident in our benchmark results. In concurrent search and indexing scenarios, we notice up to 60% reduction in query latencies! Even for queries conducted outside of indexing operations, we observed notable speedups and a dramatic decrease in the number of vector operations required. These enhancements, integrated into Lucene 9.10 and subsequently Elasticsearch 8.13, mark significant strides towards enhancing vector database performance for search while maintaining excellent recall rates. Harnessing Java's latest advancements for ludicrous speed In the area of Java development, automatic vectorization has been a boon, optimizing scalar operations into SIMD (Single Instruction Multiple Data) instructions through the HotSpot C2 compiler. While this automatic optimization has been beneficial, it has its limitations, particularly in scenarios where explicit control over code shape yields superior performance. Enter Project Panama Vector API, a recent addition to the JDK offering an API for expressing computations reliably compiled to SIMD instructions at runtime. Lucene's vector search implementation relies on fundamental operations like dot product, square, and cosine distance, both in floating point and binary variants. Traditionally, these operations were backed by scalar implementations, leaving performance enhancements to the JIT compiler. However, recent advancements introduce a paradigm shift, enabling developers to express these operations explicitly for optimal performance. Consider the dot product operation, a fundamental vector computation. Traditionally implemented in Java with scalar arithmetic, recent innovations leverage the Panama Vector API to express dot product computations in a manner conducive to SIMD instructions. This revised implementation iterates over input arrays, multiplying and accumulating elements in batches, aligning with the underlying hardware capabilities. By harnessing Panama Vector API, Java code now interfaces seamlessly with SIMD instructions, unlocking the potential for significant performance gains. The compiled code, when executed on compatible CPUs, leverages advanced vector instructions like AVX2 or AVX 512, resulting in accelerated computations. Disassembling the compiled code reveals optimized instructions tailored to the underlying hardware architecture. Microbenchmarks comparing traditional Java implementations to those leveraging Panama Vector API illustrate dramatic performance improvements. 
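The Panama Vector API work itself is Java-specific, but the shape of the optimization can be sketched in any language: the scalar loop below handles one element per step, while the unrolled version keeps four independent accumulators per iteration, roughly mirroring what a SIMD lane does in hardware. This is an illustration of the idea only, not the Lucene code.

```typescript
// One multiply-accumulate per iteration.
function dotProductScalar(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Four independent accumulators per iteration, mirroring a 4-lane SIMD register.
function dotProductUnrolled(a: Float32Array, b: Float32Array): number {
  let s0 = 0, s1 = 0, s2 = 0, s3 = 0;
  const limit = a.length - (a.length % 4);
  let i = 0;
  for (; i < limit; i += 4) {
    s0 += a[i] * b[i];
    s1 += a[i + 1] * b[i + 1];
    s2 += a[i + 2] * b[i + 2];
    s3 += a[i + 3] * b[i + 3];
  }
  let sum = s0 + s1 + s2 + s3;
  for (; i < a.length; i++) sum += a[i] * b[i]; // scalar tail
  return sum;
}

const a = Float32Array.from({ length: 96 }, () => Math.random());
const b = Float32Array.from({ length: 96 }, () => Math.random());
console.log(dotProductScalar(a, b), dotProductUnrolled(a, b)); // near-identical values
```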
Across various vector operations and dimension sizes, the optimized implementations outperform their predecessors by significant margins, offering a glimpse into the transformative power of SIMD instructions. Micro-benchmark comparing dot product with the new Panama API (dotProductNew) and the scalar implementation (dotProductOld). Beyond microbenchmarks, the real-world impact of these optimizations is quite exciting to think about. Vector search benchmarks, such as SO Vector, demonstrate notable enhancements in indexing throughput, merge times, and query latencies. Elasticsearch, embracing these advancements, incorporates the faster implementations by default, ensuring users reap the performance benefits seamlessly. The graph shows indexing throughput increased from about 900 documents/sec to about 1300 documents/sec. Despite the incubating status of Panama Vector API, its quality and potential benefits are undeniable. Lucene's pragmatic approach allows for selective adoption of non-final JDK APIs, balancing the promise of performance improvements with maintenance considerations. With Lucene and Elasticsearch, users can leverage these advancements effortlessly, with performance gains translating directly to real-world workloads. The integration of Panama Vector API into Java development yields a new era of performance optimization, particularly in vector search scenarios. By embracing hardware-accelerated SIMD instructions, developers can unlock efficiency gains, visible both in microbenchmarks and macro-level benchmarks. As Java continues to evolve, leveraging its latest features promises to propel performance to new heights, enriching user experiences across diverse applications. Maximizing memory efficiency with scalar quantization Memory consumption has long been a concern for efficient vector database operations, particularly for searching large datasets. Lucene introduces a breakthrough optimization technique - scalar quantization - aimed at significantly reducing memory requirements without sacrificing search performance. Consider a scenario where querying millions of float32 vectors of high dimensions demands substantial memory, leading to significant costs. By embracing byte quantization, Lucene slashes memory usage by approximately 75%, offering a viable solution to the memory-intensive nature of vector search operations. For quantizing floats to bytes, Lucene implements Scalar quantization a lossy compression technique that transforms raw data into a compressed form, sacrificing some information for space efficiency. Lucene's implementation of scalar quantization achieves remarkable space savings with minimal impact on recall, making it an ideal solution for memory-constrained environments. Lucene's architecture, consisting of nodes, shards, and segments, which facilitates efficient distribution and management of documents for search. Each segment stores raw vectors, quantized vectors, and metadata, ensuring optimized storage and retrieval mechanisms. Lucene's vector quantization adapts dynamically over time, adjusting quantiles during segment merge operations to maintain optimal recall. By intelligently handling quantization updates and re-quantization when necessary, Lucene ensures consistent performance while accommodating changes in data distribution. Example of merged quantiles where segments A and B have 1000 documents and C only has 100. Experimental results demonstrate the efficacy of scalar quantization in reducing memory footprint while maintaining search performance. 
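A minimal sketch of the scalar quantization transform described above: pick lower and upper bounds from the data (Lucene derives these from configurable quantiles and revisits them on merge) and map each float dimension onto a single byte. This only shows the core idea; the real implementation also handles error correction and re-quantization.

```typescript
// Map one float dimension into [0, 255] given bounds derived from the data.
function quantizeDimension(value: number, lower: number, upper: number): number {
  const clamped = Math.min(Math.max(value, lower), upper);
  return Math.round(((clamped - lower) / (upper - lower)) * 255);
}

function quantizeVector(vector: number[], lower: number, upper: number): Uint8Array {
  return Uint8Array.from(vector, (v) => quantizeDimension(v, lower, upper));
}

// Toy bounds; in practice they come from observed quantiles, not hard-coded values.
const lower = -1.0;
const upper = 1.0;

const original = [0.12, -0.53, 0.98, -0.99]; // 4 dimensions x 4 bytes as float32 = 16 bytes
const quantized = quantizeVector(original, lower, upper); // 4 x 1 byte = 4 bytes (~75% smaller)
console.log(Array.from(quantized)); // [143, 60, 252, 1]
```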
Despite minor differences in recall compared to raw vectors, Lucene's quantized vectors offer significant speed improvements and recall recovery with minimal additional vectors. Recall@10 for quantized vectors vs raw vectors. The search performance of quantized vectors is significantly faster than raw, and recall is quickly recoverable by gathering just 5 more vectors; visible by quantized@15. Lucene's scalar quantization presents a revolutionary approach to memory optimization in vector search operations. With no need for training or optimization steps, Lucene seamlessly integrates quantization into its indexing process, automatically adapting to changes in data distribution over time. As Lucene and Elasticsearch continue to evolve, widespread adoption of scalar quantization will revolutionize memory efficiency for vector database applications, paving the way for enhanced search performance at scale. Achieving seamless compression with minimal impact on recall To make compression even better, we aimed to reduce each dimension from 7 bits to just 4 bits. Our main goal was to compress data further while still keeping search results accurate. By making some improvements, we managed to compress data by a factor of 8 without making search results worse. Here's how we did it. We focused on keeping search results accurate while making data smaller. By making sure we didn't lose important information during compression, we could still find things well even with less detailed data. To make sure we didn't lose any important information, we added a smart error correction system. We checked our compression improvements by testing them with different types of data and real search situations. This helped us see how well our searches worked with different compression levels and what we might lose in accuracy by compressing more. Comparison of int4 dot product values to the corresponding float values for a random sample of 100 documents and their 10 nearest neighbors. These compression features were created to easily work with existing vector search systems. They help organizations and users save space without needing to change much in their setup. With this simple compression, organizations can expand their search systems without wasting resources. In short, moving to 4 bits per dimension for scalar quantization was a big step in making compression more efficient. It lets users compress their original vectors by 8 times. By optimizing carefully, adding error correction, testing with real data, and offering scalable deployment, organizations could save a lot of storage space without making search results worse. This opens up new chances for efficient and scalable search applications. Paving the way for binary quantization The optimization to reduce each dimension to 4 bits not only delivers significant compression gains but also lays the groundwork for further advancements in compression efficiency. Specifically, future advancements like binary quantization into Lucene, a development that has the potential to revolutionize vector storage and retrieval. In an ongoing effort to push the boundaries of compression in vector search, we are actively working on integrating binary quantization into Lucene using the same techniques and principles that underpin our existing optimization strategies. The goal is to achieve binary quantization of vector dimensions, thereby reducing the size of the vector representation by a factor of 32 compared to the original floating-point format. 
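The compression factors quoted in this section (roughly 75% savings for bytes, 8x for 4 bits per dimension, and 32x for binary quantization relative to float32) follow directly from the per-dimension storage cost. A quick back-of-the-envelope calculation for a 1536-dimension vector, ignoring per-vector metadata and graph overhead:

```typescript
const dims = 1536; // e.g. OpenAI text-embedding-ada-002 vectors

const bytesFloat32 = dims * 4; // 6144 bytes
const bytesInt8 = dims * 1;    // 1536 bytes, ~75% smaller than float32
const bytesInt4 = dims / 2;    // 768 bytes, 8x smaller
const bytesBinary = dims / 8;  // 192 bytes, 32x smaller

console.log({
  bytesFloat32,
  bytesInt8,
  bytesInt4,
  bytesBinary,
  int8Saving: 1 - bytesInt8 / bytesFloat32, // 0.75
  int4Factor: bytesFloat32 / bytesInt4,     // 8
  binaryFactor: bytesFloat32 / bytesBinary, // 32
});
```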
Through our iterations and experiments, we want to deliver the full potential of vector search while maximizing resource utilization and scalability. Stay tuned for further updates on our progress towards integrating binary quantization into Lucene and Elasticsearch, and the transformative impact it will have on vector database storage and retrieval. Multi-vector integration in Lucene and Elasticsearch Several real world applications rely on text embedding models and large text inputs. Most embedding models have token limits, which necessitate chunking of longer text into passages. Therefore, instead of a single document, multiple passages and embeddings must be managed, potentially complicating metadata preservation. Now instead of having a single piece of metadata indicating, for example the first chapter of the book “Little Women”, you have to index that information data for every sentence. Lucene's \"join\" functionality, integral to Elasticsearch's nested field type, offers a solution. This feature enables multiple nested documents within a top-level document, allowing searches across nested documents and subsequent joins with their parent documents. So, how do we deliver support for vectors in nested fields with Elasticsearch? The key lies in how Lucene joins back to parent documents when searching child vector passages. The parallel concept here is the debate around pre-filtering versus post-filtering in kNN methods, as the timing of joining significantly impacts result quality and quantity. To address this, recent enhancements to Lucene enable pre-joining against parent documents while searching the HNSW graph. Practically, pre-joining ensures that when retrieving the k nearest neighbors of a query vector, the algorithm returns the k nearest documents instead of passages. This approach diversifies results without complicating the HNSW algorithm, requiring only a minimal additional memory overhead per stored vector. Efficiency is improved by leveraging certain restrictions, such as disjoint sets of parent and child documents and the monotonicity of document IDs. These restrictions allow for optimizations using bit sets, providing rapid identification of parent document IDs. Searching through a vast number of documents efficiently required investing in nested fields and joins in Lucene. This work helps storage and search for dense vectors that represent passages within long texts, making document searches in Lucene more effective. Overall, these advancements represent an exciting step forward in the area of vector database retrieval within Lucene. Wrapping up (for now) We're dedicated to making Elasticsearch and Lucene the best vector database with every release. Our goal is to make it easier for people to search for things. With some of the investments we discuss in this blog, there is significant progress, but we're not done! To say that the gen AI ecosystem is rapidly evolving is an understatement. At Elastic, we want to give developers the most flexible and open tools to keep up with all the innovation—with features available across recent releases until 8.13 and serverless Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Elasticsearch and Lucene report card: noteworthy speed and efficiency investments Harnessing Lucene's architecture for multi-threaded search Accelerating multi-graph vector search Harnessing Java's latest advancements for ludicrous speed Maximizing memory efficiency with scalar quantization Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Making Elasticsearch and Lucene the best vector database: up to 8x faster and 32x efficient - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-lucene-vector-database-gains", + "meta_description": "Discover the recent enhancements and optimizations that notably improve vector search performance in Elasticsearch & Lucene vector database." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. Vector Database US By: Ugo Sangiorgi On April 15, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search with binary quantization: Elasticsearch with BBQ is 5x faster than OpenSearch with FAISS . Elastic has received requests from our community to clarify performance differences between Elasticsearch and OpenSearch, particularly in the realm of Semantic Search/Vector Search, so we conducted these performance tests to provide clear, data-driven comparisons. 
Binary quantization showdown Storing high-dimensional vectors in their original form can be memory-intensive. Quantization techniques compress these vectors into a compact representation, drastically reducing the memory footprint. The search then operates in the compressed space, which reduces the computational complexity and makes searches faster, especially in large datasets. Elastic is committed to making Lucene a top-performing Vector Engine. We introduced Better Binary Quantization (BBQ) in Elasticsearch 8.16 on top of Lucene and evolved it further in 8.18 and 9.0. BBQ is built on a new approach in scalar quantization that reduces float32 dimensions to bits, delivering ~95% memory reduction while maintaining high ranking quality. OpenSearch on the other hand uses multiple vector engines: nmslib (now deprecated), Lucene and FAISS. In a previous blog , we compared Elasticsearch and OpenSearch for vector search. We used three different datasets and tested different combinations of engines and configurations on both products. This blog focuses on the binary quantization algorithms currently available in both products. We tested Elasticsearch with BBQ and OpenSearch with FAISS’s Binary Quantization using the openai_vector Rally track. The main objective was to evaluate the performance of both solutions under the same level of recall. What does recall mean? Recall is a metric that measures how many of the relevant results are successfully retrieved by a search system. In this evaluation, recall@k is particularly important, where k represents the number of top results considered. Recall@10 , Recall@50 and Recall@100 therefore measure how many of the true relevant results appear in the top 10, 50 and 100 retrieved items, respectively. Recall is expressed on a scale from 0 to 1 (or 0% to 100% precision). And that is important because we are talking about Approximate KNN (ANN) and not Exact KNN, where recall is always 1 (100%). For each value of k we also specified n, which is the number of candidates considered before applying the final ranking. This means that for Recall@10, Recall@50, and Recall@100, the system first retrieves n candidates using the binary quantization algorithm and then ranks them to determine whether the top k results contain the expected relevant items. By controlling n , we can analyze the trade-off between efficiency and accuracy. A higher n typically increases recall, as more candidates are available for ranking, but it also increases latency and decreases throughput. Conversely, a lower n speeds up retrieval but may reduce recall if too few relevant candidates are included in the initial set. In this comparison, Elasticsearch demonstrated lower latency and higher throughput than OpenSearch on identical setups. Methodology The full configuration, alongside Terraform scripts, Kubernetes manifests and the specific Rally track is available in this repository under openai_vector_bq . As with previous benchmarks, we used a Kubernetes cluster composed of: 1 Node pool for Elasticsearch 9.0 with 3 e2-standard-32 machines (128GB RAM and 32 CPUs) 1 Node pool for OpenSearch 2.19 with 3 e2-standard-32 machines (128GB RAM and 32 CPUs) 1 Node pool for Rally with 2 e2-standard-4 machines (16GB RAM and 4 CPUs) We set up one Elasticsearch cluster version 9.0 and one OpenSearch cluster version 2.19. 
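Recall@k as defined above can be computed in a few lines; the document IDs below are invented and only demonstrate the metric.

```typescript
// Fraction of the true top-k neighbors that appear in the retrieved top-k results.
function recallAtK(retrieved: string[], exactNeighbors: string[], k: number): number {
  const topK = new Set(retrieved.slice(0, k));
  const truth = exactNeighbors.slice(0, k);
  return truth.filter((id) => topK.has(id)).length / truth.length;
}

// Invented IDs: 9 of the 10 exact nearest neighbors were returned by the ANN search.
const exactTop10 = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9', 'd10'];
const annTop10 = ['d1', 'd2', 'd3', 'd4', 'd5', 'd6', 'd7', 'd8', 'd9', 'd42'];

console.log(recallAtK(annTop10, exactTop10, 10)); // 0.9
```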
Both Elasticsearch and OpenSearch were tested with the exact same setup: we used openai_vector Rally track with some modifications - which uses 2.5 million documents from the NQ data set enriched with embeddings generated using OpenAI's text-embedding-ada-002 model . The results report on measured latency and throughput at different recall levels (recall@10, recall@50 and recall@100) using 8 simultaneous clients for performing search operations. We used a single shard and no replicas. We ran the following combinations of k-n-rescore, e.g. 10-2000-2000, or k:10 , n:2000 and rescore:2000 would retrieve the top k (10) over n candidates (2000) applying a rescore over 2000 results (which is equivalent of an “oversample factor” of 1). Each search ran for 10.000 times with 1000 searches as warmup: Recall@10 10-40-40 10-50-50 10-100-100 10-200-200 10-500-500 10-750-750 10-1000-1000 10-1500-1500 10-2000-2000 Recall@50 50-150-150 50-200-200 50-250-250 50-500-500 50-750-750 50-1000-1000 50-1200-1200 50-1500-1500 50-2000-2000 Recall@100 100-200-200 100-250-250 100-300-300 100-500-500 100-750-750 100-1000-1000 100-1200-1200 100-1500-1500 100-2000-2000 To replicate the benchmark, the Kubernetes manifests for both rally-elasticsearch and rally-opensearch have all the relevant variables externalized in a ConfigMap, available here (ES) and here (OS). The search_ops parameter can be customized to test any combination of k, n and rescore. OpenSearch Rally configuration /k8s/rally-openai_vector-os-bq.yml Opensearch index configuration The variables from the ConfigMap are then used on the index configuration, some parameters are left unchanged. 1-bit quantization in OpenSearch is configured by setting the compression level to “32x” . index-vectors-only-mapping-with-docid-mapping.json Elasticsearch Rally configuration /k8s/rally-openai_vector-es-bq.yml Elasticsearch index configuration index-vectors-only-mapping-with-docid-mapping.json Results There are multiple ways to interpret the results. For both latency and throughput, we plotted a simplified and a detailed chart at each level of recall. It’s easy to see differences if we consider “higher is better” for each metric. However, latency is a negative one (lower is actually better), while throughput is a positive one. For the simplified charts, we used (recall / latency) * 10000 (called simply “speed”) and recall * throughput , so both metrics mean more speed and more throughput are better. Let’s get to it. Recall @ 10 - simplified At that level of recall Elasticsearch BBQ is up to 5x faster (3.9x faster on average) and has 3.2x more throughput on average than OpenSearch FAISS. 
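The text above names the index configuration files without their contents; the exact mappings used in the benchmark live in the linked repository. As a hedged sketch of what opting into Better Binary Quantization looks like in recent Elasticsearch versions, a dense_vector field can select a BBQ index type via index_options; the index name and field names here are placeholders and may not match the benchmark files.

```typescript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: 'REDACTED' } });

await client.indices.create({
  index: 'openai-vectors-bbq', // assumed index name
  settings: { number_of_shards: 1, number_of_replicas: 0 }, // single shard, no replicas, as in the test
  mappings: {
    properties: {
      docid: { type: 'keyword' },
      emb: {
        type: 'dense_vector',
        dims: 1536, // text-embedding-ada-002 dimensions
        index: true,
        similarity: 'dot_product',
        index_options: { type: 'bbq_hnsw' },
      },
    },
  },
});
```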
Recall @ 10 - Detailed task latency.mean throughput.mean avg_recall Elasticsearch-9.0-BBQ 10-100-100 11.70 513.58 0.89 Elasticsearch-9.0-BBQ 10-1000-100 27.33 250.55 0.95 Elasticsearch-9.0-BBQ 10-1500-1500 35.93 197.26 0.95 Elasticsearch-9.0-BBQ 10-200-200 13.33 456.16 0.92 Elasticsearch-9.0-BBQ 10-2000-2000 44.27 161.40 0.95 Elasticsearch-9.0-BBQ 10-40-40 10.97 539.94 0.84 Elasticsearch-9.0-BBQ 10-50-50 11.00 535.73 0.85 Elasticsearch-9.0-BBQ 10-500-500 19.52 341.45 0.93 Elasticsearch-9.0-BBQ 10-750-750 22.94 295.19 0.94 OpenSearch-2.19-faiss 10-100-100 35.59 200.61 0.94 OpenSearch-2.19-faiss 10-1000-1000 156.81 58.30 0.96 OpenSearch-2.19-faiss 10-1500-1500 181.79 42.97 0.96 OpenSearch-2.19-faiss 10-200-200 47.91 155.16 0.95 OpenSearch-2.19-faiss 10-2000-2000 232.14 31.84 0.96 OpenSearch-2.19-faiss 10-40-40 27.55 249.25 0.92 OpenSearch-2.19-faiss 10-50-50 28.78 245.14 0.92 OpenSearch-2.19-faiss 10-500-500 79.44 97.06 0.96 OpenSearch-2.19-faiss 10-750-750 104.19 75.49 0.96 Recall @ 50 - simplified At that level of recall Elasticsearch BBQ is up to 5x faster (4.2x faster on average) and has 3.9x more throughput on average than OpenSearch FAISS. Detailed Results - Recall @ 50 Task Latency Mean Throughput Mean Avg Recall Elasticsearch-9.0-BBQ 50-1000-1000 25.71 246.44 0.95 Elasticsearch-9.0-BBQ 50-1200-1200 28.81 227.85 0.95 Elasticsearch-9.0-BBQ 50-150-150 13.43 362.90 0.90 Elasticsearch-9.0-BBQ 50-1500-1500 33.38 202.37 0.95 Elasticsearch-9.0-BBQ 50-200-200 12.99 406.30 0.91 Elasticsearch-9.0-BBQ 50-2000-2000 42.63 163.68 0.95 Elasticsearch-9.0-BBQ 50-250-250 14.41 373.21 0.92 Elasticsearch-9.0-BBQ 50-500-500 17.15 341.04 0.93 Elasticsearch-9.0-BBQ 50-750-750 31.25 248.60 0.94 OpenSearch-2.19-faiss 50-1000-1000 125.35 62.53 0.96 OpenSearch-2.19-faiss 50-1200-1200 143.87 54.75 0.96 OpenSearch-2.19-faiss 50-150-150 43.64 130.01 0.89 OpenSearch-2.19-faiss 50-1500-1500 169.45 46.35 0.96 OpenSearch-2.19-faiss 50-200-200 48.05 156.07 0.91 OpenSearch-2.19-faiss 50-2000-2000 216.73 36.38 0.96 OpenSearch-2.19-faiss 50-250-250 53.52 142.44 0.93 OpenSearch-2.19-faiss 50-500-500 78.98 97.82 0.95 OpenSearch-2.19-faiss 50-750-750 103.20 75.86 0.96 Recall @ 100 At that level of recall Elasticsearch BBQ is up to 5x faster (average 4.6x faster) and has 3.9x more throughput on average than OpenSearch FAISS. Detailed Results - Recall @ 100 task latency.mean throughput.mean avg_recall Elasticsearch-9.0-BBQ 100-1000-1000 27.82 243.22 0.95 Elasticsearch-9.0-BBQ 100-1200-1200 31.14 224.04 0.95 Elasticsearch-9.0-BBQ 100-1500-1500 35.98 193.99 0.95 Elasticsearch-9.0-BBQ 100-200-200 14.18 403.86 0.88 Elasticsearch-9.0-BBQ 100-2000-2000 45.36 159.88 0.95 Elasticsearch-9.0-BBQ 100-250-250 14.77 433.06 0.90 Elasticsearch-9.0-BBQ 100-300-300 14.61 375.54 0.91 Elasticsearch-9.0-BBQ 100-500-500 18.88 340.37 0.93 Elasticsearch-9.0-BBQ 100-750-750 23.59 285.79 0.94 OpenSearch-2.19-faiss 100-1000-1000 142.90 58.48 0.95 OpenSearch-2.19-faiss 100-1200-1200 153.03 51.04 0.95 OpenSearch-2.19-faiss 100-1500-1500 181.79 43.20 0.96 OpenSearch-2.19-faiss 100-200-200 50.94 131.62 0.83 OpenSearch-2.19-faiss 100-2000-2000 232.53 33.67 0.96 OpenSearch-2.19-faiss 100-250-250 57.08 131.23 0.87 OpenSearch-2.19-faiss 100-300-300 62.76 120.10 0.89 OpenSearch-2.19-faiss 100-500-500 84.36 91.54 0.93 OpenSearch-2.19-faiss 100-750-750 111.33 69.95 0.94 Improvements on BBQ BBQ has come a long way since its first release. 
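As an aside on reading the detailed tables above, the "up to 5x" figures can be reproduced by dividing the mean latencies of matching k-n-rescore tasks. The sketch below does that for a few rows copied from the Recall@10 table.

```typescript
// Mean latencies (ms) copied from the Recall@10 table above.
const elasticsearchBbq: Record<string, number> = {
  '10-40-40': 10.97,
  '10-100-100': 11.7,
  '10-500-500': 19.52,
  '10-2000-2000': 44.27,
};
const opensearchFaiss: Record<string, number> = {
  '10-40-40': 27.55,
  '10-100-100': 35.59,
  '10-500-500': 79.44,
  '10-2000-2000': 232.14,
};

for (const task of Object.keys(elasticsearchBbq)) {
  const ratio = opensearchFaiss[task] / elasticsearchBbq[task];
  console.log(`${task}: Elasticsearch is ${ratio.toFixed(1)}x faster`);
}
// 10-40-40: 2.5x, 10-100-100: 3.0x, 10-500-500: 4.1x, 10-2000-2000: 5.2x
```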
On Elasticsearch 8.16, for the sake of comparison, we included a benchmark run from 8.16 alongside the current one, and we can see how recall and latency have improved since then. In Elasticsearch 8.18 and 9.0, we rewrote the core algorithm for quantizing the vectors. So, while BBQ in 8.16 was good, the newest versions are even better. You can read about it here and here . In short, every vector is individually quantized through optimized scalar quantiles. As a result, users benefit from higher accuracy in vector search without compromising performance, making Elasticsearch’s vector retrieval even more powerful. Conclusion In this performance comparison between Elasticsearch BBQ and OpenSearch FAISS, Elasticsearch significantly outperforms OpenSearch for vector search, achieving up to 5x faster query speeds and 3.9x higher throughput on average across various levels of recall. Key findings include: Recall@10 : Elasticsearch BBQ is up to 5x faster (3.9x faster on average) and has 3.2x more throughput on average compared to OpenSearch FAISS. Recall@50 : Elasticsearch BBQ is up to 5x faster (4.2x faster on average) and has 3.9x more throughput on average compared to OpenSearch FAISS. Recall@100 : Elasticsearch BBQ is up to 5x faster (4.6x faster on average) and has 3.9x more throughput on average compared to OpenSearch FAISS. These results highlight the efficiency and performance advantages of Elasticsearch BBQ, particularly in high-dimensional vector search scenarios. The Better Binary Quantization (BBQ) technique, introduced in Elasticsearch 8.16, provides substantial memory reduction (~95%) while maintaining high ranking quality, making it a superior choice for large-scale vector search applications. At Elastic, we are relentlessly innovating to improve Apache Lucene and Elasticsearch to provide the best vector database for search and retrieval use cases, including RAG (Retrieval Augmented Generation). Our recent advancements have dramatically increased performance, making vector search faster and more space efficient than before, building upon the gains from Lucene 10. This blog is another illustration of that innovation. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. 
CH HM By: Chris Hegarty and Hemant Malik Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Jump to Binary quantization showdown Methodology OpenSearch Rally configuration Opensearch index configuration Elasticsearch Rally configuration Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-bbq-vs-opensearch-faiss", + "meta_description": "A performance comparison between Elasticsearch BBQ and OpenSearch FAISS." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Automatically updating your Elasticsearch index using Node.js and an Azure Function App Learn how to update your Elasticsearch index automatically using Node.js and an Azure Function App. Follow these steps to ensure your index stays current. Javascript Python How To JG By: Jessica Garson On June 4, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Maintaining an up-to-date Elasticsearch index is crucial, especially when dealing with frequently changing dynamic datasets. This blog post will guide you through automatically updating your Elasticsearch index using Node.js and an Azure Function App. First, we'll load the data using Node.js and ensure it remains current through regular updates. Then, we'll leverage the capabilities of Azure Function Apps to automate these updates, thereby ensuring your index is always fresh and reliable. For this blog post, we will be using the Near Earth Object Web Service (NeoWs ), a RESTful web service offering detailed information about near-earth asteroids. By integrating NeoWs with Node.js services integrated as Azure serverless functions, this example will provide you with a robust framework to handle the complexities of managing dynamic data effectively. This approach will help you minimize the risks of working with outdated information and maximize the accuracy and usefulness of your data. Prerequisites This example uses Elasticsearch version 8.13; if you are new to Elasticsearch, check out our Quick Start on Elasticsearch . 
Any 8.0 version should work for this blog post. Download the latest NPM and Node.js version . This tutorial uses Node v21.6.1 and npm 10.5.0. An API key for NASA's APIs. An active Azure account with access to create a Function App. Access to the Azure portal or Azure CLI Setting up locally Before you begin indexing and loading your data locally, setting up your environment is essential. First, create a directory and initialize it. Then, download the necessary packages and create a .env file to store your configuration settings. This preliminary setup ensures your local environment is prepared to handle the data efficiently. You will be using the Elasticsearch node client to connect to Elastic, Axios to connect to the NASA APIs and dotenv to parse your secrets. You will want to download the required packages running the following commands: After downloading the required packages, you can create a . env file at the root of the project directory. The . env file allows you to keep your credentials secure locally. Check out the example .env file to learn more. To learn more about connecting to Elasticsearch, be sure to take a look at the documentation on the subject . To create a .env file, you can use this command at the root of your project: In your .env , be sure to have the following entered in. Be sure to add your complete endpoint: You will also want to create a new JavaScript file as well: Creating your index and loading your data in Now that you have set up the proper file structure and downloaded the required packages, you are ready to create a script that creates an index and loads data into the index. If you get stuck along the way be sure to check out the full version of the file you are creating in this section. In the file loading_data_into_a_index.js, configure the dotenv package to use the keys and tokens stored in your . env file. You should also import the Elasticsearch client to connect to Elasticsearch and Axios and make HTTP requests. Since your keys and tokens are currently stored as environment variables, you will want to retrieve them and create a client to authenticate to Elasticsearch. You can develop a function to retrieve data from NASA's NEO (Near Earth Object) Web Service asynchronously. You will first configure the base URL for the NASA API request and create date objects for today and the previous week to establish the query period. After you format these dates in the YYYY-MM-DD format required for the API request, set up the dates as query parameters and execute the GET request to the NASA API. Additionally, the function includes error-handling mechanisms to aid debugging should any issues arise. Now, you can create a function to transform the raw data from the NASA API into a structured format. Since the data you get back is currently nested in a complex JSON response. A more straightforward array of objects makes handling data easier. You will want to create an index to store the data from the API. An index inside Elasticsearch is where you can store your data in documents. In this function, you will check to see if an index exists and create a new one if needed. You will also specify the proper mapping of fields for your index. This function also loads the data into the index as documents and maps the id field from the NASA data to the _id field in Elasticsearch. You will want to create a main function to fetch, structure, and index the data. 
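The post's original code listings did not survive into the text above, so here is a rough TypeScript sketch of the main function it describes, chaining the fetch, transform, and index steps. The helper names, document shape, and index name are assumptions standing in for the functions described earlier, and the post itself uses plain JavaScript.

```typescript
import { Client } from '@elastic/elasticsearch';
import 'dotenv/config';

const client = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT!,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY! },
});

interface NeoDoc { id: string; name: string; close_approach_date: string; }

// Stand-in for the NeoWs fetch and transform helpers described above (assumed shape).
async function fetchAndStructureData(): Promise<NeoDoc[]> {
  return []; // the real version calls the NeoWs feed via Axios and flattens the response
}

// Bulk-load documents, mapping the NASA id onto the Elasticsearch _id as described.
async function indexData(docs: NeoDoc[]): Promise<void> {
  await client.helpers.bulk({
    datasource: docs,
    onDocument: (doc) => ({ index: { _index: 'near-earth-objects', _id: doc.id } }),
  });
}

async function run(): Promise<void> {
  try {
    const docs = await fetchAndStructureData();
    if (docs.length === 0) {
      console.log('No data to index.');
      return;
    }
    console.log(`Indexing ${docs.length} records...`);
    await indexData(docs);
    console.log('Data indexed successfully.');
  } catch (error) {
    console.error('Failed to get data from the NASA API or index it', error);
  }
}

run();
```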
This function will also print out the number of records being uploaded and log whether the data is indexed, whether there is no data to index, or whether it failed to get data back from the NASA API. After creating the run function, you will want to call the function and catch any errors that may come up. You can now run the file from your command line by running the following: To confirm that your index has been successfully loaded, you can check in the Elastic Dev Tools by executing the following API call: Keeping your index updated with an Azure Function App Now that you've successfully loaded your data into your index locally, this data can quickly become outdated. To ensure your information remains current, you can set up an Azure Function App to automatically fetch new data daily and upload it to your Elasticsearch index. The first step is to configure your Function app in Azure Portal. A helpful resource for getting started is the Azure quick start guide . After you've set up your function, you can ensure that you have environment variables set up for ELASTICSEARCH_ENDPOINT , ELASTICSEARCH_API_KEY , and NASA_API_KEY . In Function Apps, environment variables are called Application settings. Inside your function app, click on the \"Configuration\" option in the left panel under \"Settings.\" Under\" the \"Application settings\" tab, click on \"+ New application setting.\" You will want to make sure the required libraries are installed as well. If you go to your terminal on the Azure Portal, you can install the necessary packages by entering the following: The packages you are installing should look very similar to the previous install, except you will be using the moment to parse dates, and you no longer need to load an env file since you just set your secrets to be Application settings. You can click where it says create to create a new function inside your Function App select the template entitled “Timer trigger”. You will now have a file called function.json set for you. You will want to adjust it to look as follows to run this application every day at 10 am. You'll also want to upload your package.json file and ensure it appears as follows: The next step is to create a index.js file. This script is designed to automatically update the data daily. It accomplishes this by systematically fetching and parsing new data each day and then seamlessly updating the dataset accordingly. Elasticsearch can use the same method to ingest time series or immutable data, such as webhook responses. This method ensures the information remains current and accurate, reflecting the latest available data.You can can check out the full code as well. The main differences between the script you run locally and this one are as follows: You will no longer need to load a .env file, since you have already set your environment variables There is also different logging designed more towards creating a more sustainable script You keep your index updated based on the most recent close approach date There is an entry point for an Azure Function App You will first want to set up your libraries and authenticate to Elasticsearch as follows: Afterward, you will want to obtain the last date update date from Elasticsearch and configure a backup method to get data from the past day if anything goes wrong. The following function connects to NASA's NEO (Near Earth Object) Web Service to get the data to keep your index updated. There is also some additional error handling that can capture any API errors that might come up. 
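As a sketch of the NeoWs request just described, fetching a date window from the feed endpoint with basic error handling (the environment variable name matches the application setting the post configures; the rest is illustrative TypeScript rather than the post's JavaScript):

```typescript
import axios from 'axios';

// Fetch near-earth objects between two dates from NASA's NeoWs feed endpoint.
async function fetchNeoFeed(startDate: string, endDate: string): Promise<unknown> {
  try {
    const response = await axios.get('https://api.nasa.gov/neo/rest/v1/feed', {
      params: {
        start_date: startDate, // YYYY-MM-DD
        end_date: endDate,     // YYYY-MM-DD
        api_key: process.env.NASA_API_KEY,
      },
    });
    return response.data;
  } catch (error) {
    // Surface API errors (rate limits, bad dates, and so on) instead of failing silently.
    console.error('NeoWs request failed', error);
    throw error;
  }
}

// Example: fall back to the past day if no last-update date is available.
const today = new Date().toISOString().slice(0, 10);
const yesterday = new Date(Date.now() - 24 * 60 * 60 * 1000).toISOString().slice(0, 10);
fetchNeoFeed(yesterday, today).then((data) => console.log('fetched feed', data !== undefined));
```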
Now, you will want to create a function to organize your data by iterating over the objects of each date. Now, you will want to load your data into Elasticsearch using the bulk indexing operation. This function should look similar to the one in the previous section. Finally, you will want to create an entry point for the function that will run according to the timer you set. This function is similar to a main function, as it calls the functions created previously in the file. There is also some additional logging, such as printing the number of records and informing you if the data was indexed correctly. Conclusion Using Node.js and Azure's Function App , you should be able to ensure that your Elasticsearch index is updated regularly. By utilizing Node.js's capabilities in conjunction with Azure's Function App, you can efficiently maintain your index's regular updates. This powerful combination offers a streamlined, automated process, reducing the manual effort involved in keeping your index regularly updated. Full code for this example can be found on Search Labs GitHub . Let us know if you built anything based on this blog or if you have questions on our forums and the community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Prerequisites Setting up locally Creating your index and loading your data in Keeping your index updated with an Azure Function App Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Automatically updating your Elasticsearch index using Node.js and an Azure Function App - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-index-node-js-automatic-updates", + "meta_description": "Learn how to update your Elasticsearch index automatically using Node.js and an Azure Function App. Follow these steps to ensure your index stays current." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog GenAI for Customer Support — Part 2: Building a Knowledge Library This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! Inside Elastic CM By: Cory Mangini On July 22, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out part one , part three , part four , the launch blog , and part five . Retrieval-Augmented Generation (RAG) Over a Fine Tuned Model As an engineering team, we knew that Elastic customers would need to trust a generative AI-based Support Assistant to provide accurate and relevant answers. Our initial proof of concept showed that large language model (LLM) foundational training was insufficient on technologies as technically deep and broad as Elastic. We explored fine-tuning our own model for the Support Assistant and instead landed on an RAG-based approach for several reasons. Easier with unstructured data: Fine-tuning required question-answer pairing that did not match our data set and would be challenging to do at the scale of our data. Real-time updates: Immediately incorporates new information by accessing up-to-date documents, ensuring current and relevant responses. Role-Based Access Control: A single user experience across roles restricts specific documents or sources based on the allowed level of access. Less maintenance: Search on the Support Hub and the Support Assistant share much of the same underlying infrastructure. We improved search results and a chatbot from the same work effort. Understanding the Support Assistant as a Search Problem We then formed a hypothesis that drove the technical development and testing of the Support Assistant. Providing more concise and relevant search results as context for the LLM will lead to stronger positive user sentiment by minimizing the chance of the model using unrelated information to answer the user's question. In order to test our team hypothesis, we had to reframe our understanding of chatbots in the context of search. Think of a support chatbot as a librarian. The librarian has access to an extensive pool of books (via search) and innately knows (the LLM) a bit about a broad range of topics. When asked a question, the librarian might be able to answer from their own knowledge but may need to find the appropriate book(s) to address questions about deep domain knowledge. Search extends the ”librarian” ability to find passages within the book in ways that have never existed before. The Dewey Decimal Classification enabled a searchable index of books. 
Personal computers evolved into a better catalog with some limited text search. RAG via Elasticsearch + Lucene enables the ability to not only find key passages within books across an entire library, but also to synthesize an answer for the user, often in less than a minute. The system is infinitely scalable as by adding more books to the library, the chances are stronger of having the required knowledge to answer a given question. The phrasing of the user input, prompts and settings like temperature (degree of randomness) still matter but we found that we can use search as a way to understand user intent and better augment the context passed on to the large language model for higher user satisfaction. Elastic Support’s Knowledge Library The body of knowledge that we draw from for both search and the Elastic Support Assistant depends on three key activities: the technical support articles our Support Engineers author, the product documentation and blogs that we ingest, and the enrichment service, which increases the search relevancy for each document in our hybrid search approach. It is also important to note that the answers to user questions often come from specific passages across multiple documents. This is a significant driver for why we chose to offer the Support Assistant. The effort for a user to find an answer in a specific paragraph across multiple documents is substantial. By extracting this information and sending it to the large language model, we both save the user time and return an answer in natural language that is easy to understand Technical Support Articles Elastic Support follows a knowledge-centered service approach where Support Engineers document their solutions and insights to cases so that our knowledge base (both internal and external) grows daily. This is run entirely on Elasticsearch on the back end and the EUI Markdown Editor control on the front end and is one of the key information sources for the Elastic Support Assistant. The majority of our work for the Support Assistant was to enable semantic search so we could take advantage of ELSER . Our prior architecture had two separate storage methods for knowledge articles. One was a Swiftype instance used for customer facing articles and the other was through Elastic Appsearch for internal Support team articles. This was tech debt on the part of our engineering team as Elasticsearch had already brought parity to several of the features we needed. Our new architecture takes advantage of document level security to enable role-based access from a single index source. Depending on the Elastic Support Assistant user, we could retrieve documents appropriate for them to use as part of the context sent to OpenAI. This can scale in the future to new roles as required. At times, we also find a need to annotate external articles with information for the Support team. To accommodate this, we developed an EUI plugin called private context. This finds multiline private context tags within the article that begin a block of text, parses them using regex to find private context blocks and then processes them as special things called AST nodes, of type privateContext . The result of these changes resulted in an index that we could use with ELSER for semantic search and nuanced access to information based on the role of the user. We performed multiple index migrations, resulting in a single document structure for both use cases. Each document contains four broad categories of fields. 
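As a hedged sketch of the document level security idea mentioned above (the role name, index name, and visibility field are invented for illustration and are not the team's actual configuration), read access can be filtered per role by attaching a query to the index privilege:

```javascript
// Sketch: document level security via a role-level query filter.
const { Client } = require("@elastic/elasticsearch");
const client = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});

async function createCustomerRole() {
  await client.security.putRole({
    name: "support_assistant_customer",
    indices: [
      {
        names: ["knowledge-articles"],
        privileges: ["read"],
        // Only documents matching this query are visible to holders of the role
        query: { term: { visibility: "external" } },
      },
    ],
  });
}

createCustomerRole().catch(console.error);
```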
By storing the metadata and article content in the same JSON document, we can efficiently leverage different fields as needed. For our hybrid search approach in the Support Assistant, we use the title and summary fields for semantic search with BM25 on the much larger content field. This enables the Support Assistant to have both speed and high relevance to the text we will pass as context to the OpenAI GPT. Ingesting Product Documentation, Blogs & Search Labs Content Even though our technical support knowledge base has over 2,800 articles, we knew that there would be questions that these would not answer for users of the Elastic Support Assistant. For example: What new features would be available if I upgraded from Elastic Cloud 8.11 to 8.14? wouldn’t be present in technical support articles since it’s not a break-fix question or in the OpenAI model since 8.14 is past the model training date cutoff. We elected to address this by including more official Elastic sources, such as product documentation across all versions, Elastic blogs, Search/Security/Observability Labs and Elastic onboarding guides as the source for our semantic search implementation, similar to this example . By using semantic search to retrieve these docs when they were relevant, we enabled the Support Assistant to answer a much broader range of questions. The ingest process includes several hundred thousand documents and deals with complex site maps across Elastic properties. We elected to use a scraping and automation library called Crawlee in order to handle the scale and frequency needed to keep our knowledge library up to date. Each of the four crawler jobs executes on Google Cloud Run . We chose this because jobs can have a timeout of 24 hours and they can be scheduled without the use of Cloud Tasks or PubSub. Our needs resulted in a total of four jobs running in parallel, each with a base url that would capture a specific category of documents. When crawling websites we recommend starting with base urls that do not have overlapping content so as to avoid the ingestion of duplicates. This must be balanced with crawling at too high of a level and ingesting documents that aren't helpful to your knowledge store. For example, we crawl https://elastic.com/blog and https://www.elastic.co/search-labs/blog rather than elastic.co/ since our objective was technical documents. Even with the correct base url, we needed to account for different versions of the Elastic product docs (we have 114 unique versions across major/minors in our knowledge library). First, we built the table of contents for a product page in order to load and cache the different versions of the product. Our tech stack is a combination of Typescript with Node.js and Elastic's EUI for the front end components. We then load the table of contents for a product page and cache the versions of the product. If the product versions are already cached, then the function will do nothing. If the product versions are not cached, then the function will also enqueue all of the versions of the product page so that it can crawl all versions of the docs for the product. Request Handlers Since the structure of the documents we crawl can vary widely, we created a request handler for each document type. The request handler tells the crawler which CSS to parse as the body of the document. This creates consistency in the documents we store in Elasticsearch and captures the text that would be relevant. 
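A minimal Crawlee-based request handler might look like the sketch below. The CSS selector, index name, and start URL are assumptions, and the real jobs described above run as four parallel Cloud Run jobs with a handler per document type.

```javascript
// Sketch of a crawler job using Crawlee's CheerioCrawler
const { CheerioCrawler } = require("crawlee");
const { Client } = require("@elastic/elasticsearch");

const client = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});

const crawler = new CheerioCrawler({
  async requestHandler({ request, $, enqueueLinks }) {
    // Keep only the article body, so navigation and filler text never reach the index
    const content = $("div.blog-post-body").text().trim();
    const title = $("h1").first().text().trim();
    if (content) {
      await client.index({
        index: "knowledge-library",
        document: { url: request.url, title, content },
      });
    }
    // Follow links under the same base URL to avoid unrelated pages
    await enqueueLinks({ strategy: "same-domain" });
  },
});

crawler.run(["https://www.elastic.co/blog"]).catch(console.error);
```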
This is especially important for our RAG methodology as any filler text would also be searchable and could be returned incorrectly as a result for the context we send to the LLM. Blogs Request Handler This example is our most straightforward request handler. We specify that the crawler should look for a div element that matches the provided parameter. Any text within that div will be ingested as the content for the resulting Elasticsearch document. Product Documentation Request Handler In this product docs example, multiple css selectors contain text we want to ingest, giving each selector a list of possibilities. Text in one or more of these matching parameters will be included in the resulting document. The crawler also allows us to configure and send an authorization header, which prevents it from being denied access to scrape pages of all Elastic product documentation versions. Since we needed to anticipate that users of the Support Assistant might ask about any version of Elastic, it was crucial to capture enough documentation to account for nuances in each release. The product docs do have some duplication of content as a given page may not change across multiple product versions. We handled this by fine-tuning our search queries to default to the most current product docs unless otherwise specified by the user. The fourth blog will cover this in detail. Enriching Document Sources Our entire knowledge library at Elastic consists of over 300,000 documents. The documents varied widely in the type of metadata they had, if any at all. This created a need for us to enrich these documents so search would accommodate a larger range of user questions against them. At this scale, the team needed the process of enriching documents to be automated, simple and able to both backfill existing documents and to run on demand as new documents are created. We chose to use Elastic as a vector database and enable ELSER to power our semantic search – and generative ai to fill in the metadata gaps. ELSER Elastic ELSER (Elastic Learned Sparse Embedding Retrieval) enriches Elastic documents by transforming them into enriched embeddings that enhance search relevance and accuracy. This advanced embedding mechanism leverages machine learning to understand the contextual relationships within the data, going beyond traditional keyword-based search methods. This transformation allows for faster retrieval of pertinent information, even from large and complex datasets such as ours. What made ELSER a clear choice for our team was the ease of setup. We downloaded and deployed the model, created an ingest pipeline and reindexed our data. The result were enriched documents. How to install and run the support diagnostics troubleshooting utility is a popular technical support article. ELSER computed the vector database embeddings for both the title and summary since we use those with semantic search as part of our hybrid search approach. The result was stored in an Elastic doc as the ml field. Vector Embeddings for How to Install and Run… The embeddings in the ml field are stored as a keyword and vector pair. When a search query is issued, it is also converted into an embedding. Documents that have embeddings close to the query embedding are considered relevant and are retrieved and ranked accordingly. The example below is what the ELSER embeddings look like for the title field How to install and run the support diagnostics troubleshooting utility . 
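The embedding example referenced above was not captured by the page text, so here is an illustrative sketch of the general shape of ELSER output: a map of learned tokens to weights stored on the document. The token names and weights below are invented, and the exact target field depends on how the inference pipeline is configured.

```json
{
  "ml": {
    "tokens": {
      "diagnostic": 1.84,
      "support": 1.45,
      "troubleshoot": 1.32,
      "install": 1.08,
      "utility": 0.97,
      "logs": 0.51
    }
  }
}
```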
Although, only the title is shown below, the field will also have all the vector embeddings for the summary. Summaries & Questions Semantic search could only be as effective as the quality of the document summaries. Our technical support articles have a summary written by support engineers but other docs that we ingested did not. Given the scale of our ingested knowledge, we needed an automated process to generate these. The simplest approach was to take the first 280 characters of each document and use that as the summary. We tested this and found that it led to poor search relevancy. One of our team’s engineers had the idea to instead use AI to do this. We created a new service which leveraged OpenAI GPT3.5 Turbo to backfill all of our documents which lacked a summary upon ingestion. In the future, we intend to test the output of other models to find what improvements we might see in the final summaries. As we have a private instance for GPT3.5 Turbo, we chose to use it in order to keep costs lows at the scale required. The service itself is straightforward and a result of finding and fine tuning an effective prompt. The prompt provides the large language model with a set of overall directions and then a specific subset of directions for each task. While more complex, this enabled us to create a Cloud Run job that loops through each doc in our knowledge library. The loop does the following tasks before moving onto the next document. Sends an API call to the LLM with the prompt and the text from the document's content field. Waits for a completed response (or handles any errors gracefully). Updates the summary and questions fields in the document. Runs the next document. Cloud Run allows us to control the number of concurrent workers so that we don't use all of the allocated threads to our LLM instance. Doing so would result in a timeout for any users of the Support Assistant, so we elected to backfill the existing knowledge library over a period of weeks -- starting with the most current product docs. Create the Overall Summary This section of the prompt outputs a summary that is as concise as possible while still maintaining accuracy. We achieve this through asking the LLM to take multiple passes at the text it generates and check for accuracy against the source document. Specific guidelines are indicated so that each document's outputs will be consistent. Try this prompt for yourself with an article to see the type of results it can generate. Then change one or more guidelines and run the prompt in a new chat to observe the difference in output. Create the Second Summary Type We create a second summary which enables us to search for specific passages of the overall text that will represent the article. In this use case, we try to maintain a closer output to the key sentences already within the document. Create a Set of Relevant Questions In addition to the summaries, we asked the GPT to generate a set of questions that would be relevant to the document. This will be used in several ways, including semantic-search based suggestions for the user. We are also testing the relevancy of including the question set in our hybrid search approach for the Elastic Support Assistant so that we search the title, summary, body content and question set. Support Assistant Demo Despite a large number of tasks and queries that run in the back end, we elected to keep the chat interface itself simple to use. 
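Returning to the backfill job described above, a hedged sketch of that loop is shown below. The model name, prompt, index, and field names are assumptions rather than the team's actual code, and error handling is reduced to the bare minimum.

```javascript
// Sketch: generate a summary for documents that lack one, then write it back.
const { Client } = require("@elastic/elasticsearch");
const OpenAI = require("openai");

const es = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function backfillSummaries() {
  // Grab a batch of documents that still have no summary
  const result = await es.search({
    index: "knowledge-library",
    size: 50,
    query: { bool: { must_not: { exists: { field: "summary" } } } },
  });

  for (const hit of result.hits.hits) {
    const completion = await openai.chat.completions.create({
      model: "gpt-3.5-turbo",
      messages: [
        { role: "system", content: "Summarize the document and propose relevant questions." },
        { role: "user", content: hit._source.content },
      ],
    });
    const summary = completion.choices[0].message.content;

    // Write the generated summary back onto the document, then move on
    await es.update({ index: "knowledge-library", id: hit._id, doc: { summary } });
  }
}

backfillSummaries().catch(console.error);
```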
A successful Support Assistant will work without friction and provide expertise that the user can trust. Our alpha build is shown below. Key Learnings The process of building our current knowledge library has not been linear. As a team we test new hypotheses daily and observe the behaviors of our Support Assistant users to understand their needs. We push code often to production and measure the impact so that we have small failures rather than feature and project level ones. Smaller, more precise context makes the LLM responses significantly more deterministic. We initially passed larger text passages as context to the user question. This decreased the accuracy of the results as the large language model would often pass over key sentences in favor of ones that didn’t answer the question. This transitioned search to become a problem of both finding the right documents and how these aligned with user questions. An RBAC strategy is essential for managing what data a given persona can access. Document level security reduced our infrastructure duplication, drove down deployment costs and simplified the queries we needed to run. As a team, we realized early on that our tech debt would prevent us from achieving a lovable experience for the Support Assistant. We collaborated closely with our product engineers and came up with a blueprint for using the latest Elastic features. We will write an in-depth blog about our transition from Swiftype and Appsearch to elaborate on this learning. Stay tuned! One search query does not cover the potential range of user questions. More on this in Part 4 (search and relevancy tuning for a RAG chatbot). We measured the user sentiment of responses and learned to interpret user intent much more effectively. In effect, what is the search question behind the user question? Understanding what our users search for plays a key role in how we enrich our data. Even at the scale of hundreds of thousands of documents, we still find gaps in our documented knowledge. By analyzing our user trends we are able to determine when to add new types of sources and better enrich our existing data to allow us to package together a context from multiple sources that the LLM can use for further elaboration. What's next? At the time of writing, we have vector embeddings for the more than 300,000 documents in our indices and over 128,000 ai-generated summaries with an average of 8 questions per document. Given that we only have ~8,000 technical support articles with human-written summaries, this was a 10x improvement for our semantic search results. Field Engineering has a roadmap of new ways to expand our knowledge library and stretch what's technically possible with our explicit search interface and the Elastic Support Assistant. For example, we plan to create an ingest and search strategy for technical diagrams and ingest Github issues for Elastic employees. Creating the knowledge sources was just one step of our journey with the Elastic Support Assistant. Read about our initial GenAI experiments in the first blog here . In the third blog , we dive into the design and implementation of the user experience. Following that, our fourth blog discusses our strategies for tuning search relevancy to provide the best context to LLMs. Stay tuned for more insights and inspiration for your own generative AI projects! 
Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Jump to Retrieval-Augmented Generation (RAG) Over a Fine Tuned Model Understanding the Support Assistant as a Search Problem Elastic Support’s Knowledge Library Technical Support Articles Ingesting Product Documentation, Blogs & Search Labs Content Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "GenAI for Customer Support — Part 2: Building a Knowledge Library - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/genai-customer-support-building-a-knowledge-library", + "meta_description": "This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time!" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog A quick introduction to vector search This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. Vector Database VC By: Valentin Crettaz On February 6, 2025 Part of Series Vector search introduction and implementation Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. 
This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. This first part focuses on providing a general introduction to the basics of embedding vectors and how vector search works under the hood. Armed with all the knowledge learned in the first article, the second part will guide you through the meanders of how to set up vector search in Elasticsearch. In the third part , we’ll leverage what we’ve learned in the first two parts, build upon that knowledge, and delve into how to craft powerful hybrid search queries in Elasticsearch. Before we dive into the real matter of this article, let’s go back in time and review some of the history of vectors, which is a keystone concept in semantic search. Vectors are not new Pretty sure everyone would agree that since the advent of ChatGPT in November 2022, not a single day goes by without hearing or reading about “vector search.” It’s everywhere and so prevalent that we often get the impression this is a new cutting-edge technology that just came out, yet the truth is that this technology has been around for more than six decades! Research on the subject began in the mid-1960s, and the first research papers were published in 1978 by Gerard Salton, an information retrieval pundit, and his colleagues at Cornell University. Salton’s work on dense and sparse vector models constitutes the root of modern vector search technology. In the last 20 years, many different vector DBMS based on his research have been created and brought to market. These include Elasticsearch powered by the Apache Lucene project, which started working on vector search in 2019. Vectors are now everywhere and so pervasive that it is important to first get a good grasp of their underlying theory and inner workings before playing with them. Before we dive into that, let’s quickly review the differences between lexical search and vector search so we can better understand how they differ and how they can complement each other. Vector search vs. lexical search An easy way to introduce vector search is by comparing it to the more conventional lexical search that you’re probably used to. Vector search, also commonly known as semantic search, and lexical search work very differently. Lexical search is the kind of search that we’ve all been using for years in Elasticsearch. To summarize it very briefly, it doesn’t try to understand the real meaning of what is indexed and queried, instead, it makes a big effort to lexically match the literals of the words or variants of them (think stemming, synonyms, etc.) that the user types in a query with all the literals that have been previously indexed into the database using similarity algorithms, such as TF-IDF Figure 1: A simple example of a lexical search As we can see, the three documents to the top left are tokenized and analyzed. Then, the resulting terms are indexed in an inverted index, which simply maps the analyzed terms to the document IDs containing them. Note that all of the terms are only present once and none are shared by any document. Searching for “nice german teacher” will match all three documents with varying scores, even though none of them really catches the true meaning of the query. As can be seen in Figure 2, below, it gets even trickier when dealing with polysemy or homographs, i.e., words that are spelled the same but have different meanings (right, palm, bat, mean, etc.) 
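A toy sketch of that indexing-and-matching behavior is shown below; the three sample documents are invented stand-ins for the ones in Figure 1, and the scoring is a crude term count rather than TF-IDF or BM25. It illustrates how "nice german teacher" matches all three documents on isolated terms without any notion of meaning.

```javascript
// Toy sketch of lexical indexing: documents become bags of terms in an inverted index.
const docs = {
  1: "She is a nice person",
  2: "The teacher speaks German",
  3: "This restaurant is nice",
};

const invertedIndex = {};
for (const [id, text] of Object.entries(docs)) {
  for (const term of text.toLowerCase().split(/\W+/)) {
    (invertedIndex[term] ??= new Set()).add(id);
  }
}

// A query is matched term by term, with no notion of what the sentence means
function lexicalSearch(query) {
  const hits = {};
  for (const term of query.toLowerCase().split(/\W+/)) {
    for (const id of invertedIndex[term] ?? []) {
      hits[id] = (hits[id] ?? 0) + 1; // crude score: count of matching terms
    }
  }
  return Object.entries(hits).sort((a, b) => b[1] - a[1]);
}

console.log(lexicalSearch("nice german teacher"));
// All three documents match on isolated terms, with varying scores.
```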
Let’s take the word “right” which can mean three different things, and see what happens. Figure 2: Searching for homographs Searching for “I’m not right” returns a document that has the exact opposite meaning as the first returned result. If you search for the exact same terms but order them differently to produce a different meaning, e.g., “turn right” and “right turn,” it yields the exact same result (i.e., the third document “Take a right turn”). Granted, our queries are overly simplified and don’t make use of the more advanced queries such as phrase matching, but this helps illustrate that lexical search doesn’t understand the true meaning behind what’s indexed and what’s searched. If that’s not clear, don’t fret about it, we’ll revisit this example in the third article to see how vector search can help in this case. To do some justice to lexical search, when you have control over how you index your structured data (think mappings, text analysis, ingest pipelines, etc.) and how you craft your queries (think cleverly crafted DSL queries, query term analysis, etc.), you can do wonders with lexical search engines, there’s no question about it! The track records of Elasticsearch regarding its lexical search capabilities are just amazing. What it has achieved and how much it has popularized and improved the field of lexical search over the past few years is truly remarkable. However, when you are tasked to provide support for querying unstructured data (think images, videos, audios, raw text, etc.) to users who need to ask free-text questions, lexical search falls short. Moreover, sometimes the query is not even text, it could be an image, as we’ll see shortly. The main reason why lexical search is inadequate in such situations is that unstructured data can neither be indexed nor queried the same way as structured data. When dealing with unstructured data, semantics comes into play. What does semantics mean? Very simply, the meaning! Let’s take the simple example of an image search engine (e.g., Google Image Search or Lens). You drag and drop an image, and the Google semantic search engine will find and return the most similar images to the one you queried. In Figure 3, below, we can see on the left side the picture of a German shepherd and to the right all the similar pictures that have been retrieved, with the first result being the same picture as the provided one (i.e., the most similar one). Figure 3: Searching for a picture. Source: Google Image Search, https://www.google.com/imghp Even if this sounds simple and logical for us humans, for computers it’s a whole different story. That’s what vector search enables and helps to achieve. The power unlocked by vector search is massive, as the world has recently witnessed. Let’s now lift the hood and discover what hides underneath. Embedding vectors As we’ve seen earlier, with lexical search engines, structured data such as text can easily be tokenized into terms that can be matched at search time, regardless of the true meaning of the terms. Unstructured data, however, can take different forms, such as big binary objects (images, videos, audios, etc.), and are not at all suited for the same tokenizing process. Moreover, the whole purpose of semantic search is to index data in such a way that it can be searched based on the meaning it represents. How do we achieve that? The answer lies in two words: Machine Learning ! Or more precisely Deep Learning! 
Deep Learning is a specific area of machine learning that relies on models based on artificial neural networks made of multiple layers of processing that can progressively extract the true meaning of the data. The way those neural network models work is heavily inspired by the human brain. Figure 4, below, shows what a neural network looks like, with its input and output layers as well as multiple hidden layers: Figure 4: Neural network layers. Source: IBM, https://www.ibm.com/topics/neural-networks The true feat of neural networks is that they are capable of turning a single piece of unstructured data into a sequence of floating point values, which are known as embedding vectors or simply embeddings . As human beings, we can pretty well understand what vectors are as long as we visualize them in a two or three-dimensional space. Each component of the vector represents a coordinate in a 2D x-y plane or a 3D x-y-z space. However, the embedding vectors on which neural network models work can have several hundreds or even thousands of dimensions and simply represent a point in a multi-dimensional space. Each vector dimension represents a feature , or a characteristic, of the unstructured data. Let’s illustrate this with a deep learning model that turns images into embedding vectors of 2048 dimensions. That model would turn the German shepherd picture we used in Figure 3 into the embedding vector shown in the table below. Note that we only show the first and last three elements, but there would be 2,042 more columns/dimensions in the table. is_red is_dog blue_sky … no_gras german_shepherd is_tree German shepherd embeddings 0.0121 0.9572 0.8735 … 0.1198 0.9712 0.0512 Each column is a dimension of the model and represents a feature, or characteristic, that the underlying neural network seeks to modelize. Each input given to the model will be characterized depending on how similar that input is to each of the 2048 dimensions. Hence, the value of each element in the embedding vector denotes the similarity of that input to a specific dimension. In this example, we can see that the model detected a high similarity between dogs and German shepherds and also the presence of some blue sky. In contrast to lexical search, where a term can either be matched or not, with vector search we can get a much better sense of how similar a piece of unstructured data is to each of the dimensions supported by the model. As such, embedding vectors serve as a fantastic semantic representation of unstructured data. The secret sauce Now that we know how unstructured data is sliced and diced by deep learning neural networks into embedding vectors that capture the similarity of the data along a high number of dimensions, we need to understand how the matching of those vectors works. It turns out that the answer is pretty simple. Embedding vectors that are close to one another represent semantically similar pieces of data. So, when we query a vector database, the search input (image, text, etc.) is first turned into an embedding vector using the same model that has been used for indexing all the unstructured data, and the ultimate goal is to find the nearest neighboring vectors to that query vector. Hence, all we need to do is figure out how to measure the “distance” or “similarity” between the query vector and all the existing vectors indexed in the database, that’s pretty much it. Distance and similarity Luckily for us, measuring the distance between two vectors is an easy problem to solve thanks to vector arithmetics. 
So, let’s look at the most popular distance and similarity functions that are supported by modern vector search databases, such as Elasticsearch. Warning, math ahead! L1 distance The L1 distance, also called the Manhattan distance, of two vectors x and y is measured by summing up the pairwise absolute difference of all their elements. Obviously, the smaller the distance d, the closer the two vectors are. The formula is pretty simple, as can be seen below: $d_{L1}(x, y) = \sum_{i=1}^{n} |x_i - y_i|$ Visually, the L1 distance can be illustrated as shown in Figure 5, below: Figure 5: Visualizing the L1 distance between two vectors Let’s take two vectors x and y, such as x = (1, 2) and y = (4, 3), then the L1 distance of both vectors would be |1 - 4| + |2 - 3| = 4. L2 distance The L2 distance, also called the Euclidean distance, of two vectors x and y is measured by first summing up the square of the pairwise difference of all their elements and then taking the square root of the result. It’s basically the shortest path between two points (also called the hypotenuse). Similarly to L1, the smaller the distance d, the closer the two vectors are: $d_{L2}(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$ The L2 distance is shown in Figure 6 below: Figure 6: Visualizing the L2 distance between two vectors Let’s reuse the same two sample vectors x and y as we used for the L1 distance, and we can now compute the L2 distance as $(1 - 4)^2 + (2 - 3)^2 = 10$. Taking the square root of 10 yields 3.16. Linf distance The Linf (for L infinity) distance, also called the Chebyshev or chessboard distance, of two vectors x and y is simply defined as the longest distance between any two of their elements, or the longest distance measured along one of the axes/dimensions. The formula is very simple and shown below: $d_{L\infty}(x, y) = \max_{i} |x_i - y_i|$ A representation of the Linf distance is shown in Figure 7 below: Figure 7: Visualizing the Linf distance between two vectors Again, taking the same two sample vectors x and y, we can compute the L infinity distance as max(|1 - 4|, |2 - 3|) = max(3, 1) = 3. Cosine similarity In contrast to L1, L2, and Linf, cosine similarity does not measure the distance between two vectors x and y, but rather their relative angle, i.e., whether they are both pointing in roughly the same direction. The higher the similarity s, the “closer” the two vectors are. The formula is again very simple and shown below: $s_{cos}(x, y) = \frac{x \cdot y}{|x| \, |y|} = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2} \, \sqrt{\sum_{i=1}^{n} y_i^2}}$ A way to represent the cosine similarity between two vectors is shown in Figure 8 below: Figure 8: Visualizing the cosine similarity between two vectors Furthermore, as cosine values are always in the [-1, 1] interval, -1 means opposite similarity (i.e., a 180° angle between both vectors), 0 means unrelated similarity (i.e., a 90° angle), and 1 means identical (i.e., a 0° angle), as shown in Figure 9 below: Figure 9: The cosine similarity spectrum Once again, let’s reuse the same sample vectors x and y and compute the cosine similarity using the above formula. First, we compute the dot product of both vectors as $(1 \cdot 4) + (2 \cdot 3) = 10$. Then, we multiply the lengths (also called magnitudes) of both vectors: $(1^2 + 2^2)^{1/2} \cdot (4^2 + 3^2)^{1/2} = 11.18034$. Finally, we divide the dot product by the product of the lengths, 10 / 11.18034 = 0.894427 (i.e., a 26° angle), which is quite close to 1, so both vectors can be considered pretty similar.
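These measures are simple to implement directly; the snippet below applies each of them to the sample vectors x = (1, 2) and y = (4, 3) used throughout this section and reproduces the values computed above.

```javascript
// The four measures from this section, applied to x = (1, 2) and y = (4, 3).
const x = [1, 2];
const y = [4, 3];

const l1 = (a, b) => a.reduce((s, ai, i) => s + Math.abs(ai - b[i]), 0);
const l2 = (a, b) => Math.sqrt(a.reduce((s, ai, i) => s + (ai - b[i]) ** 2, 0));
const linf = (a, b) => Math.max(...a.map((ai, i) => Math.abs(ai - b[i])));
const dot = (a, b) => a.reduce((s, ai, i) => s + ai * b[i], 0);
const norm = (a) => Math.sqrt(dot(a, a));
const cosine = (a, b) => dot(a, b) / (norm(a) * norm(b));

console.log(l1(x, y));     // 4
console.log(l2(x, y));     // 3.1622776601683795 (the square root of 10)
console.log(linf(x, y));   // 3
console.log(cosine(x, y)); // 0.8944271909999159
```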
Dot product similarity One drawback of cosine similarity is that it only takes into account the angle between two vectors but not their magnitude (i.e., length), which means that if two vectors point roughly in the same direction but one is much longer than the other, both will still be considered similar. Dot product similarity, also called scalar or inner product, improves on that by taking into account both the angle and the magnitude of the vectors, which provides for a much more accurate similarity metric. Two equivalent formulas are used to compute dot product similarity. The first is the same as we’ve seen in the numerator of cosine similarity earlier: $s_{dot}(x, y) = \sum_{i=1}^{n} x_i y_i$ The second formula simply multiplies the lengths of both vectors by the cosine of the angle between them: $s_{dot}(x, y) = |x| \, |y| \cos(\theta)$ Dot product similarity is visualized in Figure 10, below: Figure 10: Visualizing the dot product similarity between two vectors One last time, we take the sample x and y vectors and compute their dot product similarity using the first formula, as we did for the cosine similarity earlier, as $(1 \cdot 4) + (2 \cdot 3) = 10$. Using the second formula, we multiply the lengths of both vectors, $(1^2 + 2^2)^{1/2} \cdot (4^2 + 3^2)^{1/2} = 11.18034$, and multiply that by the cosine of the 26° angle between both vectors, and we get $11.18034 \cdot \cos(26°) = 10$. One thing worth noting is that if all the vectors are normalized first (i.e., their length is 1), then the dot product similarity becomes exactly the same as the cosine similarity (because |x| |y| = 1), i.e., the cosine of the angle between both vectors. As we’ll see later, normalizing vectors is a good practice to adopt in order to make the magnitude of the vector irrelevant, so that the similarity simply focuses on the angle. It also speeds up the distance computation at indexing and query time, which can be a big issue when operating on billions of vectors. Quick recap Wow, we’ve been through a LOT of information so far, so let’s halt for a minute and make a quick recap of where we stand. We’ve learned that… …semantic search is based on deep learning neural network models that excel at transforming unstructured data into multi-dimensional embedding vectors. …each dimension of the model represents a feature or characteristic of the unstructured data. …an embedding vector is a sequence of similarity values (one for each dimension) that represent how similar to each dimension a given piece of unstructured data is. …the “closer” two vectors are (i.e., the nearest neighbors), the more they represent semantically similar concepts. …distance functions (L1, L2, Linf) allow us to measure how close two vectors are. …similarity functions (cosine and dot product) allow us to measure how much two vectors are heading in the same direction. Now, the last remaining piece that we need to dive into is the vector search engine itself. When a query comes in, the query is first vectorized, and then the vector search engine finds the nearest neighboring vectors to that query vector. The brute-force approach of measuring the distance or similarity between the query vector and all vectors in the database can work for small data sets but quickly falls short as the number of vectors increases. Put differently, how can we index millions, billions, or even trillions of vectors and find the nearest neighbors of the query vector in a reasonable amount of time?
That’s where we need to get smart and figure out optimal ways of indexing vectors so we can zero in on the nearest neighbors as fast as possible without degrading precision too much. Vector search algorithms and techniques Over the years, many different research teams have invested a lot of effort into developing very clever vector search algorithms. Here, we’re going to briefly introduce the main ones. Depending on the use case, some are better suited than others. Linear search We briefly touched upon linear search, or flat indexing, earlier when we mentioned the brute-force approach of comparing the query vector with all vectors present in the database. While it might work well on small datasets, performance decreases rapidly as the number of vectors and dimensions increase (O(n) complexity). Luckily, there are more efficient approaches called approximate nearest neighbor (ANN) where the distances between embedding vectors are pre-computed and similar vectors are stored and organized in a way that keeps them close together, for instance using clusters, trees, hashes, or graphs. Such approaches are called “approximate” because they usually do not guarantee 100% accuracy. The ultimate goal is to either reduce the search scope as much and as quickly as possible in order to focus only on areas that are most likely to contain similar vectors or to reduce the vectors’ dimensionality . K-Dimensional trees A K-Dimensional tree, or KD tree, is a generalization of a binary search tree that stores points in a k-dimensional space and works by continuously bisecting the search space into smaller left and right trees where the vectors are indexed. At search time, the algorithm simply has to visit a few tree branches around the query vector (the red point in Figure 11) in order to find the nearest neighbor (the green point in Figure 11). If more than k neighbors are requested, then the yellow area is extended until the algorithm finds more neighbors. Figure 11: KD tree algorithm. Source: https://salzi.blog/2014/06/28/kd-tree-and-nearest-neighbor-nn-search-2d-case/ The biggest advantage of the KD tree algorithm is that it allows us to quickly focus only on some localized tree branches, thus eliminating most of the vectors from consideration. However, the efficiency of this algorithm decreases as the number of dimensions increases because many more branches need to be visited than in lower-dimensional spaces. Inverted file index The inverted file index (IVF) approach is also a space-partitioning algorithm that assigns vectors close to each other to their shared centroid. In the 2D space, this is best visualized with a Voronoi diagram as shown in Figure 12: Figure 12: Voronoi representation of an inverted file index in the 2D space. Source: https://docs.zilliz.com/docs/vector-index-basics-and-the-inverted-file-index We can see that the above 2D space is partitioned into 20 clusters, each having its centroid denoted as black dots. All embedding vectors in the space are assigned to the cluster whose centroid is closest to them. At search time, the algorithm first figures out the cluster to focus on by finding the centroid that is closest to the query vector, and then it can simply zero in on that area, and the surrounding ones as well if needed, in order to find the nearest neighbors. This algorithm suffers from the same issue as KD trees when used in high-dimensional spaces. 
This is called the curse of dimensionality, and it occurs when the volume of the space increases so much that all the data seems sparse and the amount of data that would be required to get more accurate results grows exponentially. When the data is sparse, it becomes harder for these space-partitioning algorithms to organize the data into clusters. Luckily for us, there are other algorithms and techniques that alleviate this problem, as detailed below. Quantization Quantization is a compression -based approach that allows us to reduce the total size of the database by decreasing the precision of the embedding vectors. This can be achieved using scalar quantization (SQ) by converting the floating point vector values into integer values. This not only reduces the size of the database by a factor of 8 but also decreases memory consumption and speeds up the distance computation between vectors at search time. Another technique is called product quantization (PQ), which first divides the space into lower-dimensional subspaces, and then vectors that are close together are grouped in each subspace using a clustering algorithm (similar to k-means). Note that quantization is different from dimensionality reduction , where the number of dimensions is reduced, i.e., the vectors simply become shorter. Hierarchical Navigable Small Worlds (HNSW) If it looks complex just by reading the name, don’t worry, it’s not really! In short, Hierarchical Navigable Small Worlds is a multi-layer graph-based algorithm that is very popular and efficient. It is used by many different vector databases, including Apache Lucene. A conceptual representation of HNSW can be seen in Figure 13, below. Figure 13: Hierarchical Navigable Small Worlds. Source: https://towardsdatascience.com/similarity-search-part-4-hierarchical-navigable-small-world-hnsw-2aad4fe87d37 On the top layer, we can see a graph of very few vectors that have the longest links between them, i.e., a graph of connected vectors with the least similarity. The more we dive into lower layers, the more vectors we find and the denser the graph becomes, with more and more vectors closer to one another. At the lowest layer, we can find all the vectors, with the most similar ones being located closest to one another. At search time, the algorithm starts from the top layer at an arbitrary entry point and finds the vector that is closest to the query vector (shown by the gray point). Then, it moves one layer below and repeats the same process, starting from the same vector that it left in the above layer, and so on, one layer after another, until it reaches the lowest layer and finds the nearest neighbor to the query vector. Locality-sensitive hashing (LSH) In the same vein as all the other approaches presented so far, locality-sensitive hashing seeks to drastically reduce the search space in order to increase the retrieval speed. With this technique, embedding vectors are transformed into hash values, all by preserving the similarity information, so that the search space ultimately becomes a simple hash table that can be looked up instead of a graph or tree that needs to be traversed. The main advantage of hash-based methods is that vectors containing an arbitrary (big) number of dimensions can be mapped to fixed-size hashes, which enormously speeds up retrieval time without sacrificing too much precision. There are many different ways of hashing data in general, and embedding vectors in particular, but this article will not dive into the details of each of them. 
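As a toy illustration of the scalar quantization idea described above (the mapping below to signed 8-bit buckets over a fixed [0, 1] range is one simple choice among many, not the scheme any particular engine uses), floats can be rounded into small integers and approximately recovered later:

```javascript
// Toy scalar quantization: map each float dimension to an int8 bucket between
// a known min and max, trading a little precision for a much smaller footprint.
function quantize(vector, min, max) {
  const scale = 255 / (max - min);
  return Int8Array.from(vector, (v) => Math.round((v - min) * scale) - 128);
}

function dequantize(quantized, min, max) {
  const scale = (max - min) / 255;
  return Array.from(quantized, (q) => (q + 128) * scale + min);
}

const original = [0.0121, 0.9572, 0.8735, 0.1198];
const q = quantize(original, 0, 1);
console.log(q);                   // Int8Array [ -125, 116, 95, -97 ]
console.log(dequantize(q, 0, 1)); // values close, but not identical, to the originals
```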
Conventional hashing methods usually produce very different hashes for data that seem very similar. Since embedding vectors are composed of float values, let’s take two sample float values that are considered to be very close to one another in vector arithmetic (e.g., 0.73 and 0.74) and run them through a few common hashing functions. Looking at the results below, it’s pretty obvious that common hashing functions do not retain the similarity between the inputs. Hashing function 0.73 0.74 MD5 1342129d04cd2924dd06cead4cf0a3ca 0aec1b15371bd979cfa66b0a50ebecc5 SHA1 49d2c3e0e44bff838e1db571a121be5ea874e8d9 a534e76482ade9d9fe4bff3035a7f31f2f363d77 SHA256 99d03fc3771fe6848d675339fc49eeb1cb8d99a12e6358173336b99a2ec530ea 5ecbc825ba5c16856edfdaf0abc5c6c41d0d8a9c508e34188239521dc7645663 While conventional hashing methods try to minimize hashing collisions between similar data pieces, the main objective of locality-sensitive hashing is to do exactly the opposite, i.e., to maximize hashing collisions so that similar data falls within the same bucket with a high probability. By doing so, embedding vectors that are close together in a multi-dimensional space will be hashed to a fixed-size value falling in the same bucket. Since LSH allows those hashed vectors to retain their proximity, this technique comes in very handy for data clustering and nearest neighbor searches. All the heavy lifting happens at indexing time when the hashes need to be computed, while at search time we only need to hash the query vector in order to look up the bucket that contains the closest embedding vectors. Once the candidate bucket is found, a second round usually takes place to identify the nearest neighboring vectors to the query vector. Let’s conclude In order to introduce vector search, we had to cover quite some ground in this article. After comparing the differences between lexical search and vector search, we’ve learned how deep learning neural network models manage to capture the semantics of unstructured data and transcode their meaning into high-dimensional embedding vectors, a sequence of floating point numbers representing the similarity of the data along each of the dimensions of the model. It is also worth noting that vector search and lexical search are not competing but complementary information retrieval techniques (as we’ll see in the third part of this series when we’ll dive into hybrid search). After that, we introduced a fundamental building block of vector search, namely the distance (and similarity) functions that allow us to measure the proximity of two vectors and assess the similarity of the concepts they represent. Finally, we’ve reviewed different flavors of the most popular vector search algorithms and techniques, which can be based on trees, graphs, clusters, or hashes, whose goal is to quickly narrow in on a specific area of the multi-dimensional space in order to find the nearest neighbors without having to visit the entire space like a linear brute-force search would do. If you like what you’re reading, make sure to check out the other parts of this series: Part 2: How to Set Up Vector Search in Elasticsearch Part 3: Hybrid Search Using Elasticsearch Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. 
AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Vectors are not new Vector search vs. lexical search Embedding vectors The secret sauce Distance and similarity Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "A quick introduction to vector search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/introduction-to-vector-search", + "meta_description": "Learn about vector search (aka semantic search), including the basics of vectors, how vector search works, and how it differs from lexical search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. ML Research Search Relevance TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou On October 29, 2024 Part of Series Semantic reranking & the Elastic Rerank model Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this series of blogs we'll introduce Elastic's new semantic reranker. Semantic reranking often improves relevance, particularly in a zero-shot setting. It can also be used to trade-off indexing compute cost for querying compute cost by significantly improving lexical retrieval relevance. In this first blog we set the scene with some background on semantic reranking and how it can fit into your search and RAG pipelines. 
In the second installment, we introduce you to Elastic Rerank: Elastic's new semantic re-ranker model we've trained and released in technical preview. Retrieval Typically, text search is broken down into multiple stages, which gradually filter the result set into the final list that is presented to a user (or an LLM). The first stage is called retrieval and must be able to scale to efficiently compare the query text with a very large corpus of candidate matches. This limits the set of approaches that one can consider. For many years, the only paradigm available for retrieval was lexical. Here documents and queries are treated as bags of words and a statistical model is used to deduce relevance. The most popular option in this camp is BM25. For this choice, the query can be efficiently compared with a huge document corpus using inverted indices together with clever optimisations to prune non-competitive candidates. It remains a useful option since many queries, such as keyword searches and exact phrase matching, are well aligned with this model and it is easy to efficiently apply filtering predicates at the same time. The scoring is also tailored to the corpus characteristics which makes it a strong baseline when no tuning is applied. Finally, it is particularly efficient from an indexing perspective: no model inference needs to run, updating index data structures is very efficient and a lot of state can permanently reside on disk. In recent years, semantic retrieval has seen a surge in popularity. There are multiple flavors of this approach; for example, dense passage , learned sparse and late interaction retrieval. In summary, they use a transformer model to independently create representations of the query and each document and define a distance function on these representations to capture semantic similarity. For example, the query and document might be both embedded into a high dimensional vector space where queries and their relevant documents have low angular separation. These sorts of approaches have different strengths to BM25: they can find matches that require understanding synonyms, where the context is important to determine word meanings, where there are misspellings and so on. They also allow for wholly new relevance signals, such as embedding images and text in a common vector space. Queries can be efficiently compared against very large corpuses of documents by dropping the requirement that exact nearest neighbor sets are found. Data structures like HNSW can be used to find most of the best matches in logarithmic complexity in the corpus size. Intelligent compression schemes allow significant amounts of data to reside on disk. However, it is worth noting that model inference must be run on all documents before indexing and these data structures are relatively expensive to build in comparison to inverted indices. A lot of work has been done to improve the training of general purpose semantic retrieval models and indeed the best models significantly outperform BM25 in benchmarks that try and assess zero shot retrieval quality. Semantic reranking So far we have discussed methods that independently create representations of the query and document. This choice is necessary to scale retrieval. However, given the top N results returned by first stage retrieval we don't have the same constraint. 
The work that must be done to compare the query and these top N results is naturally much smaller, so we can consider new approaches for reordering them in order to improve relevance of the final result. This task is called reranking. We define semantic reranking as using a model to assess the semantic similarity of a query and one (or more) document text(s). This is to distinguish it from other reranking methods such as learn to rank , which typically use a variety of features to model user preference. Note that what constitutes semantic similarity can vary from task to task: for example, finding similar documents requires assessing similarity of two texts, whereas answering a question requires understanding if the necessary information is contained in the document text. In principle, any semantic first stage retrieval method can be used for reranking. For example, ELSER could be used to rerank the top results of a BM25 search. Keep in mind though that there may be blindspots in BM25 retrieval, which tends to have lower recall than semantic retrieval, and no reranking method will be able to fix these. It is therefore important to evaluate setups like these on your own data. From a performance standpoint, one is trading indexing compute cost for querying compute cost and possibly latency. Compared to semantic retrieval, in addition to the cost of embedding the query, you must also embed each document you want to rerank. This can be a good cost trade-off if you have a very large corpus and/or one which is frequently updated and relatively few queries per second. Furthermore, for GPUs the extra cost is partly amortized by the fact that the document inferences can be processed as a batch, which allows for better utilization. However, there is little cost benefit compared to a model which gets to see the query and the document at the same time. This approach is called cross-encoding, as opposed to bi-encoding which is used for semantic retrieval, and can bring significant benefits. Cross-encoders For cross-encoders both the query and a document text are presented together to the model concatenated with a special separation token. The model itself returns a similarity score. Schematically, the text is modeled something like the following: In a bi-encoder the query and the document are first embedded individually and then compared using a simple similarity function. Schematically, the text is modeled something like the following: Reranking for a fixed query with a cross-encoder is framed as a regression problem. The model outputs a numerical scores for each query-document pair. Then the documents are sorted in descending score order. We will return to the process by which this model is trained in the second blog in this series. Conceptually, it is useful to realize that this allows the model to attend to different parts of the query and document text and learn rich features for assessing relevance. It has been observed that this process allows the model to learn more robust representations for generally assessing relevance. It also potentially allows the model to capture more nuanced semantics. For example, bi-encoder models struggle with things like negation and instead tend to pick up on matches for the majority concepts in the text, independent of whether the query wants to include or exclude them. Cross-encoder models have the capacity to learn how negation should affect relevance judgments. 
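As an illustration of the bi-encoder vs. cross-encoder distinction described above, here is a hedged sketch using the sentence-transformers library; the model names, example texts, and variable names are assumptions and not the models discussed in this series.

```python
from sentence_transformers import CrossEncoder, SentenceTransformer, util

query = "How do I speed up Elasticsearch queries?"
docs = [
    "Use filters and caching to reduce query latency.",
    "Elasticsearch was first released in 2010.",
]

# Bi-encoder: query and documents are embedded independently, then compared
# with a cheap similarity function (cosine here); this is what scales to retrieval.
bi_encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
query_emb = bi_encoder.encode(query, convert_to_tensor=True)
doc_embs = bi_encoder.encode(docs, convert_to_tensor=True)
print(util.cos_sim(query_emb, doc_embs))  # one similarity score per document

# Cross-encoder: each (query, document) pair is scored jointly, which is slower
# but lets the model attend across both texts, so it is applied only to the top N.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
print(reranker.predict([(query, doc) for doc in docs]))  # one relevance score per pair
```

In practice only the first-stage candidates are pushed through the cross-encoder, which is exactly the indexing-versus-query cost trade-off discussed above.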
Finally, cross-encoder scores are often better calibrated across a diverse range of query types and topics. This makes choosing a score at which to drop documents significantly more reliable. Connection with RAG Improving the content supplied to an LLM improves the quality of RAG. Indeed search quality is often the bottleneck for RAG performance. For example, if the information needed to respond correctly to a question is contained exclusively in a specific document, that document must be provided in the LLM context window. Furthermore, whilst the current generation of long context models are excellent at extracting information from long contexts, the cost to process the extra input tokens is significant and money spent on search typically yields large overall cost efficiencies . RAG use cases generally also have looser latency constraints, so some extra time spent in reranking is less of an issue. Indeed, the latency can also be offset by reducing the generation time if fewer passages need to be supplied in the prompt to achieve the same recall. This makes semantic reranking especially well suited to be applied in RAG scenarios. Wrapping Up In this post we introduced the concept of semantic reranking and discussed how model architecture can be tailored to this use case to improve relevance, particularly in a zero shot setting. We discussed the performance trade-offs associated with semantic reranking as opposed to semantic retrieval. A crucial choice when discussing performance in this context is how many documents to rerank, which critically affects the trade-off between performance and relevance of reranking methods. We will pick up this topic again when we discuss how to evaluate reranking models and survey some state of the art open and closed reranking models. In the second installment of this series, we introduce you to Elastic Rerank: Elastic's new semantic re-ranker model we've trained and released in technical preview. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Retrieval Semantic reranking Cross-encoders Connection with RAG Wrapping Up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "What is semantic reranking and how to use it? - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-1", + "meta_description": "Learn about semantic reranking and how it can fit into your search & RAG pipelines. This blog also covers cross-encoders and semantic retrieval." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Chatting with your PDFs using Playground This blog showcases a practical example of chatting with PDFs in Playground. You'll learn how to upload PDF files into Kibana and interact with them using Elastic Playground. Integrations Ingestion How To TM By: Tomás Murúa On January 8, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Elasticsearch 8.16 has a new functionality that allows you to upload PDF files directly into Kibana and analyze them using Playground. In this article, we'll see how to use this functionality by uploading a resume in PDF format and then using Playground to interact with it. Playground is a low-code platform hosted in Kibana that allows you to create a RAG application and chat with your content. You can read more about it in this article and even test it using this link . Steps: Configure the Elasticsearch Inference Service Endpoint Upload PDFs to Kibana Interact with the data in Playground Configure the Elasticsearch Inference Service Endpoint To run semantic searches, we must first configure an inference endpoint. In this example, we'll use the Elasticsearch Inference Endpoint . This endpoint offers: rerank sparse embedding text embedding For this example, let's select sparse embedding : Once configured, confirm that the model was correctly loaded into Kibana by checking Search > Relevance > Inference Endpoint in the Kibana UI. Upload PDFs to Kibana We'll upload the resume of a junior developer to learn how to use the Kibana upload files functionality. Go to the Kibana UI and follow these steps: Next, for Import Data , we have two options: Simple: This is the default option and it allows us to quickly upload our PDF into the index and automatically creates a data view with the indexed info. Advanced: This option allows us to customize mappings or add ingest pipelines. Within these settings you can: Add a semantic text type of field. Index Settings : If you want to configure things like shards or analyzers. Index Mappings : If you want to change a field type or how you define your data. Ingest Pipeline : If you want to make changes to your data before indexing it. 
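If you prefer to prepare the index through the API instead of the upload UI, the advanced options above correspond roughly to a mapping like the sketch below. The index name, client setup, and field names are assumptions (and support for copy_to into a semantic_text field depends on your Elasticsearch version); the walkthrough that follows does the equivalent in Kibana by copying the extracted PDF text into a semantic_text field backed by an ELSER inference endpoint.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust for your deployment

# The extracted PDF text is copied into a semantic_text field so that both
# full-text and semantic (ELSER) queries can run against the same document.
es.indices.create(
    index="resumes",
    mappings={
        "properties": {
            "attachment": {
                "properties": {
                    "content": {"type": "text", "copy_to": "content"},
                }
            },
            "content": {
                "type": "semantic_text",
                "inference_id": "my-elser-model",
            },
        }
    },
)
```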
Go to \"Advanced\" and select \"Add additional field\": Select the field attachment.content ; in “copy to field” type \"content\" and make sure that the inference endpoint is my-elser-model : The field Copy to is used to copy the content from attachment.content to a new semantic_text field of (content), which automatically generates vector embeddings using the underlying Inference endpoint (Elastic’s ELSER in this case). This makes both the semantic and text fields available so you can run full-text , semantic , or hybrid searches. Once everything is configured, click on \"Import\": Now that the index is created, we can explore it using Playground. Interact with the data in Playground Connect to Playground After configuring the index and uploading the resumes, we now need to connect the index to Playground. Click Connect to an LLM and select one of the options. Configure the chatbot Once Playground has been configured and we have indexed Alex Johnson's resume, we can interact with the data. Using semantic search and LLMs we can ask questions using natural language and get answers even if the documents don't have the keywords we used in the query, like in the example below: Using the instructions menu, we can control the chatbot behavior and define features like the response format. It can also include citations, to make sure the answer is properly grounded. If we go to the \"Query\" tab, we can see the query generated by Playground and we add both a text and a semantic_text fields, Playground will automatically generate a hybrid query to normalize the score between different types of different types of queries. Playground not only answers questions but also helps us understand the internal components of a RAG system, like querying, retrieval phase, context and prompt instructions. Give it a try and chat with your PDFs! With the Elasticsearch 8.16 update, we can easily upload PDF/Word/Powerpoint files using the Kibana UI. It can automatically create an index in the simple mode, and you can use the advanced mode to customize your index and tailor it to your needs. Once your files are uploaded, you can access Playground and quickly and easily chat with them since Playground will handle the LLM interactions and provide the best query based on the type of fields you want to search. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Configure the Elasticsearch Inference Service Endpoint Upload PDFs to Kibana Interact with the data in Playground Connect to Playground Configure the chatbot Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Chatting with your PDFs using Playground - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/chat-with-pdf-elastic-playground", + "meta_description": "Learn how to chat with your PDFs using Elastic Playground. We'll upload PDF files into Kibana and then use Playground to chat with them." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Advanced RAG techniques part 1: Data processing Discussing and implementing techniques which may increase RAG performance. Part 1 of 2, focusing on the data processing and ingestion component of an advanced RAG pipeline. Vector Database Generative AI HC By: Han Xiang Choong On August 14, 2024 Part of Series Advanced RAG techniques Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This is Part 1 of our exploration into Advanced RAG Techniques. Click here for Part 2! The recent paper Searching for Best Practices in Retrieval-Augmented Generation empirically assesses the efficacy of various RAG enhancing techniques, with the goal of converging on a set of best-practices for RAG. The RAG pipeline recommended by Wang and colleagues. We'll implement a few of these proposed best-practices, namely the ones which aim to improve the quality of search (Sentence Chunking, HyDE, Reverse Packing) . For brevity, we will omit those techniques focused on improving efficiency (Query Classification and Summarization) . We will also implement a few techniques that were not covered, but which I personally find useful and interesting (Metadata Inclusion, Composite Multi-Field Embeddings, Query Enrichment) . Finally, we'll run a short test to see if the quality of our search results and generated answers has improved versus the baseline. Let's get to it! RAG overview RAG aims to enhance LLMs by retrieving information from external knowledge bases to enrich generated answers. 
By providing domain-specific information, LLMs can be quickly adapted for use cases outside the scope of their training data; significantly cheaper than fine-tuning, and easier to keep up-to-date. Measures to improve the quality of RAG typically focus on two tracks: Enhancing the quality and clarity of the knowledge base. Improving the coverage and specificity of search queries. These two measures will achieve the goal of improving the odds that the LLM has access to relevant facts and information, and is thus less likely to hallucinate or draw upon its own knowledge - which may be outdated or irrelevant. The diversity of methods is difficult to clarify in just a few sentences. Let's go straight to implementation to make things clearer. Figure 1: The RAG pipeline used by the author. Table of contents Overview Table of contents Set-up Ingesting, processing, and embedding documents Data ingestion Sentence-level, token-wise chunking Metadata inclusion and generation Keyphrases extracted by TextRank Potential questions generated by GPT-4o Entities extracted by Spacy Composite multi-field embeddings Indexing to Elastic Cat break Appendix Definitions Set-up All code may be found in the Searchlabs repo . First things first. You will need the following: An Elastic Cloud Deployment An LLM API - We are using a GPT-4o deployment on Azure OpenAI in this notebook Python Version 3.12.4 or later We will be running all the code from the main.ipynb notebook. Go ahead and git clone the repo, navigate to supporting-blog-content/advanced-rag-techniques, then run the following commands: Once that's done, create a .env file and fill out the following fields (Referenced in .env.example ). Credits to my co-author, Claude-3.5, for the helpful comments. Next, we'll choose the document to ingest, and place it in the documents folder. For this article, we'll be using the Elastic N.V. Annual Report 2023 . It's a pretty challenging and dense document, perfect for stress testing our RAG techniques. Elastic Annual Report 2023 Now we're all set, let's go to ingestion. Open main.ipynb and execute the first two cells to import all packages and intialize all services. Back to top Ingesting, processing, and embedding documents Data ingestion Personal note: I am stunned by LlamaIndex's convenience. In the olden days before LLMs and LlamaIndex, ingesting documents of various formats was a painful process of collecting esoteric packages from all over. Now it's reduced to a single function call. Wild. The SimpleDirectoryReader will load every document in the directory_path. For .pdf files, it returns a list of document objects, which I convert to Python dictionaries because I find them easier to work with. Each dictionary contains the key content in the text field. It also contains useful metadata such as page number, filename, file size, and type. Back to top Sentence-level, token-wise chunking The first thing to do is reduce our documents to chunks of a standard length (to ensure consistency and manageability). Embedding models have unique token limits (maximum input size they can process). Tokens are the basic units of text that models process. To prevent information loss (truncation or omission of content), we should provide text that does not exceed those limits (by splitting longer texts into smaller segments). Chunking has a significant impact on performance. Ideally, each chunk would represent a self-contained piece of information, capturing contextual information about a single topic. 
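Before comparing chunking strategies, here is a rough sketch of what chunking to a token budget at sentence granularity can look like; the parameters anticipate the 512-token chunks with a 20-token overlap chosen below, and the helper assumes a Hugging Face-style tokenizer rather than the author's actual Chunker class.

```python
import re

def sentence_chunks(text, tokenizer, max_tokens=512, overlap=20, min_tokens=50):
    """Greedy sentence-level chunking with a small token overlap between chunks."""
    sentences = re.split(r"(?<=[.!?])\s+", text)  # naive sentence splitter
    chunks, current = [], []
    for sentence in sentences:
        tokens = tokenizer.encode(sentence, add_special_tokens=False)
        if current and len(current) + len(tokens) > max_tokens:
            chunks.append(current)
            current = current[-overlap:]  # sliding window: carry a little context over
        current.extend(tokens)
    if current:
        chunks.append(current)
    # Drop tiny chunks (noise) and decode the rest back to text for embedding.
    return [tokenizer.decode(chunk) for chunk in chunks if len(chunk) >= min_tokens]
```

A single sentence longer than the budget can still overflow a chunk here; the sketch is only meant to show the sentence-plus-overlap idea.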
Chunking methods include word-level chunking, where documents are split by word count, and semantic chunking which uses an LLM to identify logical breakpoints. Word-level chunking is cheap, fast, and easy, but runs a risk of splitting sentences and thus breaking context. Semantic chunking gets slow and expensive, especially if you're dealing with documents like the 116-page Elastic Annual Report. Let's choose a middleground approach. Sentence level chunking is still simple, but can preserve context more effectively than word-level chunking while being significantly cheaper and faster. Additionally, we'll implement a sliding window to capture some of the surrounding context, and alleviate the impact of splitting paragraphs. The Chunker class takes in the embedding model's tokenizer to encode and decode text. We'll now build chunks of 512 tokens each, with an overlap of 20 tokens. To do this, we'll split the text into sentences, tokenize those sentences, and then add the tokenized sentences to our current chunk until we cannot add more without breaching our token limit. Finally, decode the sentences back to the original text for embedding, storing it in a field called original_text . Chunks are stored in a field called chunk . To reduce noise (aka useless documents), we will discard any documents smaller than 50 tokens in length. Let's run it over our documents: And get back chunks of text that look like this: Back to top Metadata inclusion and generation We've chunked our documents. Now it's time to enrich the data. I want to generate or extract additional metadata. This additional metadata can be used to influence and enhance search performance. We'll define a DocumentEnricher class, whose role is to take in a list of documents (Python dictionaries), and a list of processor functions. These functions will run over the documents' original_text column, and store their outputs in new fields. First, we extract keyphrases using TextRank . TextRank is a graph-based algorithm that extracts key phrases and sentences from text by ranking their importance based on the relationships between words. Next, we'll generate potential_questions using GPT-4o . Finally, we'll extract entities using Spacy . Since the code for each of these is quite lengthy and involved, I will refrain from reproducing it here. If you are interested, the files are marked in the code samples below. Let's run the data enrichment: And take a look at the results: Keyphrases extracted by TextRank These keyphrases are a stand-in for the chunk's core topics. If a query has to do with cybersecurity, this chunk's score will be boosted. Potential questions generated by GPT-4o These potential questions may directly match with user queries, offering a boost in score. We prompt GPT-4o to generate questions which can be answered using the information found in the current chunk. Entities extracted by Spacy These entities serve a similar purpose to the keyphrases, but capture organizations' and individuals' names, which keyphrase extraction may miss. Back to top Composite multi-field embeddings Now that we have enriched our documents with additional metadata, we can leverage this information to create more robust and context-aware embeddings. Let's review our current point in the process. We've got four fields of interest in each document. Each field represents a different perspective on the document's context, potentially highlighting a key area for the LLM to focus on. 
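The weighted combination described next is easy to express in code; here is a minimal sketch in which the field names mirror the enrichment fields above with an _embedding suffix, and the weights are placeholders (the article discusses its own weighting choices below).

```python
import numpy as np

# Placeholder weights: the chunk text usually carries the most semantic signal.
WEIGHTS = {
    "original_text_embedding": 0.70,
    "keyphrases_embedding": 0.15,
    "potential_questions_embedding": 0.10,
    "entities_embedding": 0.05,
}

def composite_embedding(doc):
    """Weighted sum of the per-field embeddings, normalized back to unit length."""
    combined = sum(weight * np.asarray(doc[field]) for field, weight in WEIGHTS.items())
    return (combined / np.linalg.norm(combined)).tolist()
```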
Metadata Enrichment Pipeline The plan is to embed each of these fields, and then create a weighted sum of the embeddings, known as a Composite Embedding. With luck, this Composite Embedding will allow the system to become more context aware, in addition to introducing another tunable hyperparameter for controlling the search behavior. First, let's embed each field and update each document in place, using our locally defined embedding model imported at the beginning of the main.ipynb notebook. Each embedding function returns the embedding field, which is just the original input field with an _embedding postfix. Let's now define the weightings of our composite embedding: The weightings allow you to assign priorities to each component, based on your use case and the quality of your data. Intuitively, the size of these weightings is dependent on the semantic value of each component. Since the chunk text itself is by far the richest, I assign it a weighting of 70%. Since the entities are the smallest, being just a list of org or person names, I assign them a weighting of 5%. The precise setting for these values has to be determined empirically, on a use-case by use-case basis. Finally, let's write a function to apply the weightings, and create our composite embedding. We'll delete all the component embeddings as well to save space. With this, we've completed our document processing. We now have a list of document objects which look like this: Indexing to Elastic Let's bulk upload our documents to Elasticsearch. For this purpose, I long ago defined a set of Elastic helper functions in elastic_helpers.py . It is a very lengthy piece of code, so let's stick to looking at the function calls. es_bulk_indexer.bulk_upload_documents works with any list of dictionary objects, taking advantage of Elasticsearch's convenient dynamic mappings. Head on over to Kibana and verify that all documents have been indexed. There should be 224 of them. Not bad for such a large document! Indexed Annual Report Documents in Kibana Back to top Cat break Let's take a break, article's a little heavy, I know. Check out my cat: look at how furious she is Adorable. The hat went missing and I half suspect she stole and hid it somewhere :( Congrats on making it this far :) Join me in Part 2 for testing and evaluation of our RAG pipeline! Appendix Definitions 1. Sentence Chunking A preprocessing technique used in RAG systems to divide text into smaller, meaningful units. Process: Input: Large block of text (e.g., document, paragraph) Output: Smaller text segments (typically sentences or small groups of sentences) Purpose: Creates granular, context-specific text segments Allows for more precise indexing and retrieval Improves the relevance of retrieved information in RAG systems Characteristics: Segments are semantically meaningful Can be independently indexed and retrieved Often preserves some context to ensure standalone comprehensibility Benefits: Enhances retrieval precision Enables more focused augmentation in RAG pipelines 2. HyDE (Hypothetical Document Embedding) A technique that uses an LLM to generate a hypothetical document for query expansion in RAG systems. 
Process: Input query to an LLM LLM generates a hypothetical document answering the query Embed the generated document Use the embedding for vector search Key difference: Traditional RAG: Matches query to documents HyDE: Matches documents to documents Purpose: Improve retrieval performance, especially for complex or ambiguous queries Capture richer semantic context than a short query Benefits: Leverages LLM's knowledge to expand queries Can potentially improve relevance of retrieved documents Challenges: Requires additional LLM inference, increasing latency and cost Performance depends on quality of generated hypothetical document 3. Reverse Packing A technique used in RAG systems to reorder search results before passing them to the LLM. Process: Search engine (e.g., Elasticsearch) returns documents in descending order of relevance. The order is reversed, placing the most relevant document last. Purpose: Exploits the recency bias of LLMs, which tend to focus more on the latest information in their context. Ensures the most relevant information is \"freshest\" in the LLM's context window. Example: Original order: [Most Relevant, Second Most, Third Most, ...] Reversed order: [..., Third Most, Second Most, Most Relevant] 4. Query Classification A technique to optimize RAG system efficiency by determining whether a query requires RAG or can be answered directly by the LLM. Process: Develop a custom dataset specific to the LLM in use Train a specialized classification model Use the model to categorize incoming queries Purpose: Improve system efficiency by avoiding unnecessary RAG processing Direct queries to the most appropriate response mechanism Requirements: LLM-specific dataset and model Ongoing refinement to maintain accuracy Benefits: Reduces computational overhead for simple queries Potentially improves response time for non-RAG queries 5. Summarization A technique to condense retrieved documents in RAG systems. Process: Retrieve relevant documents Generate concise summaries of each document Use summaries instead of full documents in the RAG pipeline Purpose: Improve RAG performance by focusing on essential information Reduce noise and interference from less relevant content Benefits: Potentially improves relevance of LLM responses Allows for inclusion of more documents within context limits Challenges: Risk of losing important details in summarization Additional computational overhead for summary generation 6. Metadata Inclusion A technique to enrich documents with additional contextual information. Types of metadata: Keyphrases Titles Dates Authorship details Blurbs Purpose: Increase contextual information available to the RAG system Provide LLMs with clearer understanding of document content and relevance Benefits: Potentially improves retrieval accuracy Enhances LLM's ability to assess document usefulness Implementation: Can be done during document preprocessing May require additional data extraction or generation steps 7. Composite Multi-Field Embeddings An advanced embedding technique for RAG systems that creates separate embeddings for different document components. 
Process: Identify relevant fields (e.g., title, keyphrases, blurb, main content) Generate separate embeddings for each field Combine or store these embeddings for use in retrieval Difference from standard approach: Traditional: Single embedding for entire document Composite: Multiple embeddings for different document aspects Purpose: Create more nuanced and context-aware document representations Capture information from a wider variety of sources within a document Benefits: Potentially improves performance on ambiguous or multi-faceted queries Allows for more flexible weighting of different document aspects in retrieval Challenges: Increased complexity in embedding storage and retrieval processes May require more sophisticated matching algorithms 8. Query Enrichment A technique to expand the original query with related terms to improve search coverage. Process: Analyze the original query Generate synonyms and semantically related phrases Augment the query with these additional terms Purpose: Increase the range of potential matches in the document corpus Improve retrieval performance for queries with specific or technical language Benefits: Potentially retrieves relevant documents that don't exactly match the original query terms Can help overcome vocabulary mismatch between queries and documents Challenges: Risk of query drift if not carefully implemented May increase computational overhead in the retrieval process Back to top Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to RAG overview Table of contents Set-up Ingesting, processing, and embedding documents Data ingestion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Advanced RAG techniques part 1: Data processing - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/advanced-rag-techniques-part-1", + "meta_description": "This blog explores and implements advanced RAG techniques which may increase performance, focusing on data processing & ingestion of an advanced RAG pipeline." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Advanced RAG techniques part 2: Querying and testing Discussing and implementing techniques which may increase RAG performance. Part 2 of 2, focusing on querying and testing an advanced RAG pipeline. Vector Database Generative AI HC By: Han Xiang Choong On August 15, 2024 Part of Series Advanced RAG techniques Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. All code may be found in the Searchlabs repo, in the advanced-rag-techniques branch . Welcome to Part 2 of our article on Advanced RAG Techniques! In part 1 of this series , we set-up, discussed, and implemented the data processing components of the advanced RAG pipeline: The RAG pipeline used by the author. In this part, we're going to proceed with querying and testing out our implementation. Let's get right to it! Table of contents Searching and retrieving, generating answers Enriching queries with synonyms HyDE (Hypothetical Document Embedding) Hybrid search Experiments Summary of results Test 1: Who audits Elastic? AdvancedRAG SimpleRAG Test 2: total revenue 2023 AdvancedRAG SimpleRAG Test 3: What product does growth primarily depend on? How much? AdvancedRAG SimpleRAG Test 4: Describe employee benefit plan AdvancedRAG SimpleRAG Test 5: Which companies did Elastic acquire? AdvancedRAG SimpleRAG Conclusion Appendix Prompts RAG question answering prompt Elastic query generator prompt Potential questions generator prompt HyDE generator prompt Sample hybrid search query Searching and retrieving, generating answers Let's ask our first query, ideally some piece of information found primarily in the annual report. How about: Now, let's apply a few of our techniques to enhance the query. Enriching queries with synonyms Firstly, let's enhance the diversity of the query wording, and turn it into a form that can be easily processed into an Elasticsearch query. We'll enlist the aid of GPT-4o to convert the query into a list of OR clauses. Let's write this prompt: When applied to our query, GPT-4o generates synonyms of the base query and related vocabulary. In the ESQueryMaker class, I've defined a function to split the query: Its role is to take this string of OR clauses and split them into a list of terms, allowing us do a multi-match on our key document fields: Finally ending up with this query: This covers many more bases than the original query, hopefully reducing the risk of missing a search result because we forgot a synonym. But we can do more. Back to top HyDE (Hypothetical Document Embedding) Let's enlist GPT-4o again, this time to implement HyDE . 
The basic premise of HyDE is to generate a hypothetical document - The kind of document that would likely contain the answer to the original query. The factuality or accuracy of the document is not a concern. With that in mind, let's write the following prompt: Since vector search typically operates on cosine vector similarity, the premise of HyDE is that we can achieve better results by matching documents to documents instead of queries to documents. What we care about is structure, flow, and terminology. Not so much factuality. GPT-4o outputs a HyDE document like this: It looks pretty believable, like the ideal candidate for the kinds of documents we'd like to index. We're going to embed this and use it for hybrid search. Back to top Hybrid search This is the core of our search logic. Our lexical search component will be the generated OR clause strings. Our dense vector component will be embedded HyDE Document (aka the search vector). We use KNN to efficiently identify several candidate documents closest to our search vector. We call our lexical search component Scoring with TF-IDF and BM25 by default. Finally, the lexical and dense vector scores will be combined using the 30/70 ratio recommended by Wang et al . Finally, we can piece together a RAG function. Our RAG, from query to answer, will follow this flow: Convert Query to OR Clauses. Generate HyDE document and embed it. Pass both as inputs to Hybrid Search. Retrieve top-n results, reverse them so that the most relevant score is the \"most recent\" in the LLM's contextual memory (Reverse Packing) Reverse Packing Example: Query: \"Elasticsearch query optimization techniques\" Retrieved documents (ordered by relevance): Reversed order for LLM context: By reversing the order, the most relevant information (1) appears last in the context, potentially receiving more attention from the LLM during answer generation. \"Use bool queries to combine multiple search criteria efficiently.\" \"Implement caching strategies to improve query response times.\" \"Optimize index mappings for faster search performance.\" \"Optimize index mappings for faster search performance.\" \"Implement caching strategies to improve query response times.\" \"Use bool queries to combine multiple search criteria efficiently.\" Pass the context to the LLM for generation. Let's run our query and get back our answer: Nice. That's correct. Back to top Experiments There's an important question to answer now. What did we get out of investing so much effort and additional complexity into these implementations? Let's do a little comparison. The RAG pipeline we've implemented versus baseline hybrid search, without any of the enhancements we've made. We'll run a small series of tests and see if we notice any substantial differences. We'll refer to the RAG we have just implemented as AdvancedRAG, and the basic pipeline as SimpleRAG. Simple RAG Pipeline without bells and whistles Summary of results This table summarizes the results of five tests of both RAG pipelines. I judged the relative superiority of each method based on answer detail and quality, but this is a totally subjective judgement. The actual answers are reproduced below this table for your consideration. With that said, let's take a look at how they did! SimpleRAG was unable to answer questions 1 & 5. AdvancedRAG also went into far greater detail on questions 2, 3, and 4. Based on the increased detail, I judged the quality of AdvancedRAG's answers better. 
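Returning to the hybrid search setup described earlier in this section, a hedged sketch of that kind of request with the Python client might look like the following; the index name, field names, and the embedding values are assumptions, and the 0.3/0.7 boosts mirror the 30/70 lexical/dense ratio mentioned above.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # adjust for your deployment
hyde_embedding = [0.12, -0.03, 0.88]          # stand-in for the embedded HyDE document

resp = es.search(
    index="annual-report",
    query={
        "multi_match": {
            "query": "auditor audit accounting firm",  # synonym-enriched terms
            "fields": ["original_text", "keyphrases", "potential_questions"],
            "boost": 0.3,  # lexical contribution
        }
    },
    knn={
        "field": "composite_embedding",
        "query_vector": hyde_embedding,
        "k": 10,
        "num_candidates": 100,
        "boost": 0.7,  # dense-vector contribution
    },
    size=10,
)

# Reverse packing: put the most relevant hit last so it is freshest in the LLM context.
contexts = [hit["_source"]["original_text"] for hit in reversed(resp["hits"]["hits"])]
```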
Test Question AdvancedRAG Performance SimpleRAG Performance AdvancedRAG Latency SimpleRAG Latency Winner 1 Who audits Elastic? Correctly identified PwC as the auditor. Failed to identify the auditor. 11.6s 4.4s AdvancedRAG 2 What was the total revenue in 2023? Provided the correct revenue figure. Included additional context with revenue from previous years. Provided the correct revenue figure. 13.3s 2.8s AdvancedRAG 3 What product does growth primarily depend on? How much? Correctly identified Elastic Cloud as the key driver. Included overall revenue context & greater detail. Correctly identified Elastic Cloud as the key driver. 14.1s 12.8s AdvancedRAG 4 Describe employee benefit plan Gave a comprehensive description of retirement plans, health programs, and other benefits. Included specific contribution amounts for different years. Provided a good overview of benefits, including compensation, retirement plans, work environment, and the Elastic Cares program. 26.6s 11.6s AdvancedRAG 5 Which companies did Elastic acquire? Correctly listed recent acquisitions mentioned in the report (CmdWatch, Build Security, Optimyze). Provided some acquisition dates and purchase prices. Failed to retrieve relevant information from the provided context. 11.9s 2.7s AdvancedRAG Test 1: Who audits Elastic? AdvancedRAG SimpleRAG Summary : SimpleRAG did not identify PWC as the auditor Okay that's actually quite surprising. That looks like a search failure on SimpleRAG's part. No documents related to auditing were retrieved. Let's dial down the difficulty a little with the next test. Test 2: total revenue 2023 AdvancedRAG SimpleRAG Summary : Both RAGs got the right answer: $1,068,989,000 total revenue in 2023 Both of them were right here. It does seem like AdvancedRAG may have acquired a broader range of documents? Certainly the answer is more detailed and incorporates information from previous years. That is to be expected given the enhancements we made, but it's far too early to call. Let's raise the difficulty. Test 3: What product does growth primarily depend on? How much? AdvancedRAG SimpleRAG Summary : Both RAGs correctly identified Elastic Cloud as the key growth driver. However, AdvancedRAG includes more detail, factoring in subscription revenues and customer growth, and explicitly mentions other Elastic offerings. Test 4: Describe employee benefit plan AdvancedRAG SimpleRAG Summary : AdvancedRAG goes into much greater depth and detail, mentioning the 401K plan for US-based employees, as well as defining contribution plans outside of the US. It also mentions Health and Well-Being plans but misses the Elastic Cares program, which SimpleRAG mentions. Test 5: Which companies did Elastic acquire? AdvancedRAG SimpleRAG Summary : SimpleRAG does not retrieve any relevant info about acquisitions, leading to a failed answer. AdvancedRAG correctly lists CmdWatch, Build Security, and Optimyze, which were the key acquisitions listed in the report. Back to top Conclusion Based on our tests, our advanced techniques appear to increase the range and depth of the information presented, potentially enhancing quality of RAG answers. Additionally, there may be improvements in reliability, as ambiguously worded questions such as Which companies did Elastic acquire? and Who audits Elastic were correctly answered by AdvancedRAG but not by SimpleRAG. 
However, it is worth keeping in perspective that in 3 out of 5 cases, the basic RAG pipeline, incorporating Hybrid Search but no other techniques, managed to produce answers that captured most of the key information. We should note that due to the incorporation of LLMs at the data preparation and query phases, the latency of AdvancedRAG is generally 2-5x larger than that of SimpleRAG. This is a significant cost which may make AdvancedRAG suitable only for situations where answer quality is prioritized over latency. The significant latency costs can be alleviated by using a smaller and cheaper LLM like Claude Haiku or GPT-4o-mini at the data preparation stage. Save the advanced models for answer generation. This aligns with the findings of Wang et al. As their results show, any improvements made are relatively incremental. In short, simple baseline RAG gets you most of the way to a decent end-product, while being cheaper and faster to boot. For me, it's an interesting conclusion. For use cases where speed and efficiency are key, SimpleRAG is the sensible choice. For use cases where every last drop of performance needs squeezing out, the techniques incorporated into AdvancedRAG may offer a way forward. Results of the study by Wang et al reveal that the use of advanced techniques creates consistent but incremental improvements. Back to top Appendix Prompts RAG question answering prompt Prompt for getting the LLM to generate answers based on query and context. Elastic query generator prompt Prompt for enriching queries with synonyms and converting them into the OR format. Potential questions generator prompt Prompt for generating potential questions, enriching document metadata. HyDE generator prompt Prompt for generating hypothetical documents using HyDE. Sample hybrid search query Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Table of contents Searching and retrieving, generating answers Enriching queries with synonyms HyDE (Hypothetical Document Embedding) Hybrid search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Advanced RAG techniques part 2: Querying and testing - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/advanced-rag-techniques-part-2", + "meta_description": "This blog discusses and implements RAG techniques which may increase performance, focusing on querying and testing an advanced RAG pipeline." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search Learn about the Elastic Learned Sparse Encoder (ELSER), an AI model for high relevance semantic search across domains. ML Research AP GG By: Aris Papadopoulos and Gilad Gal On June 21, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Searching for meaning, not just words With 8.8, Elastic offers semantic search out of the box. Semantic search is designed to search with the intent or meaning of the text as opposed to a lexical match or keyword query. It is a qualitative leap compared to traditional lexical term-based search, offering breakthrough relevance. It captures relationships between words on the conceptual level, understanding the context and surfacing relevant results based on meanings, instead of simply query terms. Aiming to eliminate the barrier to AI-powered search, in 8.8 we are introducing a new semantic search model in technical preview, trained and optimized by Elastic. Use it to instantly leverage superior semantic relevance with vector and hybrid search, natively in Elastic. Introducing Elastic Learned Sparse Encoder, a new text expansion model for semantic search Elastic has been investing in vector search and AI for three years and released support for approximate nearest neighbor search in 8.0 (with HNSW in Lucene). Recognizing that the landscape of tools to implement semantic search is rapidly evolving, we have offered third-party model deployment and management, both programmatically and through the UI. With the combined capabilities, you can onboard your vector models (embeddings) and perform vector search through the familiar search APIs, which we enhanced with vector capabilities. The results from using vector search have been astonishing. But to achieve them, organizations need significant expertise and effort that go well beyond typical software productization. This includes annotating a sufficient number of queries within the domain in which search will be performed (typically in the order of tens of thousands), in-domain re-training the machine learning (so called “embedding”) model to achieve domain adaptation, and maintaining the models against drift. At the same time, you may not want to rely on third-party models due to privacy, support, competitiveness, or licensing concerns. 
As a result, AI-powered search is still outside the reach of the majority of users. With that in mind, in 8.8 we are introducing Elastic Learned Sparse Encoder — in technical preview. You can start using this new retrieval model with a click of a button from within the Elastic UI for a wide array of use cases, and you need exactly zero machine learning expertise or deployment effort. Superior semantic search out of the box Elastic’s Learned Sparse Encoder uses text-expansion to breathe meaning into simple search queries and supercharge relevance. It captures the semantic relationships between words in the English language and based on them, it expands search queries to include relevant terms that are not present in the query. This is more powerful than adding synonyms with lexical scoring (BM25) because it uses this deeper language-scale knowledge to optimize for relevance. And not only that, but context is also factored in, helping to eliminate ambiguity from words that may have different interpretations in different sentences. As a result, this model helps mitigate the vocabulary mismatch problem : Even if the query terms are not present in the documents, Elastic Learned Sparse Encoder will return relevant documents if they exist. Based on our comparison, this novel retrieval model outperforms lexical search in 11 out of 12 prominent relevance benchmarks , and the combination of both using hybrid search in all 12 relevance benchmarks. If you’ve already spent the effort to fine-tune lexical search in your domain, you can get an additional boost from hybrid scoring! Why choose Elastic’s Learned Sparse Encoder? Above all, you can use this new model out of the box, without domain adaptation — we’ll explain that more below; it is a sparse-vector model that performs well out-of-domain or zero-shot. Let’s break down how these terms directly translate to value for your search application. Our model is trained and architected in such a way that you do not need to fine tune it on your data. As an out-of-domain model, it outperforms dense vector models when no domain-specific retraining is applied. In other words, just click “deploy” on the UI and start using state-of-the-art semantic search with your data. Our model outperforms SPLADE (Sparse Lexical and Expansion Model), the previous out-of-domain, sparse-vector, text-expansion champion, as measured by the same benchmarks. In addition, you don’t have to worry about licensing, support, continuity of competitiveness, and extensibility beyond your Elastic license tier. For example, SPLADE is licensed for non-commercial use only. Our model is available on our Platinum subscription tier. As sparse-vector representation, it uses the Elasticsearch, Lucene-based inverted index. This means decades of optimizations are leveraged to provide optimal performance. As a result, Elastic offers one of the most powerful and effortless hybrid search solutions in the market. For the same reason, it is both more efficient and more interpretable. Fewer dimensions are activated than in dense representations, and they often directly map to words, in contrast with the opaqueness of dense representations. In a vocabulary mismatch scenario, this will clearly show you which words non-existing in the query triggered the results. Let’s speak to performance and Elasticsearch as a vector database Keeping vectors of tens of thousands of dimensions and performing vector similarity on them may sound like a scale and latency stretch. 
However, sparse vectors compress wonderfully well, and the Elasticsearch (and Lucene) inverted index is a strong technical approach to this use case. In addition, for Elastic, vector similarity is a less computationally intensive operation, due to some clever inverted index tricks that Elasticsearch hides up its sleeve. Overall, both the query performance and index size when using our sparse retrieval model are surprisingly good and require fewer resources compared to the typical dense vector index. That said, vector search, sparse or dense, has an inherently larger memory footprint and time complexity compared to lexical search universally, regardless of the platform. Elastic, as a vector database, is optimized and provides all gains possible on all levels (data structures and algorithmic). Although learned sparse retrieval might require more resources compared to lexical search, based on your application and data, the enhanced capabilities it offers could well be worth the investment. The future: The most powerful hybrid search in the market out of the box In this first tech preview release, we are limiting the length of the input to 512 tokens, which is approximately the first 300–400 words in each field going through an inference pipeline. This is sufficient for many use cases already, and we are working on methods for handling longer documents in a future version. For a successful early evaluation, we suggest using documents where most information is stored in the first 300–400 words. As we evaluated different models for relevance, it became clear that the best results are obtained from an ensemble of different ranking methods. You can combine vector search — with or without the new retrieval model — with Elastic’s lexical search through our streamlined search APIs. Linearly combining normalized scores from each method can provide excellent results. However, we want to push boundaries and offer the most powerful hybrid search out of the box, by eliminating any search science effort toward fine tuning based on the distribution of scores, data, queries, etc. To this aim, we are releasing Reciprocal Rank Fusion (RRF) in 8.8 for use initially with third-party models in Elastic and we are working toward integrating our sparse retrieval model and lexical search through RRF in the subsequent releases. This way, you will be able to leverage Elastic's innovative hybrid search architecture, combining semantic, lexical, and multimedia, through the Elastic search APIs that you are familiar with and trust through years of maturity. Finally, in working toward a GA production-ready version, we are exploring strategies for handling long documents and overall optimizations to further boost performance. Get started with Elastic’s AI-powered search today To try Elastic Learned Sparse Encoder, head to Machine Learning at the trained models view or Enterprise Search to start using semantic search with your data, in a simple click of a button. If you don't have access to Elastic yet, you can request access to the premium trial needed here . To learn more about our investments and trajectory in the vector search and AI space, watch this ElasticON Global spotlight talk by Matt Riley, general manager of Enterprise Search. For a deeper understanding of the new model’s architecture and training, read the blog by the creator machine learning scientists. To learn how you can use the model for semantic and hybrid search, head to our API and requirements documentation. 
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Searching for meaning, not just words Introducing Elastic Learned Sparse Encoder, a new text expansion model for semantic search Superior semantic search out of the box Why choose Elastic’s Learned Sparse Encoder? Let’s speak to performance and Elasticsearch as a vector database Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/introducing-elastic-learned-sparse-encoder-elser", + "meta_description": "Learn about the Elastic Learned Sparse Encoder (ELSER), an AI model for high relevance semantic search across domains." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing Learning To Rank (LTR) in Elasticsearch Discover how Learning To Rank (LTR) can help you to improve your search ranking and how to implement it in Elasticsearch. 
Search Relevance How To AF By: Aurélien Foucret On July 15, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Starting with Elasticsearch 8.13, we provide an implementation of Learning To Rank (LTR) natively integrated into Elasticsearch. LTR uses a trained machine learning (ML) model to build a ranking function for your search engine. Typically, the model is used as a second stage re-ranker, to improve the relevance of search results returned by a simpler, first stage retrieval algorithm. This blog post will explain how this new feature can help in improving your document ranking in text search and how to implement it in Elasticsearch. Whether you are trying to optimize an eCommerce search, build the best context for a Retrieval Augmented Generation(RAG) application or craft a question answering based search on millions of academic papers, you have probably realized how challenging it can be to accurately optimize document ranking in a search engine. That's where Learning to Rank comes in. Understanding relevance features and how to build a scoring function Relevance features are the signals to determine how well a document matches a user's query or interest, all of which impact search relevance . These features can vary significantly depending on the context, but they generally fall into several categories. Let’s take a look at some common relevance features used across different domains: Text Relevance Scores (e.g., BM25 , TF-IDF): Scores derived from text matching algorithms that measure the similarity of document content to the search query. These scores can be obtained from Elasticsearch. Document Properties (e.g., price of a product, publication date): Features that can be extracted directly from the stored document. Popularity Metrics (e.g., click-through rate, views): Indicators of how popular or frequently accessed a document is. Popularity metrics can be obtained with Search analytics tools, of which Elasticsearch provides out-of-the-box. The scoring function combines these features to produce a final relevance score for each document. Documents with higher scores are ranked higher in search results. When using the Elasticsearch Query DSL, you are implicitly writing a scoring function that weights relevance features and ultimately defines your search relevance Scoring in the Elasticsearch Query DSL Consider the following example query: This query translates into the following scoring function: While this approach works well, it has a few limitations: Weights are estimated : The weights assigned to each feature are often based on heuristics or intuition. These guesses may not accurately reflect the true importance of each feature in determining relevance. Uniform Weights Across Documents : Manually assigned weights apply uniformly to all documents, ignoring potential interactions between features and how their importance might vary across different queries or document types. For instance, the relevance of recency might be more significant for news articles but less so for academic papers. As the number of features and documents increases, these limitations become more pronounced, making it increasingly challenging to determine accurate weights. Ultimately, the chosen weights become a compromise, potentially leading to suboptimal ranking in many scenarios. 
A compelling alternative is to replace the scoring function that uses manual weights by a ML-based model that computes the score using relevance features. Hello Learning To Rank (LTR)! LambdaMART is a popular and effective LTR technique that uses gradient boosting decision trees (GBDT ) to learn the optimal scoring function from a judgment list. The judgment list is a dataset that contains pairs of queries and documents, along with their corresponding relevance labels or grades. Relevance labels are typically either binary, (e.g. relevant/irrelevant) or graded (e.g between 0 for completely irrelevant and 4 for highly relevant). Judgment lists can be created manually by humans or be generated from user engagement data, such as clicks or conversions. The example below uses a graded relevance judgment. LambdaMART treats the ranking problem as a regression task using a decision tree where the inner nodes of the tree are conditions over the relevance features, and the leaves are the predicted scores. LambdaMART uses a gradient boosted tree approach, and in the training process it builds multiple decision trees where each tree corrects errors of its predecessors. This process aims to optimize a ranking metric like NDCG, based on examples from the judgment list. The final model is a weighted sum of individual trees. XGBoost is a well known library that provides an implementation of LambdaMART, making it a popular choice to implement ranking based on gradient boosting decision trees. Getting started with LTR in Elasticsearch Starting with version 8.13, Learning To Rank is integrated directly into Elasticsearch and associated tooling as a technical preview feature. Train and deploy an LTR model to Elasticsearch Eland is our Python client and toolkit for DataFrames and machine learning in Elasticsearch. Eland is compatible with most of the standard Python data science tools like Pandas, scikit-learn and XGBoost. We highly recommend using it to train and deploy your LTR XGBoost model, as it provides features to simplify this process: The first step of the training process is to define the relevant features of the LTR model. Using the Python code below, you can specify the relevant features using the Elasticsearch Query DSL. The second step of the process is to build your training dataset. At this step you will compute and add relevance features for each rows of your judgment list: To help you with this task, Eland provides the FeatureLogger class: When the training dataset is built, the model is trained very easily (as also shown in the notebook ): Deploy your model to Elasticsearch once the training process is complete: To learn more about how our tooling can help you to train and deploy the model, check out this end-to-end notebook . Use your LTR model as a rescorer in Elasticsearch Once you deploy your model in Elasticsearch, you can enhance your search results through a rescorer . The rescorer allows you to refine a first-pass ranking of search results using the more sophisticated scoring provided by your LTR model: In this example: First-pass query: The multi_match query retrieves documents that match the query the quick brown fox in the title and content fields. This query is designed to be fast and capture a large set of potentially relevant documents. Rescore phase: The learning_to_rank rescorer refines the top results from the first-pass query using the LTR model. model_id : Specifies the ID of the deployed LTR model ( ltr-model-xgboost in our example). 
params : Provides any parameters required by the LTR model to extract features relevant to the query. Here query_text allows you to specify the query issued by the user that some of our features extractors expect. window_size : Defines the number of top documents from the search results issued by the first-pass query to be rescored. In this example, the top 100 documents will be rescored. By integrating LTR as a two stage retrieval process, you can can optimize both performance and accuracy of your retrieval process by combining: Speed of Traditional Search: The first-pass query retrieves a large number of documents with a broad match very quickly, ensuring fast response times. Precision of Machine Learning Models: The LTR model is applied only to the top results, refining their ranking to ensure optimal relevance. This targeted application of the model enhances precision without compromising overall performance. Try LTR yourself!? Whether you are struggling to configure search relevance for an eCommerce platform, aiming to improve the context relevance of your RAG application, or you are simply curious about enhancing your existing search engine's performance, you should consider LTR seriously. To start your journey with implementing LTR, make sure to visit our notebook detailing how to train, deploy, and use an LTR model in Elasticsearch and to read our documentation . Let us know if you built anything based on this blog post or if you have questions on our Discuss forums and the community Slack channel . Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to Understanding relevance features and how to build a scoring function Scoring in the Elasticsearch Query DSL Hello Learning To Rank (LTR)! Getting started with LTR in Elasticsearch Train and deploy an LTR model to Elasticsearch Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing Learning To Rank (LTR) in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-learning-to-rank-introduction", + "meta_description": "Discover how Learning To Rank (LTR) can help you to improve your search ranking and how to implement it in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Personalized search with learning-to-rank (LTR) Learn how to train ranking models that improve search relevance for individual users and personalize search through learning-to-rank (LTR) in Elasticsearch. Search Relevance MJ By: Max Jakob On August 30, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Today, users have come to expect search results that are tailored to their individual interests. If all the songs we listen to are rock songs, we would expect an Aerosmith song at the top of the results when searching for Crazy , not the one by Gnarls Barkley. In this article, we take a look at ways to personalize search before diving into the specifics of how to do this with learning-to-rank (LTR), using music preferences as an example. Ranking factors First, let's recap which factors are important in search ranking in general. Given a user query, a relevance function can take into account one or multiple of the following factors: Text similarity can be measured with a variety of methods including BM25, dense vector similarity, sparse vector similarity or through cross-encoder models. We can calculate similarity scores of the query string against multiple fields in a document (title, description, tags, etc.) to determine how well the input query matches a document. Query properties can be inferred from the query itself, for example the language, named entities or the user intent. The domain will influence which of these properties can be most helpful to improve relevance.. Document properties pertain to the document itself, for example its popularity or the price of the product represented by the document. These properties often have a big impact on the relevance when applied with the right weights. User and context properties refer to data that is not associated with the query or the document but with the context of the search request, for example the location of the user, past search behavior or user preferences. These are the signals that will help us personalize our search. Personalized results When looking at the last category of factors, user and context properties, we can distinguish between three types of systems: \"General\" search does not take into account any user properties. Only query input and document properties determine the relevance of search results. Two users that enter the same query see the same results. When you start Elasticsearch you have such a system out-of-the-box. Personalized search adds user properties to the mix. 
The input query is still important but it is now supplemented by user and/or context properties. In this setting users can get different results for the same query and hopefully the results are more relevant for individuals. Recommendations goes a step further and focuses exclusively on document, user and context properties. There is no actively supplied user query to these systems. Many platforms recommend content on the home page that is tailored to the user’s account, for example based on the shopping history or previously watched movies. If we look at personalization as a spectrum, personalized search sits in the middle. Both user input and user preferences are part of the relevance equation. This also means that personalization in search should be applied carefully. If we put too much weight on past user behavior and too little on the present search intent, we risk frustrating users with their favorite documents when they were specifically searching for something else. Maybe you too had the experience of watching that one folk dance video that your friend posted and subsequently found more of these when searching for dance music. The lesson here is that it's important to ensure sufficient amounts of historic data for a user in order to confidently skew search results in a certain direction. Also keep in mind that personalization is mainly going to make a difference for ambiguous user input and exploratory queries. Unambiguous navigational queries should already be covered by your general search mechanisms. There are many methods for personalization. There are rule-based heuristics in which developers hand-craft the matching of user properties onto sets of specific documents, for example manually boosting onboarding documents for new users. There are also low tech methods of sampling results from general and personal result lists. Many of the more principled approaches use vector representations trained either on item similarity or with collaborative filtering techniques (e.g. “customers also bought”). You can find many posts around these methods online. In this post we will focus on using learning-to-rank. Personalized search with LTR Learning-to-rank (LTR) is the process of creating statistical models for relevance ranking. You can think of it as automatically tuning the weights of different relevance factors. Instead of manually coming up with a structured query and weights for all text similarity, query properties and document properties, we train a model that finds an optimal trade-off given some data. The data comes in the form of a judgment list. Here we are going to look at behavior-based personalization using LTR, meaning that we will utilize past user behavior to extract user properties that will be used in our LTR training process. It's important to note that, in order to set yourself up for success, you should be already well underway in your LTR journey before you start with personalization: You should already have LTR in place. If you want to introduce LTR into your search, it's best to start by optimizing your general (non-personalized) search first. There might be some low-hanging fruit there and this will give you the chance to build a solid technical base before adding complexity. Dealing with user-dependent data means you need more of it during training and evaluation becomes trickier. We recommend waiting with personalization until your overall LTR setup is in a solid state. You should already be collecting usage data. 
Without it you would not have enough data for sensical improvements to your relevance: the cold start problem. It's also important that you have high confidence in the correctness of your usage tracking data. Incorrectly sent tracking events and erroneous data pipelines can often go undetected because they don’t throw any errors, but the resulting data ends up misrepresenting the actual user behavior. Subsequently basing personalization projects on this data will probably not succeed. You should already be creating your judgment list from usage data. This process is also known as click modeling and it is both a science and an art. Here, instead of manually labeling relevant and irrelevant documents in search results, you use click signals (clicks on search results, add-to-cart, purchases, listening to a whole song, etc.) to estimate the relevance of a document that a user was served as part of past search results. You probably need multiple experiments to get this right. Plus, there are some biases that are being introduced here (most notably position bias ). You should feel confident that your judgment list well represents the relevance for your search. If all these things are a given, then let's go ahead and add personalization. First, we are going to dive into feature engineering. Feature engineering In feature engineering we ask ourselves which concrete user properties can be used in your specific search to make results more relevant? And how can we encode these properties as ranking features? You should be able to imagine exactly how adding, say, the location of the user could improve the result quality. For instance code search is typically a use case that is independent of the user location. Music tastes on the other hand are influenced by local trends. If we know where the searcher is and we know to which geo location we can attribute a document, this can work out. It pays to be thoughtful about which user features and which document feature might work together. If you cannot imagine how this would work in theory, it might not be worth adding a new feature to your model. At any rate, you should always test the effectiveness of new features both offline after training and later in an online A/B test. Some properties can be directly collected from the tracking data, such as the location of the user or the upload location of a document. When it comes to representing user preferences, we have to do some more calculations (as we will see below). Furthermore we have to think about how to encode our properties as features because all features must be numeric. For example, we have to decide whether to represent categorical features as labels represented by integers or as one-hot encoding of multiple binary labels. To illustrate how user features might influence relevance ranking, consider the fictive example boosting tree below that could be part of an XGBoost model for a music search engine. The training process learned the importance of the location feature \"from France\" (on the left-hand side) and weighed them against the other features such as text similarity and document features. Note that these trees are typically much deeper and there are many more of them. We chose a one-hot encoding for the location feature both on the search and on the documents. Be aware that the more features are added, the more nodes in these trees are required to make use of them. Consequently more time and resources will be needed during training in order to reach convergence. 
Start small, measure improvements and expand step-by-step. Example of personalized search with LTR: music preferences How can we implement this in Elasticsearch? Let's again assume we have a search engine for a music website where users can look for songs and listen to them. Each song is categorized into a high-level genre. An example document could look like this: Further assume that we have an established way to extract a judgment list from usage data. Here we use relevance grades from 0 to 3 as an example, which could be computed from no interaction, clicking on a result, listening to the song and giving a thumbs-up rating for the song. Doing this introduces some biases in our data, including position bias (more on this in a future post). The judgment list could look like this: We track the songs that users listen to on our site, so we can build a dataset of music genre preferences for each user. For example, we could look back some time into the past and aggregate all genres that a user has listened to. Here we could experiment with different representations of genre preferences, including latent features, but for simplicity we'll stick to relative frequencies of listens. In this example we want to personalize for individual users but note that we could also base our calculations on segments of users (and use segment IDs). When calculating this, it would be wise to take the amount of activity of users into account. This goes back to the folk dance example above. If a user only interacted with one song, the genre preference would be completely skewed to its genre. To prevent the subsequent personalization putting too much weight on this, we could add the number of interactions as a feature so the model can learn when to put weight on the genre plays. We could also smooth the interactions and add a constant to all frequencies before normalizing so they don’t deviate from a uniform distribution for low counts. Here we assume the latter. The above data needs to be stored in a feature store so that we can look up the user preference values by user ID both during training and at search time. You can use a dedicated Elasticsearch index here, for example: With the user ID as the Elasticsearch document ID we can use the Get API (see below) to retrieve the preference values. This will have to be done in your application code as of Elasticsearch version 8.15. Also note that these separately stored feature values will need to be refreshed by a regularly running job in order to keep the values up-to-date as preferences change over time. Now we are ready to define our feature extraction. Here we one-hot-encode the genres. We plan to also enable representing categories as integers in future releases. Now when applying the feature extraction, we have to first look up the genre preference values and forward them to the feature logger. Depending on performance, it might be good to batch lookup these values. After feature extraction, we have our data ready for training. Please refer to the previous LTR post and the accompanying notebook for how to train and deploy the model (and make sure to not send the IDs as features). Once the model is trained and deployed, you can use it in a rescorer like this. Note that at search time you also need to look up the user preference values beforehand and add the values to the query. Now the users of our music website with different genre preferences can benefit from your personalized search. 
Both rock and pop lovers will find their favorite version of the song called Crazy at the top of the search results. Conclusion Adding personalization has the potential to improve relevance. One way to personalize search is through LTR in Elasticsearch. We have looked at some prerequisites that should be given and went through a hands-on example. However, in the name of a focused post, we left out several important details. How would we evaluate the model? There are offline metrics that can be applied during model development, but ultimately an online A/B test with real users will have to decide if the model improves relevance. How do we know if we are using enough data? Spending more resources at this stage can improve quality but we need to know under which conditions this is worth it. How would we build a good judgment list and deal with the different biases introduced by using behavioral tracking data? And can we forget about our personalized model after deployment or do we require repeated maintenance to address drift? Some of these questions will be answered in future posts on LTR, so stay tuned. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Ranking factors Personalized results Personalized search with LTR Feature engineering Example of personalized search with LTR: music preferences Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Personalized search with learning-to-rank (LTR) - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/personalized-search-elasticsearch-ltr", + "meta_description": "Learn how to implement personalized search through Learning to Rank (LTR) in Elasticsearch with a practical example." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Building RAG with Llama 3 open-source and Elastic Learn how to build a RAG system with Llama3 open source and Elastic. This blog provides practical examples of RAG using Llama3 as an LLM. Integrations Generative AI How To RR By: Rishikesh Radhakrishnan On June 20, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This blog will walk through implementing RAG using two approaches. Elastic, Llamaindex, Llama 3 (8B) version running locally using Ollama. Elastic, Langchain, ELSER v2, Llama 3 (8B) version running locally using Ollama. The notebooks are available at this GitHub location. Before we get started, let's take a quick dive into Llama 3. Llama 3 overview Llama 3 is an open source large language model recently launched by Meta. This is a successor to Llama 2 and based on published metrics, is a significant improvement. It has good evaluation metrics, when compared to some of the recently published models such as Gemma 7B Instruct, Mistral 7B Instruct, etc. The model has two variants, which are the 8 billion and 70 billion parameter. An interesting thing to note is that at the time of writing this blog, Meta was still in the process of training 400B+ variant of Llama 3. Meta Llama 3 Instruct Model Performance. (from https://ai.meta.com/blog/meta-llama-3/ ) The above figure shows data on Llama3 performance across different datasets as compared to other models. In order to be optimized for performance for real world scenarios, Llama3 was also evaluated on a high quality human evaluation set. Aggregated results of Human Evaluations across multiple categories and prompts (from https://ai.meta.com/blog/meta-llama-3/ ) How to build RAG with Llama 3 open-source and Elastic Dataset For the dataset, we will use a fictional organization policy document in json format, available at this location . Configure Ollama and Llama3 As we are using the Llama 3 8B parameter size model, we will be running that using Ollama. Follow the steps below to install Ollama. Browse to the URL https://ollama.com/download to download the Ollama installer based on your platform. Note: The Windows version is in preview at the moment. Follow the instructions to install and run Ollama for your OS. Once installed, follow the commands below to download the Llama3 model. This should take some time depending upon your network bandwidth. Once the run completes, you should end with the interface below. To test Llama3, run the following command from a new terminal or enter the text at the prompt itself. At the prompt, the output looks like below. We now have Llama3 running locally using Ollama. Elasticsearch setup We will use Elastic cloud setup for this. Please follow the instructions here . Once successfully deployed, note the API Key and the Cloud ID, we will require them as part of our setup. 
Application setup There are two notebooks, one for RAG implemented using Llamaindex and Llama3, the other one with Langchain, ELSER v2 and Llama3. In the first notebook, we use Llama3 as a local LLM as well as provide embeddings. For the second notebook, we use ELSER v2 for the embeddings and Llama3 as the local LLM. Method 1: Elastic, Llamaindex, Llama 3 (8B) version running locally using Ollama. Step 1 : Install required dependencies The above section installs the required llamaindex packages. Step 2: Import required dependencies We start with importing the required packages and classes for the app. We start with providing a prompt to the user to capture the Cloud ID and API Key values. If you are not familiar with obtaining the Cloud ID and API Key, please follow the links in the code snippet above to guide you with the process. Step 3: document processing We start with downloading the json document and building out Document objects with the payload. We now define the Elasticsearch vector store ( ElasticsearchStore ), the embedding created using Llama3 and a pipeline to help process the payload constructed above and ingest into Elasticsearch. The ingestion pipeline allows us to compose pipelines using different components, one of which allows us to generate embeddings using Llama3. ElasticsearchStore is defined with the name of the index to be created, the vector field and the content field. And this index is created when we run the pipeline. The index mapping created is as below: The pipeline is executed using the step below. Once this pipeline run completes, the index workplace_index is now available for querying. Do note that the vector field content_vector is created as a dense vector with dimension 4096 . The dimension size comes from the size of the embeddings generated from Llama3. Step 4: LLM configuration We now setup Llamaindex to use the Llama3 as the LLM. This as we covered before is done with the help of Ollama. Step 5: Semantic search We now configure Elasticsearch as the vector store for the Llamaindex query engine. The query engine is then used to answer your questions with contextually relevant data from Elasticsearch. The response I received with Llama3 as the LLM and Elasticsearch as the Vector database is below. This concludes the RAG setup based on using Llama3 as a local LLM and to generate embeddings. Let's now move to the second method, which uses Llama3 as a local LLM, but we use Elastic’s ELSER v2 to generate embeddings and for semantic search. Method 2: Elastic, Langchain, ELSER v2, Llama 3 (8B) version running locally using Ollama. Step 1: Install required dependencies The above section installs the required langchain packages. Step 2: Import required dependencies We start with importing the required packages and classes for the app. This step is similar to Step 2 in Method 1 above. Next, provide a prompt to the user to capture the Cloud ID and API Key values. Step 3: Document processing Next, we move to downloading the json document and building the payload. This step differs from the Method 1 approach, from how we use the LlamaIndex provided pipeline to process the document. Here we use the RecursiveCharacterTextSplitter to generate the chunks. We now define the Elasticsearch vector store ElasticsearchStore . The vector store is defined with the index to be created and the model to be used for embedding and retrieval. You can retrieve the model_id by navigating to Trained Models under Machine Learning. 
This also results in the creation of an ingest pipeline in Elastic, which generates and stores the embeddings as the documents are ingested into Elastic. We now add the documents processed above. Step 4: LLM configuration We set up the LLM to be used with the following. This is again different from method 1, where we used Llama3 for embeddings too. Step 5: Semantic search The necessary building blocks are all in place now. We tie them up together to perform semantic search using ELSER v2 and Llama3 as the LLM. Essentially, Elasticsearch ELSER v2 provides the contextually relevant response to the users question using its semantic search capabilities. The user's question is then enriched with the response from ELSER and structured using a template. This is then processed with Llama3 to generate relevant responses. The response with Llama3 as the LLM and ELSER v2 for semantic search is as below: This concludes the RAG setup based on using Llama3 as a local LLM and ELSER v2 for semantic search. Conclusion In this blog we looked at two approaches to RAG with Llama3 and Elastic. We explored Llama3 as an LLM and to generate embeddings. Next we used Llama3 as the local LLM and ELSER for embeddings and semantic search. We utilized two different frameworks, LlamaIndex and Langchain. You could implement the two methods using either of these frameworks. The notebooks were tested with the Llama3 8B parameter version. Both the notebooks are available at this GitHub location. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Llama 3 overview How to build RAG with Llama 3 open-source and Elastic Dataset Configure Ollama and Llama3 Elasticsearch setup Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Building RAG with Llama 3 open-source and Elastic - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-rag-with-llama3-opensource-and-elastic", + "meta_description": "Learn how to build a RAG system with Llama3 open source and Elastic. This blog provides practical examples of RAG using Llama3 as an LLM." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Retrieval Augmented Generation (RAG) Learn about Retrieval Augmented Generation (RAG) and how it can help improve the quality of an LLM's generated responses by providing relevant source knowledge as context. Generative AI JM By: Joe McElroy On November 13, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Generative AI has recently created enormous successes and excitement, with models that can generate fluent text, realistic images, and even videos. In the case of language, large language models, trained on vast amounts of data, are capable on understanding context and generating relevant responses to questions. This blog post explores the challenges associated with Generative AI, how Retrieval Augmented Generation (RAG) can help overcome those challenges, how RAG works, as well as the advantages and challenges of using RAG. Challenges with Generative AI However, its important to understand that these models are not perfect. The knowledge that these models possess is parametric knowledge that they learned during training and is a condensed representation of the entire training dataset. Lack of domain knowledge These models should be able to generate good responses to questions about general knowledge seen in their training data. But they cannot reliably answer questions about facts which are not in their training dataset. If the model is well aligned it will refuse to answer such out-of-domain questions. However, it is possible it will simply make up answers (also known as hallucinating). For example, a general purpose model will typically understand in general terms that each company will have a leave policy, but it will not have any knowledge of my particular company's leave policy. Frozen parametric knowledge An LLM's knowledge is frozen, which means it doesn't know anything about events that happen post-training. This means it will not be able to reliably answer questions about current events. Models are typically trained to qualify the answers they give for such questions. Hallucinations It has been suggested that LLMs capture in their parameters something like a knowledge graph representation of general ontology: representing facts about and relationships between entities. Common facts that appear frequently in the training data are well represented in the knowledge graph. However, niche knowledge which is unlikely to have many examples in the training data is only approximately represented. As such LLMs have noisy understanding of such facts. The alignment process, where models are calibrated about what they know, is essential. 
Mistakes often occur in the gray area between known and unknown information, highlighting the challenge of distinguishing relevant details. In the example above, the question about Fields Medal winners in the same year as Borcherds, is a prime example of this sort of niche knowledge. In this case we seeded the conversation with information about other mathematicians and ChatGPT appeared to get confused about what information to attend to. For example, it missed Tim Gowers and added Vladimir Voevodsky (who won in 2002). Expensive to train While LLMs are capable of generating relevant responses to questions when trained on data within a specific domain, they are expensive to train and require vast amounts of data and compute to develop. Similarly, fine-tuning models requires expertise and time and there is the risk that in the process they \"forget\" other important capabilities. How does RAG help solve this problem? Retrieval Augmented Generation (RAG) helps solve this problem by grounding the parametric knowledge of a generative model with an external source knowledge, from a information retrieval system like a database. This source knowledge is passed as additional context to the model and helps the model generate more relevant responses to questions. How does RAG work? A RAG pipeline typically has three main components: Data : A collection of data (e.g documents, webpages) that contain relevant information to answer questions. Retrieval : A retrieval strategy that can retrieve relevant source knowledge from the data. Generation : With the relevant source knowledge, generate a response with the help of an LLM. RAG pipeline flow When directly interacting with a model, the LLM is given a question and generates a response based on its parametric knowledge. RAG adds an extra step to the pipeline, using retrieval to find relevant data that builds additional context for the LLM. In the example below, we use a dense vector retrieval strategy to retrieve relevant source knowledge from the data. This source knowledge is then passed to the LLM as context to generate a response. RAG doesn't have to use dense vector retrieval, it can use any retrieval strategy that can retrieve relevant source knowledge from the data. It could be a simple keyword search or even a Google web search. We will cover other retrieval strategies in a future article. Retrieval of source knowledge Retrieval of relevant source knowledge is key to answering the question effectively. The most common approach for retrieval with Generative AI is using semantic search with dense vectors. Semantic search is a technique that requires an embedding model to transform natural language input into dense vectors which represent that source knowledge. We rely on these dense vectors to represent the source knowledge because they are able to capture the semantic meaning of the text. This is important because it allows us to compare the semantic meaning of the source knowledge with the question to determine if the source knowledge is relevant to the question. Given a question and its embedding, we can find the most relevant source knowledge. Semantic search with dense vectors isn't your only retrieval option, but it's one of the most popular approaches today. We will cover other approaches in a future article. Advantages of RAG After training, LLMs are frozen. The parametric knowledge of the model is fixed and cannot be updated. 
However, when we add data and retrieval to the RAG pipeline, we can update the source knowledge as the underlying data source changes, without having to retrain the model. Grounded in source knowledge The model's response can also be constrained to only use the source knowledge provided in-context, which helps limit hallucinations. This approach also opens up the option of using smaller, task-specific LLMs instead of large, general purpose models. This enables prioritizing the use of source knowledge to answer questions, rather than general knowledge acquired during training. Citing sources in responses In addition, RAG can provide clear traceability of the source knowledge used to answer a question. This is important for compliance and regulatory reasons and also helps spot LLM hallucinations. This is known as source tracking. RAG in action Once we have retrieved the relevant source knowledge, we can use it to generate a response to the question. To do this, we need to: Build a context A collection of source knowledge (e.g documents, webpages) that contain relevant information to answer questions. This provides the context for the model to generate a response. Prompt template A template written in natural language for a specific task (answer questions, summarize text). Used as the input to the LLM. Question A question that is relevant to the task. Once we have these three components, we can use the LLM to generate a response to the question. In the example below, we combine the prompt template with the user's question and the relevant passages retrieved. The prompt template builds the relevant source knowledge passages into a context. This example also includes source tracing where the source knowledge passages are cited in the response. Challenges with RAG Effective retrieval is the key to answering questions effectively. Good retrieval provides a diverse set of relevant source knowledge to the context. However, this is more of an art than a science, requires a lot of experimentation to get right, and is highly dependent on the use case. Precise dense vectors Large documents are difficult to represent as a single dense vector because they contain multiple semantic meanings. For effective retrieval, we need to break down the document into smaller chunks of text that can be accurately represented as a single dense vector. A common approach for generic text is to chunk by paragraphs and represent each paragraph as a dense vector. Depending on your use case, you may want to break the document down using titles, headings, or even sentences, as chunks. Large context When using LLMs, we need to be mindful of the size of the context we pass to the model. LLMs have a limit on the amount of tokens they can process at once. For example, GPT-3.5-turbo has a limit of 4096 tokens. Secondly, responses generated may degrade in quality as the context increases, increasing the risk of hallucinations. Larger contexts also require more time to process and, crucially, they increase LLM costs. This comes back to the art of retrieval. We need to find the right balance between chunking size and accuracy with embeddings. Conclusion Retrieval Augmented Generation is a powerful technique that can help improve the quality of an LLM's generated responses, by providing relevant source knowledge as context. But RAG isn't a silver bullet. It requires a lot of experimentation and tuning to get right and it's also highly dependent on your use case. 
In the next article, we will cover how to build a RAG pipeline using LangChain, a popular framework for working with LLMs. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Challenges with Generative AI Lack of domain knowledge Frozen parametric knowledge Hallucinations Expensive to train Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Retrieval Augmented Generation (RAG) - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/retrieval-augmented-generation-rag", + "meta_description": "Learn about Retrieval Augmented Generation (RAG) and how it can help improve the quality of an LLM's generated responses by providing relevant source knowledge as context." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Evaluating scalar quantization in Elasticsearch Learn how scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch through an experiment. ML Research TP TV By: Thanos Papaoikonomou and Thomas Veasey On May 3, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Understanding scalar quantization in Elasticsearch In 8.13 we introduced scalar quantization to Elasticsearch . 
By using this feature an end-user can provide float vectors that are internally indexed as byte vectors while retaining the float vectors in the index for optional re-scoring. This means they can reduce their index memory requirement, which is its dominant cost, by a factor of four. At the moment this is an opt-in feature feature, but we believe it constitutes a better trade off than indexing vectors as floats. In 8.14 we will switch to make this our default. However, before doing this we wanted a systematic evaluation of the quality impact. Experimentation: Evaluating scalar quantization The multilingual E5-small is a small high quality multilingual passage embedding model that we offer out-of-the-box in Elasticsearch. It has two versions: one cross-platform version which runs on any hardware and one version which is optimized for CPU inference in the Elastic Stack (see here ). E5 represents a challenging case for automatic quantization because the vectors it produces have low angular variation and are relatively low dimension compared to state-of-the-art models. If we can achieve little to no damage enabling int8 quantization for this model we can be confident that it will work reliably. The purpose of this experimentation is to estimate the effects of scalar-quantized kNN search as described here across a broad range of retrieval tasks using this model. More specifically, our aim is to assess the performance degradation (if any) by switching from a full-precision to a quantized index. Overview of methodology For the evaluation we relied upon BEIR and for each dataset that we considered we built a full precision and an int8-quantized index using the default hyperparameters ( m: 16 , ef_construction: 100 ). First, we experimented with the quantized (weights only) version of the multilingual E5-small model provided by Elastic here with Table 1 presenting a summary of the nDCG@10 scores ( k:10 , num_candidates:100 ): Dataset Full precision Int8 quantization Absolute difference Relative difference Arguana 0.37 0.362 -0.008 -2.16% FiQA-2018 0.309 0.304 -0.005 -1.62% NFCorpus 0.302 0.297 -0.005 -1.66% Quora 0.876 0.875 -0.001 -0.11% SCIDOCS 0.135 0.132 -0.003 -2.22% Scifact 0.649 0.644 -0.005 -0.77% TREC-COVID 0.683 0.672 -0.011 -1.61% Average -0.005 -1.05% Table 1 : nDCG@10 scores for the full precision and int8 quantization indices across a selection of BEIR datasets Overall, it seems that there is a slight relative decrease of 1.05% on average. Next, we considered repeating the same evaluation process using the unquantized version of multilingual E5-small (see model card here ) and Table 2 shows the respective results. Dataset Full precision Int8 quantization Absolute difference Relative difference Arguana 0.384 0.379 -0.005 -1.3% Climate-FEVER 0.214 0.222 +0.008 +3.74% FEVER 0.718 0.715 -0.003 -0.42% FiQA-2018 0.328 0.324 -0.004 -1.22% NFCorpus 0.31 0.306 -0.004 -1.29% NQ 0.548 0.537 -0.011 -2.01% Quora 0.882 0.881 -0.001 -0.11% Robust04 0.418 0.415 -0.003 -0.72% SCIDOCS 0.134 0.132 -0.003 -1.49% Scifact 0.67 0.666 -0.004 -0.6% TREC-COVID 0.709 0.693 -0.016 -2.26% Average -0.004 -0.83% Table 2 : nDCG@10 scores of multilingual-E5-small on a selection of BEIR datasets Again, we observe a slight relative decrease in performance equal to 0.83%. 
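For reference, a minimal sketch (via the Python client) of how an int8-quantized index like the ones used above could be created; the index and field names are placeholders, 384 dimensions matches E5-small, and int8_hnsw is the scalar-quantized HNSW option in recent Elasticsearch versions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Sketch of an int8-quantized HNSW index for 384-dimensional E5-small vectors,
# using the default hyperparameters mentioned above (m: 16, ef_construction: 100).
es.indices.create(
    index="beir-passages-int8",
    mappings={
        "properties": {
            "passage": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
                "index_options": {
                    "type": "int8_hnsw",
                    "m": 16,
                    "ef_construction": 100,
                },
            },
        }
    },
)
```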
Finally, we repeated the exercise for multilingual E5-base and the performance decrease was even smaller (0.59%) But this is not the whole story: The increased efficiency of the quantized HNSW indices and the fact that the original float vectors are still retained in the index allows us to recover a significant portion of the lost performance through rescoring . More specifically, we can retrieve a larger pool of candidates through approximate kNN search in the quantized index, which is quite fast, and then compute the similarity function on the original float vectors and re-score accordingly. As a proof of concept, we consider the NQ dataset which exhibited a large performance decrease (2.01%) with multilingual E5-small. By setting k=15 , num_candidates=100 and window_size=10 (as we are interested in nDCG@10) we get an improved score of 0.539 recovering about 20% of the performance. If we further increase the num_candidates parameter to 200 then we get a score that matches the performance of the full precision index but with faster response times. The same setup on Arguana leads to an increase from 0.379 to 0.382 and thus limiting the relative performance drop from 1.3% to only 0.52% Results The results of our evaluation suggest that scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch without significant loss in retrieval performance. The performance decrease is more pronounced for smaller vectors (multilingual E5-small produces vectors of size equal to 384 while E5-base gives 768-dimensional embeddings), but this can be mitigated through rescoring. We are confident that scalar quantization will be beneficial for most users and we plan to make it the default in 8.14. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Understanding scalar quantization in Elasticsearch Experimentation: Evaluating scalar quantization Overview of methodology Results Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Evaluating scalar quantization in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/evaluating-scalar-quantization", + "meta_description": "Learn how scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch through an experiment." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Bit vectors in Elasticsearch Discover what are bit vectors, their practical implications and how to use them in Elasticsearch. Vector Database BT By: Benjamin Trent On July 17, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. We have supported float values from the beginning of vector search in Elasticsearch. In version 8.6, we added support for byte encoded vectors. In 8.14, we added automatic quantization down to half-byte values. In 8.15, we are adding support for bit encoded vectors. But what are bit vectors and their practical implications? As stated on the tin, bit vectors are where each dimension of the vector is a single bit. When comparing data sizes for vectors with the typical float values, bit vectors provide a whopping 32x reduction in size. Every bit counts Some semantic embedding models natively output bit vectors such as Cohere . Additionally, some other kinds of data such as image hashing utilize bit vectors directly. However, most semantic embedding models output float vectors and do not support bit encoding directly. You can naively binarize vectors yourself since the math is simple. For each vector dimension, check if the value is > median . If it is, that is a 1 bit, and otherwise it is a 0 bit. Figure 0: Transforming 8 float values into individual bit values and then collapse to single byte , assuming the median value is 0 . Here is some simple Python code to binarize a vector: Obviously, this can lose a fair bit of information (pun intended). But for larger vectors or vectors specifically optimized to work well with bit encoding, the space savings can be worth it. Consider 1 million 1024 dimension floating point vectors. Each vector is 4KB in size and all vectors will require approximately 4GB. With binary quantization, each vector is now only 128 bytes and all vectors in total are only around 128MB. When you consider the cost of storage & memory, this is exceptionally attractive. Now, since we are no longer in float land, we cannot use typical distance functions like cosineSimilarity or dotProduct . Instead, we take advantage of each dimension being a single bit by using Hamming distance . hamming distance is fairly straight forward, for every individual bit , we calculate the xor with the corresponding bit in the other vector. 
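As a rough, hypothetical sketch (the post's original snippet is not reproduced in this text), the naive binarization described above and the XOR-then-sum Hamming computation might look like this, assuming a median of 0 as in the figure:

```python
import numpy as np

def binarize(vector, median=0.0):
    # 1 bit where the value is greater than the median, 0 otherwise,
    # then pack 8 dimensions into each byte.
    bits = (np.asarray(vector) > median).astype(np.uint8)
    return np.packbits(bits)

a = binarize([0.3, -1.2, 0.7, 0.0, 2.1, -0.4, 0.9, -0.1])
b = binarize([-0.5, 0.8, 0.1, -0.2, 1.4, 0.6, -0.9, 0.2])

# XOR the corresponding bits (a byte at a time), then count the set bits
# in the result to get the Hamming distance.
hamming = int(np.unpackbits(np.bitwise_xor(a, b)).sum())
print(hamming)
```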
Then we sum up the resulting bits. Figure 1: Hamming distance calculation between two bit elements. Let's think back to our 1 million 1024-dimension vectors. In addition to the space savings, using Hamming distance over 128 bytes vs. dotProduct over 1024 floats is a significant reduction in computation time. For some simple benchmarking (this is not exhaustive), we indexed 1 million 1024-dimension vectors in Elasticsearch with a flat index. With only 2GB of off-heap, bit vectors take approximately 40ms to return, but float takes over 3000ms. If we increase the off-heap to 4GB, bit vectors continue to take the same amount of time (they fit into memory even before) and float vectors improve to 200ms. So Hamming is still significantly faster than the floating point dot-product and requires significantly less memory. A bit of error Bit vectors aren't perfect; they are obviously a lossy encoding. The concern isn't that vectors will not be unique. Even with a bit encoding, 386-dimension vectors still have 2^386 possible unique vectors. The main concerns are distance collisions and the size of the error the encoding introduces. Even if we assume a well-distributed bit encoding, it's likely to have many distance collisions when gathering a large number of vectors. Intuitively, this makes sense as our distance measurement is summing the bits. For example, 00000001 and 10000000 are the same distance apart as 00000001 and 00000010. Once you need to gather more documents than there are dimensions, you will have collisions. In reality, it will occur much sooner than that. To illustrate, here is a small study. The focus here is finding out how many bit vectors would need gathering to get the true nearest top k vectors. For the first experiment, we used 1 million CohereV3 vectors from their Wikipedia dataset. We randomly sampled (without replacement) 50 query vectors and used those to determine true dotProduct and Hamming distances. Here are the \"best\" and \"worst\" performing query vectors, with quality being the number of documents required to retrieve the correct 100 nearest neighbors (i.e., more being worse). Figure 2: The best performing CohereV3 query vector & its distances; you can see how the distances are actually aligning well. Figure 3: The worst performing CohereV3 query vector & its distances. Here, the nearer distances align well, but that correlation weakens as we start gathering vectors that are further away. Figure 4: The median number of vectors required to get the true k nearest neighbors over all 50 queries. CohereV3 is excellent here, showing that only around 10x oversampling is required, even for the 100th nearest neighbor. Visually, however, we can see that the oversampling required increases exponentially. From this small study, CohereV3 does exceptionally well, the median case showing you can oversample by approximately 10x to achieve similar recall. However, in the worst case, when gathering more than 50 nearest documents, it starts being problematic, requiring much more than 10x oversampling. Depending on the query and the dataset, you can run into problems. So, how well does binarization do when a model and dataset combination are not optimized for bit vectors? We used e5-small-v2 and embedded the Quora dataset to test this. We randomly took 500k vectors and then randomly sampled 50 query vectors from those. Figure 5: The best performing e5-small query vector & its distances. 
The extremely near distances align fairly well, but still not exceptionally so. Figure 6: The worst performing e5-small query vector & its distances. The hamming and dotProduct distances are effectively uncorrelated. Figure 7: The median number of vectors required to get the true k k k nearest neighbors. The best e5-small vector does moderately well and its hamming distances are semi-correlated with the dotProduct . The worst case is a drastically different story. The distances are effectively uncorrelated. The median values show that you would need to oversample by approximately 800x to achieve the nearest 10 vectors and it only gets worse from there. In short, for models that do well with binary quantization and when the model is well adapted to the dataset, bit quantization is a great option. That said, keep in mind that the oversampling required can increase exponentially as you gather more vectors. For out-of-domain data sets where nearest vectors are not well distinguished for the model, or for models that are not optimized for binary quantization at all, bit vectors can be problematic, even with a small number of nearest vectors. Ok, but how do I use bit vectors? When using bit vectors in Elasticsearch, you can specify the bit encoding in the mapping. For example: Figure 8: Mapping a bit vector in Elasticsearch, allowing for bit encoding. The first document will statically set the bit dimensions Or if you do not want to index in the HNSW index , you can use the flat index type. Figure 9: Mapping a bit vector in Elasticsearch in a flat index type. Then, to index a document with a bit vector, you can use the following: Figure 10: A 1024 dimensioned bit vector in hexidecimal format. Now you can utilize a knn query Figure 11: Querying bit vectors with a 1024 dimensioned hexidecimal vector. Just a bit more Thank you for making it through all the 2-bit jokes. We are very excited about the possibilities that bit vectors bring to Elasticsearch in 8.15. Please try it out in Elastic Cloud once 8.15 is released or in Elasticsearch Serverless right now! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Every bit counts A bit of error Ok, but how do I use Just a bit more Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Bit vectors in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/bit-vectors-in-elasticsearch", + "meta_description": "Discover what are bit vectors, their practical implications and how to use them in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series The ColPali model series Introducing the ColPali model, its implementation in Elasticsearch, and how to scale late interaction models for large-scale vector search. Part1 Search Relevance Vector Database +1 March 18, 2025 Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. PS BT By: Peter Straßer and Benjamin Trent Part2 Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "The ColPali model series - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/colpali-model-elasticsearch", + "meta_description": "Introducing the ColPali model, its implementation in Elasticsearch, and how to scale late interaction models for large-scale vector search.\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Lucene bug adventures: Fixing a corrupted index exception Sometimes, a single line of code takes days to write. Here, we get a glimpse of an engineer's pain and debugging over multiple days to fix a potential Apache Lucene index corruption. 
Lucene BT By: Benjamin Trent On December 27, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Be prepared: This particular blog is different than usual. It's not an explanation of a new feature or a tutorial. This is about a single line of code that took three days to write. We'll be fixing a potential Apache Lucene index corruption. Some takeaways I hope you will have: All flaky tests are repeatable, given enough time and the right tools Many layers of testing are key for robust systems. However, higher levels of tests become increasingly more difficult to debug and reproduce. Sleep is an excellent debugger How Elasticsearch tests At Elastic, we have a plethora of tests that run against the Elasticsearch codebase. Some are simple and focused functional tests, others are single node “happy path” integration tests, and yet others attempt to break the cluster to make sure everything behaves correctly in a failure scenario. When a test continually fails, an engineer or tooling automation will create a github issue and flag it for a particular team to investigate. This particular bug was discovered by a test of the last kind. These tests are tricky, sometimes only being repeatable after many runs. What is this test actually testing? This particular test is an interesting one. It will create a particular mapping and apply it to a primary shard. Then when attempting to create a replica. The key difference is that when the replica attempts to parse the document, the test injects an exception, thus causing the recovery to fail in a surprising (but expected) way. Everything was working as expected, however, with one significant catch. During the test cleanup, we validated consistency, and there, this test ran into a snag. This test was failing to fail an expected manner. During the consistency check we would verify that all the replicated and primary Lucene segment files were consistent. Meaning, uncorrupted and fully replicated. Having partial data or corrupted data is way worse than having something fail fully. Here is the scary and abbreviated stack trace of the failure. Somehow, during the forced replication failure the replicated shard ended up getting corrupted! Let me explain the key part of the error in plain english. Lucene is a segment based architecture, meaning each segment knows and manages its own read-only files. This particular segment was being validated via its SegmentCoreReaders to ensure everything was copacetic. Each core reader has metadata stored that indicates what field types and files exist for a given segment. However, when validating the Lucene90PointsFormat , certain expected files were missing. With the segments _0.cfs file we expected a point format file called kdi . cfs stands for \"compound file system\" into which Lucene will sometimes combine all field types and all tiny files into a single larger file for more efficient replication and resource utilization. In fact, all three of the point file extensions: kdd , kdi , and kdm were missing. How could we get into the place where a Lucene segment expects to find a point file but it's missing!?! Seems like a scary corruption bug! The first step for every bug fix, replicate it Replicating the failure for this particular bug was extremely painful. 
While we take advantage of randomized value testing in Elasticsearch, we are sure to provide every failure with a (hopefully) reproducible random seed to ensure all failures can be investigated. Well, this works great for all failures except for those caused by a race condition . No matter how many times I tried, the particular seed never repeated the failure locally. But, there are ways to exercise the tests and push towards a more repeatable failure. Our particular test suite allows for a given test to be run more than once in the same command via the -Dtests.iters parameter. But this wasn’t enough, I needed to make sure that the execution threads were switching and thus increasing the likelihood of this race condition occurring. Another wrench in the system was that the test ended up taking so long to run, the test runner would timeout. In the end, I used the following nightmare bash to repeatably run the test: In comes stress-ng . This allows you to quickly start a process that will just eat CPU cores for lunch. Randomly spamming stress-ng while running numerous iterations of the failing test finally allowed me to replicate the failure. One step closer. To stress the system, just open another terminal window and run: Revealing the bug Now that the test failure revealing the bug is mostly repeatable, time to try and find the cause. What makes this particular test strange is that Lucene is throwing because it expects point values, but none are added directly by the test. Only text values. This pushed me to consider looking at recent changes to our optimistic concurrency control fields: _seq_no and _primary_term . Both of these are indexed as points and exist in every Elasticsearch document. Indeed a commit did change our our _seq_no mapper! YES! This has to be the cause! But, my excitement was short-lived. This only changed the order of when fields got added to the document. Before this change, _seq_no fields were added last to the document. After, they were added first. No way the order of adding fields to a Lucene document would cause this failure... Yep, changing the order of when fields were added caused the failure. This was surprising and turns out to be a bug in Lucene itself! Changing the order of what fields are parsed shouldn’t change the behavior of parsing a document. The bug in Lucene Indeed, the bug in Lucene focused on following conditions: Indexing a points value field (e.g. _seq_no ) Trying to index a text field throw during analysis In this weird state, we open an Near Real Time Reader from the writer that experience the text index analysis exception But no matter how many ways I tried, I couldn’t fully replicate. I directly added pause points for debugging throughout the Lucene codebase. I attempted randomly opening readers during the exception path. I even printed out megabytes and megabytes of logs trying to find the exact path where this failure occurred. I just couldn’t do it. I spent a whole day fighting and losing. Then I slept. The next day I re-read the original stack trace and discovered the following line: In all my recreation attempts, I never specifically set the retention merge policy. The SoftDeletesRetentionMergePolicy is used by Elasticsearch so that we can accurately replicate deletions in replicas and ensure all our concurrency controls are in charge of when documents are actually removed. Otherwise, Lucene is in full control and will remove them at any merge. 
Once I added this policy and replicated the most basic steps mentioned above, the failure immediately replicated. I have never been more happy to open a bug in Lucene . While it presented itself as a race condition in Elasticsearch, it was simple to write a repeatably failing test in Lucene once all the conditions were met. In the end, like all good bugs, it was fixed with just 1 line of code. Multiple days of work, for just one line of code. But it was worth it. Not the end Hope you enjoyed this wild ride with me! Writing software, especially software as widely used and complex as Elasticsearch and Apache Lucene is rewarding. However, at times, it’s exceptionally frustrating. I both love and hate software. The bug fixing is never over! Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Be prepared: How Elasticsearch tests What is this test actually testing? The first step for every bug fix, replicate it Revealing the bug Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Lucene bug adventures: Fixing a corrupted index exception - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/lucene-corrupted-index-exception", + "meta_description": "Learn how to debug and fix a Lucene index corruption with a real-life example from the Elastic engineering team." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. Ingestion How To AL By: Andre Luiz On February 4, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. What is Logstash? Logstash is a widely used Elastic Stack tool for processing large volumes of log data in real-time. It acts as an efficient data pipeline, integrating information from various sources into a single structured flow. Its primary function is to reliably perform data extraction, transformation, and loading. Logstash offers several advantages, particularly its versatility in supporting multiple types of inputs, filters, and outputs, enabling integration with a wide range of sources and destinations. It processes data in real-time, capturing and transforming information. Its native integration with the Elastic Stack, especially Elasticsearch and Kibana, facilitates data analysis and visualization. Additionally, it includes advanced filters that enable efficient data normalization, enrichment, and transformation. How does Logstash work? Logstash is composed of inputs, filters, and outputs, which form the data processing pipeline. These components are configured in a .config file that defines the data ingestion flow. Inputs : Capture data from various sources. Filters : Process and transform the captured data. Outputs : Send the transformed data to defined destinations. The most common types of each component are presented below: Types of Inputs: File : Reads log files in various formats (text, JSON, CSV, etc.). Message Queues : Kafka, RabbitMQ. APIs : Webhooks or other data collection APIs. Databases : JDBC connections for relational data extraction. Types of Filters: Grok : For analyzing and extracting text patterns. Mutate : Modifies fields (renames, converts types, removes data). Date : Converts date and time strings into a readable date format. GeoIP : Enriches logs with geographic data. JSON : Parses or generates JSON data. Types of Outputs: Elasticsearch : The most common destination, Elasticsearch is a search and analytics engine that allows powerful searches and visualizations of data indexed by Logstash. Files : Stores processed data locally. Cloud Services : Logstash can send data to various cloud services, such as AWS S3, Google Cloud Storage, Azure Blob Storage, for storage or analysis. Databases : Logstash can send data to various other databases, such as MySQL, PostgreSQL, MongoDB, etc., through specific connectors. Logstash Elasticsearch data ingestion In this example, we implement data ingestion into Elasticsearch using Logstash. The steps configured in this example will have the following flow: Kafka will be used as the data source. Logstash will consume the data, apply filters such as grok, geoip, and mutate to structure it. The transformed data will be sent to an index in Elasticsearch. Kibana will be used to visualize the indexed data. Prerequisites We will use Docker Compose to create an environment with the necessary services: Elasticsearch, Kibana, Logstash, and Kafka. The Logstash configuration file, named l ogstash.conf , will be mounted directly into the Logstash container. 
Below we will detail the configuration of the configuration file. Here is docker-compose.yml: As mentioned above, the Logstash pipeline will be defined, in this step we will describe the Input, Filter and Output configurations. The logstash.conf file will be created in the current directory (where docker-compose.yml is located). In docker-compose.yml the logstash.conf file that is on the local file system will be mounted inside the container at the path /usr/share/logstash/pipeline/logstash.conf. Logstash pipeline configuration The Logstash pipeline is divided into three sections: input, filter, and output. Input: Defines where the data will be consumed from (in this case, Kafka). Filter: Applies transformations and structuring to the raw data. Output: Specifies where the processed data will be sent (in this case, Elasticsearch). Next, we will configure each of these steps in detail. Input configuration The data source is a Kafka topic and to consume the data from the topic it will be necessary to configure the Kafka input plugin. Below is the configuration for the Kafka plugin in Logstash, where we define: bootstrap_servers : Address of the Kafka server. topics : Name of the topic to be consumed. group_id : Consumer group identifier. With this, we are ready to receive the data. Filter configuration Filters are responsible for transforming and structuring data. Let's configure the following filters: Grok filter Extracts structured information from unstructured data. In this case, it extracts the timestamp, log level, client IP, URI, status, and the JSON payload. The example log: Extracted Fields: timestamp : Extracts the date and time (e.g., 2025-01-05T16:30:15). log_level : Captures the log level (e.g., INFO, ERROR). client_ip : Captures the client's IP address (e.g., 69.162.81.155). uri : Captures the URI path (e.g., /api/products). status : Captures the HTTP status code (e.g., 200). Date filter Converts the timestamp field into a format readable by Elasticsearch and stores it in @timestamp. GeoIP filter Next, we will use the geoip filter to retrieve geographic information, such as country, region, city, and coordinates, based on the value of the client_ip field. Mutate filter The mutate filter allows transformations on fields. In this case, we will use two of its properties: remove_field : Removes the timestamp and message fields, as they are no longer needed. convert : Converts the status field from a string to an integer. Output configuration The output defines where the transformed data will be sent. In this case, we will use Elasticsearch. We now have our configuration file defined. Below is the complete file: Send and ingest data With the containers running, we can start sending messages to the topic and wait for the data to be indexed. First, create the topic if you haven't already. To send the messages, execute the following command in the terminal: Messages to be sent: To view the indexed data, go to Kibana: Once the indexing has been successfully completed, we can view and analyze the data in Kibana. The mapping and indexing process ensures that the fields are structured according to the configurations defined in Logstash. Conclusion With the configuration presented, we created a pipeline using Logstash to index logs in a containerized environment with Elasticsearch and Kafka. We explored Logstash's flexibility to process messages using filters such as grok, date, geoip, and mutate, structuring the data for analysis in Kibana. 
Additionally, we demonstrated how to configure the integration with Kafka to consume messages and use them for processing and indexing the data. References Logstash https://www.elastic.co/guide/en/logstash/current/index.html Logstash docker https://www.elastic.co/guide/en/logstash/current/docker.html GeoIp plugin https://www.elastic.co/guide/en/logstash/current/plugins-filters-geoip.html Mutate plugin https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html Grok plugin https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html Kafka plugin https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is Logstash? How does Logstash work? Logstash Elasticsearch data ingestion Prerequisites Logstash pipeline configuration Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to ingest data to Elasticsearch through Logstash - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-logstash-ingest-data", + "meta_description": "A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Advanced RAG techniques In this series, we'll discuss and implement techniques that may increase RAG performance. Part1 Vector Database Generative AI August 14, 2024 Advanced RAG techniques part 1: Data processing Discussing and implementing techniques which may increase RAG performance. 
Part 1 of 2, focusing on the data processing and ingestion component of an advanced RAG pipeline. HC By: Han Xiang Choong Part2 Vector Database Generative AI August 15, 2024 Advanced RAG techniques part 2: Querying and testing Discussing and implementing techniques which may increase RAG performance. Part 2 of 2, focusing on querying and testing an advanced RAG pipeline. HC By: Han Xiang Choong Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Advanced RAG techniques - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/advanced-rag-techniques", + "meta_description": "In this series, we'll discuss and implement techniques that may increase RAG performance. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. ES|QL Ruby How To FB By: Fernando Briano On October 24, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction The Elasticsearch Ruby client can be used to craft ES|QL queries and make it easier to work with the data returned from esql.query . ES|QL allows developers to filter, transform, and analyze data stored in Elasticsearch via queries. It uses \"pipes\" ( | ) to work with the data step by step. The esql.query API has been supported in the Elasticsearch Ruby Client since it was available as experimental in version 8.11.0 . You can execute an ES|QL request with the following code: The default response is parsed from JSON (you can also get a CSV or text by passing in the format parameter), and it looks like this: ES|QL Helper In Elasticsearch Ruby v8.13.0 , the client introduced the ES|QL Helper for the esql.query API. Instead of the default response, the helper returns an array of hashes with the columns as keys and the respective values instead of the default JSON value. Additionally, you can iterate through the response values and transform the data in by passing in a Hash of column => Proc values. You could use this for example to convert a @timestamp column value into a DateTime object. We'll take a look at how to use this with example data. Setup and ingesting data For this example, we're using the JSON dump from TheGamesDB, a community driven crowd-sourced games information website. 
Once we've downloaded the JSON file, we can ingest it into Elasticsearch by using another Helper form the Ruby client, the Bulk Helper . The data includes the list of all games in the database within the data.games keys. It also includes platforms and box art information, but for the purpose of this example, we're only going to use the games data. The BulkHelper provides a way to ingest a JSON file directly into Elasticsearch. To use the helper, we need to require it in our code, and instantiate it with a client and an index on which to perform the bulk action (we can change the index later on an already instantiated helper). We can use ingest_json and pass in the JSON file, the keys where it can find the data, and slice to separate the documents in batches before sending them to Elasticsearch: This will ingest all the game titles with their respective information into the videogames index. Using the ES|QL Helper With the data loaded, we can now query it with ES|QL: If we run this query with the esql.query API directly, we'll get the columns/values result: The helper however, returns an Array of Hashes with the columns as keys and the respective values. So we can work with the response, and access the value for each Hash in the Array with the name of a column as the key: The ESQLHelper also provides the ability to transform the data in the response. We can do this by passing in a Hash of column => Proc values. For example, let's say we want to format the release date in this previous query to show a more human friendly date. We can run this: If we run the same code from before, we'll get this result: You can pass in as many Procs as there are columns in the response. For example, the data includes a youtube field, where sometimes the URL for a video on YouTube is stored, other times just the video hash (e.g. U4bKxcV5hsg ). The URL for a YouTube video follows the convention https://youtube.com/watch?v=VIDEOHASH . So we could also add a parser to prepend the URL to the values that only include the hash: If we then run response.map { |a| a['youtube'] }.compact , we'll get just the URLs for YouTube videos for the videogames we're looking for. Conclusion As you can see, the ESQLHelper class can make it easier to work with the data returned from esql.query . You can learn more about the Elasticsearch Ruby Client and its helpers in the official documentation . And if you have any feedback, questions or requests, don't hesitate to create a new issue in the client's repository. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction ES|QL Helper Setup and ingesting data Using the ES|QL Helper Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to use the ES|QL Helper in the Elasticsearch Ruby Client - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-ruby-helper-elasticsearch", + "meta_description": "Explore the ES|QL helper and discover how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Build a multimodal image retrieval system using KNN search and CLIP embeddings Learn how to build a powerful semantic image search engine with Roboflow Inference and Elasticsearch. How To JG By: James Gallagher On January 27, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this guide, we are going to walk through how to build an image retrieval system using KNN clustering in Elasticsearch and CLIP embeddings computed with Roboflow Inference , a computer vision inference server. Roboflow Universe , the largest repository of computer vision data on the web with more than 100 million images hosted, uses CLIP embeddings to enable efficient, semantic queries for our dataset search engine. Without further ado, let’s get started! Introduction to CLIP and Roboflow Inference CLIP (Contrastive Language-Image Pretraining) is a computer vision model architecture and model developed by OpenAI. The model was released in 2021 under an MIT license. The model was trained “to predict the most relevant text snippet, given an image”. In so doing, CLIP learned to identify the similarity between images and text with the vectors the model uses. CLIP maps images and text into the vector space. This allows vectors to compare and find images similar to a text query, or images similar to another image. The advancement of multimodal models like CLIP has made it easier than ever to build a semantic image search engine. Models like CLIP can be used to create “embeddings” that capture semantic information about an image or a text query. Vector embeddings are a type of data representation that converts words, sentences, and other data into numbers that capture their meaning and relationships. 
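To illustrate the comparison that this shared vector space makes possible, here is a tiny, self-contained sketch with made-up embedding values; real CLIP vectors have hundreds of dimensions, so these numbers are purely for demonstration:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity: closer to 1.0 means the vectors point the same way,
    # closer to 0.0 means they are unrelated.
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up 4-dimensional "embeddings" standing in for real CLIP vectors.
text_query = [0.1, 0.9, 0.2, 0.4]
image_a = [0.12, 0.85, 0.25, 0.35]  # similar to the query
image_b = [0.9, 0.05, 0.7, 0.1]     # dissimilar

print(cosine_similarity(text_query, image_a))  # higher score
print(cosine_similarity(text_query, image_b))  # lower score
```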
Roboflow Inference is a high-performance computer vision inference server. Roboflow Inference supports a wide range of state-of-the-art vision models, from YOLO11 for object detection to PaliGemma for visual question answering to CLIP for multimodal embeddings. You can use Roboflow Inference with a Python SDK, or in a Docker environment. In this guide, we will use Inference to calculate CLIP embeddings, then store them in an Elasticsearch cluster for use in building an image retrieval system. Prerequisites To follow this guide, you will need: An Elasticsearch instance that supports KNN search A free Roboflow account Python 3.12+ We have prepared a Jupyter Notebook that you can run on your computer or on Google Colab for use in following along with this guide. Open the notebook . Step #1: Set up an Elasticsearch index with KNN support For this guide, we will use the Elasticsearch Python SDK . You can install it using the following code: If you don’t already have an Elasticsearch cluster set up, refer to the Elasticsearch documentation to get started . Once you have installed the SDK and set up your cluster, create a new Python file and add the following code to connect to your client: To run embedding searches in Elasticsearch, we need an index mapping that contains a dense_vector property type. For this guide, we will create an index with two fields: a dense vector that contains the CLIP embedding associated with an image, and a file name associated with an image. Run the following code to create your index: The output should look similar to the following: The default index type used with KNN search is L2 Norm, also known as Euclidean distance. This distance metric doesn’t work well for CLIP similarity. Thus, above we explicitly say we want to create a cosine similarity index. CLIP embeddings are best compared with cosine similarity. For this guide, we will use a CLIP model with 512 dimensions. If you use a different CLIP model, make sure that you set the dims value to the number of dimensions of the vector returned by the CLIP model. Step #2: Install Roboflow Inference Next, we need to install Roboflow Inference and supervision, a tool for working with vision model predictions. You can install the required dependencies using the following command: This will install both Roboflow Inference and the CLIP model extension that we will use to compute vectors. With Roboflow Inference installed, we can start to compute and store CLIP embeddings. Step #3: Compute and store CLIP embeddings For this guide, we are going to build a semantic search engine for the COCO 128 dataset. This dataset contains 128 images sampled from the larger Microsoft COCO dataset. The images in COCO 128 are varied, making it an ideal dataset for use in testing our semantic search engine. To download COCO 128, first create a free Roboflow account . Then, navigate to the COCO 128 dataset page on Roboflow Universe , Roboflow’s open computer vision dataset community. Click “Download Dataset”: Choose the “YOLOv8” format. Choose the option to show a download code: Copy the terminal command to download the dataset. The command should look something like this: When you run the command, the dataset will be downloaded to your computer and unzipped. We can now start computing CLIP embeddings. Add the following code from your Python file from earlier, then run the full file: This code will loop through all images in the train split of the COCO 128 dataset and run them through CLIP with Roboflow Inference. 
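That code block is not reproduced in this text, but the loop it describes might look roughly like the sketch below; embed_image is a hypothetical stand-in for whatever call your CLIP inference server exposes, and the dataset path, index name, and field names are assumptions:

```python
import os
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # placeholder endpoint
IMAGE_DIR = "COCO-128-2/train/images"         # assumed download location

def embed_image(path: str) -> list[float]:
    # Hypothetical placeholder: call your CLIP inference server here and
    # return a 512-dimensional embedding for the image at `path`.
    raise NotImplementedError

for file_name in sorted(os.listdir(IMAGE_DIR)):
    vector = embed_image(os.path.join(IMAGE_DIR, file_name))
    # Store the embedding alongside the file name, matching the 512-dim
    # cosine dense_vector mapping created in Step #1.
    es.index(
        index="images",
        document={"file_name": file_name, "embedding": vector},
    )
```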
We then index the vectors in Elasticsearch alongside the file names related to each vector. It may take 1-2 minutes for the CLIP model weights to download. Your script will pause temporarily while this is done. The CLIP model weights are then cached on your system for future use. Note: When you run the code above, you may see a few warnings related to ExecutionProviders. This relates to the optimizations available in Inference for different devices. For example, if you deploy on CUDA the CoreMLExecutionProvide will not be available so a warning is raised. No action is required when you see these warnings. Step #4: Retrieve data from Elasticsearch Once you have indexed your data, you are ready to run a test query! To use a text as an input, you can use this code to retrieve an input vector for use in running a search: To use an image as an input, you can use this code: For this guide, let’s run a text search with the query “coffee”. We are going to use a k-nearest neighbours (KNN) search. This search type accepts an input embedding and finds values in our database whose embeddings are similar to the input. KNN search is commonly used for vector comparisons. KNN search always returns the top k nearest neighbours. If k = 3, Elasticsearch will return the three most similar documents to the input vector. With Elasticsearch, you can retrieve results from a large vector store in milliseconds. We can run a KNN search with the following code: The k value above indicates how many of the nearest vectors should be retrieved from each shard. The size parameter of a query determines how many results to return. Since we are working with one shard for this demo, the query will return three results. Our code returns: We have successfully run a semantic search and found images similar to our input query! Above, we can see the three most similar images: a photo of a coffee cup and a cake on a table outdoors, then two duplicate images in our index with coffee cups on tables. Conclusion With Elasticsearch and the CLIP features in Roboflow Inference, you can create a multimodal search engine. You can use the search engine for image retrieval, image comparison and deduplication, multimodal Retrieval Augmented Generation with visual prompts, and more. Roboflow uses Elasticsearch and CLIP extensively at scale. We store more than 100 million CLIP embeddings and index them for use in multimodal search for our customers who want to search through their datasets at scale. Through the growth of data on our platform from hundreds of images to hundreds of millions, Elasticsearch has scaled seamlessly. To learn more about using Roboflow Inference, refer to the Roboflow Inference documentation . To find data for your next computer vision project, check out Roboflow Universe . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction to CLIP and Roboflow Inference Prerequisites Step #1: Set up an Elasticsearch index with KNN support Step #2: Install Roboflow Inference Step #3: Compute and store CLIP embeddings Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Build a multimodal image retrieval system using KNN search and CLIP embeddings - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/multimodal-image-retrieval-with-roboflow", + "meta_description": "Learn how to build a semantic image search engine using CLIP embeddings and Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds Azure AI Studio support Elasticsearch open inference API now supports Azure AI Studio. Learn how to use Azure AI Studio capabilities with Elasticsearch in this blog. Integrations Generative AI Vector Database How To MH By: Mark Hoy On May 22, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. As part of our ongoing commitment to serve the Microsoft Azure developers with the tools of their choice, we are happy to announce that Elasticsearch now provides integration of the hosted model catalog on Microsoft Azure AI Studio into our open inference API. This complements the ability for developers to bring their Elasticsearch vector database to be used in Azure OpenAI . Developers can use the capabilities of the world's most downloaded vector database to store and utilize embeddings generated from OpenAI models from Azure AI studio or access the wide array of chat completion model deployments for quick access to conversational models like mistral-small . Just recently we've added support for Azure OpenAI text embeddings and completion , and now we've added support for utilizing Azure AI Studio. Microsoft Azure developers have complete access to Azure OpenAI & Microsoft Azure AI Studio service capabilities and can bring their Elasticsearch data to revolutionize conversational search . Let's walk you through just how easily you can use these capabilities with Elasticsearch. 
Deploying a model in Azure AI Studio To get started, you'll need a Microsoft Azure subscription as well as access to Azure AI Studio . Once you are set up, you'll need to deploy either a text embedding model or a chat completion model from the Azure AI Studio model catalog . Once your model is deployed, on the deployment overview page take note of the target URL and your deployment's API key - you'll need these later to create your inference endpoint in Elasticsearch. Furthermore, when you deploy your model, Azure offers two different types of deployment options - a “pay as you go” model (where you pay by the token), and a “realtime” deployment which is a dedicated VM that is billed by the hour. Not all models will have both deployment types available, so be sure to take note as well as which deployment type is used. Creating an Inference API Endpoint in Elasticsearch Once your model is deployed, we can now create an endpoint for your inference task in Elasticsearch. For the examples below we are using the Cohere Command R model to perform chat completion. In Elasticsearch, create your endpoint by providing the service as “azureaistudio”, and the service settings including your API key and target from your deployed model. You'll also need to provide the model provider, as well as the endpoint type from before (either “token” or “realtime”). In our example, we've deployed a Cohere model with a token type endpoint. When you send Elasticsearch the command, it should return back the created model to confirm that it was successful. Note that the API key will never be returned and is stored in Elasticsearch's secure settings. Adding a model for using text embeddings is just as easy. For reference, if we had deployed the Cohere-embed-v3-english model , we can create our inference model in Elasticsearch with the “text_embeddings” task type by providing the appropriate API key and target URL from that deployment's overview page: Let's perform some inference That's all there is to setting up your model. Now that that's out of the way, we can use the model. First, let's test the model out by asking it to provide some text given a simple prompt. To do this, we'll call the _inference API with our input text: And we should see Elasticsearch provide a response. Behind the scenes, Elasticsearch is calling out to Azure AI Studio with the input text and processes the results from the inference. In this case, we received the response: We've tried to make it easy for the end user to not have to deal with all the technical details behind the scenes, but we can also control our inference a bit more by providing additional parameters to control the processing such as sampling temperature and requesting the maximum number of tokens to be generated: That was easy. What else can we do? This becomes even more powerful when we are able to use our new model in other ways such as adding additional text to a document when it's used in an Elasticsearch ingestion pipeline. For example, the following pipeline definition will use our model and anytime a document using this pipeline is ingested, any text in the field “question_field” will be sent through the inference API and the response will be written to the “completed_text_answer” field in the document. This allows large batches of documents to be augmented. 
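To make that flow concrete, here is a minimal sketch using the elasticsearch Python client's low-level perform_request helper (the endpoint id, target URL, and keys are illustrative placeholders; the service settings follow the “azureaistudio” fields described above, with a Cohere model on a “token” endpoint):

from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_ELASTIC_API_KEY")
json_headers = {"content-type": "application/json", "accept": "application/json"}

# Create a chat completion inference endpoint backed by the Azure AI Studio deployment.
es.perform_request(
    "PUT",
    "/_inference/completion/azure_ai_studio_completion",
    headers=json_headers,
    body={
        "service": "azureaistudio",
        "service_settings": {
            "api_key": "YOUR_AZURE_DEPLOYMENT_API_KEY",
            "target": "https://your-deployment.region.inference.ai.azure.com",  # from the deployment overview page
            "provider": "cohere",      # the provider of the deployed model
            "endpoint_type": "token",  # "token" (pay as you go) or "realtime"
        },
    },
)

# Ask the deployed model for a completion through the new endpoint.
response = es.perform_request(
    "POST",
    "/_inference/completion/azure_ai_studio_completion",
    headers=json_headers,
    body={"input": "What is the capital of France?"},
)
print(response)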
Limitless possibilities By harnessing the power of Azure AI Studio deployed models in your Elasticsearch inference pipelines, you can enhance your search experience's natural language processing and predictive analytics capabilities. In upcoming versions of Elasticsearch, users can take advantage of new field mapping types that simplify the process even further where designing an ingest pipeline would no longer be necessary. Also, as alluded to in our accelerated roadmap for semantic search the future will provide dramatically simplified support for inference tasks with Elasticsearch retrievers at query time. These capabilities are available through the open inference API in our stateless offering on Elastic Cloud. It'll also be soon available to everyone in an upcoming versioned Elasticsearch release. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Deploying a model in Azure AI Studio Creating an Inference API Endpoint in Elasticsearch Let's perform some inference That was easy. What else can we do? Limitless possibilities Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open inference API adds Azure AI Studio support - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-azure-ai-studio-support", + "meta_description": "Elasticsearch open inference API now supports Azure AI Studio. Learn how to use Azure AI Studio capabilities with Elasticsearch in this blog." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API for Google AI Studio Elasticsearch open inference API adds support for Google AI Studio Integrations Python How To JV By: Jeff Vestal On September 27, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch's open inference API supports the Gemini Developer API. When using Google AI Studio, developers can chat with data in their Elasticsearch indexes, run experiments, and build applications using Google Cloud’s models, such as Gemini 1.5 Flash. AI Studio is where Google releases the latest models from Google DeepMind and is the fastest way to start building with Gemini. In this blog, we will create a new Google AI Studio project, create an Elasticsearch inference API endpoint to use Gemini 1.5 Flash, and implement a sample chat app to estimate how many ducks fit on an American football field! (Because why not?) AI Studio API key To get started, we need to create an API key for AI Studio. Head over to ai.google.dev/aistudio and click “Sign In to Google AI Studio.” If you aren’t already logged in, you will be prompted to do so. Once logged in, you are presented with two options: using AI Studio in the browser to test prompts with Gemini or creating an API key. We will create an API key to allow Elasticsearch to connect to AI Studio. You are prompted to accept Google Cloud’s terms and conditions the first time you create an API key. If you use a personal account, you will be given the option to create an API key in a new project. You may not see that option if you use an enterprise account, depending on your access roles. Either way, you can select an existing project to create the key. Select an existing project or create a new project. Copy the generated API key someplace safe for use in the next section. Elasticsearch Inference API We will use Python to configure the Inference API to connect to Google AI Studio and test the chat completion with Gemini. Create the Inference Endpoint Create an Elasticsearch connection. Create the Inference Endpoint to connect to Google AI Studio. For this blog, we will use the Gemini 1.5 Flash model. For a list of available models, consult the Gemini docs. Confirm the endpoint was created. The output should be similar to: Chat time! That's all it takes to create an Elasticsearch API Endpoint to access Google AI Studio! With that done, you can start using it. We will ask it to estimate how many ducks fit on an American football field. Why? Why not. Response Simple and powerful, at the same time With the addition of Google AI Studio , the Elastic open inference API provides access to a growing choice of powerful generative AI capabilities for developers. Google AI Studio is designed to enable simple, quick generative AI experiments to test your best ideas. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. 
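A rough sketch of the endpoint creation and chat call described above, again via the elasticsearch Python client's low-level perform_request helper (the endpoint id and keys are illustrative placeholders; the service settings follow the “googleaistudio” service with the Gemini 1.5 Flash model used in this blog):

from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_ELASTIC_API_KEY")
json_headers = {"content-type": "application/json", "accept": "application/json"}

# Create the completion endpoint that connects Elasticsearch to Google AI Studio.
es.perform_request(
    "PUT",
    "/_inference/completion/google_ai_studio_completion",
    headers=json_headers,
    body={
        "service": "googleaistudio",
        "service_settings": {
            "api_key": "YOUR_GOOGLE_AI_STUDIO_API_KEY",
            "model_id": "gemini-1.5-flash",
        },
    },
)

# Chat time: ask Gemini the duck question.
response = es.perform_request(
    "POST",
    "/_inference/completion/google_ai_studio_completion",
    headers=json_headers,
    body={"input": "Estimate how many ducks could fit on an American football field."},
)
print(response)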
EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to AI Studio API key Elasticsearch Inference API Create the Inference Endpoint Chat time! Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open inference API for Google AI Studio - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/google-ai-studio-elasticsearch-open-inference-api", + "meta_description": "Elasticsearch open inference API supports Google AI Studio. Learn how to create a Google AI Studio project, an Elasticsearch inference API endpoint and a chat app. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Semantic reranking & the Elastic Rerank model Introducing the concept of semantic reranking and Elastic Rerank, Elastic's new semantic re-ranker model. Part1 ML Research Search Relevance October 29, 2024 What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Part2 ML Research November 25, 2024 Introducing Elastic Rerank: Elastic's new semantic re-ranker model Learn about how Elastic's new re-ranker model was trained and how it performs. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Part3 ML Research December 5, 2024 Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Semantic reranking & the Elastic Rerank model - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/semantic-reranking-and-the-elastic-rerank-model", + "meta_description": "Introducing the concept of semantic reranking and Elastic Rerank, Elastic's new semantic re-ranker model." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Vector search in Elasticsearch: The rationale behind the design In this blog, you'll learn how vector search has been integrated into Elasticsearch and the trade-offs that we made. Vector Database ML Research AG By: Adrien Grand On July 24, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Are you interested to learn about the characteristics of Elasticsearch for vector search and what the design looks like? As always, design decisions come with pros and cons. This blog aims to break down how we chose to build vector search in Elasticsearch. Vector search is integrated in Elasticsearch through Apache Lucene Some background about Lucene first: Lucene organizes data into immutable segments that are merged periodically. Adding more documents requires adding more segments. Modifying existing documents requires atomically adding more segments and marking previous versions of these documents as deleted. Every document within a segment is identified by a doc ID, which is the index of this document within the segment, similar to indices of an array. The motivation for this approach is managing inverted indices, which aren't good at in-place modifications but can be merged efficiently. In addition to inverted indices, Lucene also supports stored fields (a document store), doc values (columnar storage), term vectors (per-document inverted indices), and multi-dimensional points in its segments. Vectors have been integrated the same way: New vectors get buffered into memory at index time. These in-memory buffers get serialized as part of segments when the size of the index-time buffer is exceeded or when changes must be made visible. Segments get periodically merged together in the background in order to keep the total number of segments under control and limit the overall per-segment search-time overhead. Since they are part of segments, vectors need to be merged too. Searches must combine top vector hits across all segments in the index. Searches on vectors must look at the set of live documents in order to exclude documents that are marked as deleted. The system above is driven by the way that Lucene works. Lucene currently uses the hierarchical navigable small world (HNSW) algorithm to index vectors. 
At a high level, HNSW organizes vectors into a graph where similar vectors are likely to be connected. HNSW is a popular choice for vector search because it is rather simple, performs well on comparative benchmarks for vector search algorithms, and supports incremental insertions. Lucene's implementation of HNSW follows Lucene's guideline of keeping the data on disk and relying on the page cache to speed up access to frequently accessed data. Approximate vector search is exposed in Elasticsearch's _search API through the knn section . Using this feature will directly leverage Lucene's vector search capabilities. Vectors are also integrated in Elasticsearch's scripting API, which allows performing exact brute-force search , or leveraging vectors for rescoring. Let's now dive into the pros and cons of integrating vector search through Apache Lucene. Cons The main cons of taking advantage of Apache Lucene for vector search come from the fact that Lucene ties vectors to segments. However, as we will see later in the pros section, tying vectors to segments is also what enables major features such as efficient pre-filtering, efficient hybrid search, and visibility consistency, among others. Merges need to recompute HNSW graphs Segment merges need to take N input segments, typically 10 with the default merge policy, and merge them into a single segment. Lucene currently creates a copy of the HNSW graph from the largest input segment that doesn't have deletes and then adds vectors from other segments to this HNSW graph. This approach incurs index-time overhead as segments are merged compared to mutating a single HNSW graph in-place over the lifetime of the index. Searches need to combine results from multiple segments Because an index is composed of multiple segments, searches need to compute the top-k vectors on every segment and then merge these per-segment top-k hits into global top-k hits. The impact on latency may be mitigated by searching segments in parallel, but this approach still incurs some overhead compared to searching a single HNSW graph. RAM needs to scale with the size of the data set to retain optimal performance Traversing the HNSW graph incurs lots of random access. To perform efficiently, data sets should fit into the page cache, which requires sizing RAM based on the size of the vector data set that is managed. There exist other algorithms than HNSW for vector search that have more disk-friendly access patterns, though they come with other downsides, like higher query latency or worse recall. Pros Data sets can scale beyond the total RAM size Because data is stored on disk, Elasticsearch will allow data sets that are larger than the total amount of RAM that is available on the local host, and performance will degrade as the ratio of the HNSW data that can fit in the page cache decreases. As described in the previous section, performance-savvy users will need to scale RAM size with the size of the data set to retain optimal performance. Lock-free search Systems that update data structures in place generally need to take locks in order to guarantee thread safety under concurrent indexing and search. Lucene's segment-based indices never require taking locks at search time, even in the case of concurrent indexing. Instead, the set of segments that the index is composed of is atomically updated on a regular basis. Support for incremental changes New vectors may be added, removed, or updated at any time. 
Some other approximate nearest-neighbor search algorithms require being fed with the entire data set of vectors. Then once all the vectors are provided, an index training step is executed. For these other algorithms, any significant update to the vector data set requires the training step to be completed again, and this can get computationally expensive. Visibility consistency with other data structures A benefit of integrating at such a low level into Lucene is that we get consistency with other data structures out of the box when looking at a point-in-time view of the index. If you perform an update of a document to update both its vector and some other keyword field, then concurrent searches are guaranteed to either see the old value of the vector field and the old value of the keyword field — if the point-in-time view was created prior to the update — or the new value of the vector field and the new value of the keyword field — if the point-in-time view was created after the update. Likewise for deletes, if a document gets marked as deleted, then either all data structures including the vector store will ignore it, or they will see it if they operate on a point-in-time view that was created prior to the deletion. Incremental snapshots The fact that vectors are part of segments helps snapshots remain incremental by taking advantage of the fact that two subsequent snapshots usually share the majority of their segments, especially the bigger ones. Incremental snapshots would not be possible with a single HNSW graph that gets mutated in-place. Filtering and hybrid support Integrating directly into Lucene also makes it possible to integrate efficiently with other Lucene features, such as pre-filtering vector searches with an arbitrary Lucene filter or combining hits coming from a vector query with hits coming from a traditional full-text query. By having its own HNSW graph that is tied to a segment and where nodes are indexed by doc ID, Lucene can make interesting decisions about how best to pre-filter vector searches: either by linearly scanning documents that match the filter if it is selective, or by traversing the graph and only considering nodes that match the filter as candidates for top-k vectors otherwise. Compatibility with other features Because the vector store is like any other Lucene data structure, many features are compatible with vectors and vector search automatically, including: Aggregations Document-level security Field-level security Index sorting Access to vectors through scripts (e.g., from a script_score query or a reranker) Looking ahead: Separation of indexing and search As discussed in another blog , future versions of Elasticsearch will run indexing and search workloads on different instances. The implementation will essentially look as if you were continuously creating snapshots on indexing nodes and restoring them on search nodes. This will help prevent the high cost of vector indexing from impacting searches. Such a separation of indexing and search wouldn't be possible with a single shared HNSW graph instead of multiple segments, short of sending the full HNSW graph over the wire every time changes need to be reflected on new searches. Conclusion In general, Elasticsearch provides excellent vector search capabilities that are integrated with other Elasticsearch features: Vector searches can be pre-filtered by any supported filter, including the most sophisticated ones. Vector hits can be combined with hits of arbitrary queries. 
Vector searches are compatible with aggregations, document-level security, field-level security, index sorting, and more. Indices that contain vectors still obey the same semantics as other indices, including for the _refresh, _flush and _snapshot APIs. They will also support separation of indexing and search in stateless Elasticsearch. This is done at the expense of some index-time and search-time overhead. That said, vector search still typically runs in the order of tens or hundreds of milliseconds and is much faster than a brute-force exact search. More generally, both the index-time and search-time overheads seem manageable compared to other vector stores in existing comparative benchmarks * (look for the \"luceneknn\" line). We also believe that a lot of the value of vector search gets unlocked through the ability to combine vector search with other functionality. Furthermore, we recommend checking out our tuning guide for KNN search , which lists a number of measures that help mitigate the negative impact of the aforementioned cons. I hope you enjoyed this blog. Don't hesitate to reach out via Discuss if you have questions. And feel free to try out vector search in your existing deployment, or spin up a free trial of Elasticsearch Service on Elastic Cloud (which always has the latest version of Elasticsearch). *At the time of this writing, these benchmarks do not yet take advantage of vectorization. For more information on vectorization, read this blog . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to Vector search is integrated in Elasticsearch through Apache Lucene Cons Merges need to recompute HNSW graphs Searches need to combine results from multiple segments RAM needs to scale with the size of the data set to retain optimal performance Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Vector search in Elasticsearch: The rationale behind the design - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vector-search-elasticsearch-rationale", + "meta_description": "In this blog, you'll learn how vector search has been integrated into Elasticsearch and the trade-offs that we made." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Implementing image search: vector search via image processing in Elasticsearch Learn how to implement image search with an example. This blog covers how to use vector search through image processing in Elasticsearch. Vector Database AS By: Alex Salgado On November 8, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Case study: Finding your puppy with image search Have you ever been in a situation where you found a lost puppy on the street and didn’t know if it had an owner? Using vector search through image processing in Elasticsearch, this task can be as simple as reading a comic strip. Imagine this scene: On a tumultuous afternoon, Luigi, a small and lively puppy, found himself wandering alone through busy streets after accidentally slipping out of his leash during a walk around Elastic. His desperate owner was searching for him at every corner, calling his name with a voice full of hope and anxiety. Meanwhile, somewhere in the city, an attentive person noticed the puppy with a lost expression and decided to help. Quickly, they took a photo of Luigi and, using the vector image search technology of the company they worked for, began a search in the database hoping to find some clue about the owner of the little runaway. If you want to follow and execute the code while reading, access the file Python code running on a Jupyter Notebook (Google Collab) . The architecture We'll solve this problem using a Jupyter Notebook. First, we download the images of the puppies to be registered, and then we install the necessary packages. *Note: To implement this sample, we will need to create an index in Elasticsearch before populating our vector database with our image data. Begin by deploying Elasticsearch (we have a 14-day free trial for you) . During the process, remember to store the credentials (username, password) to be used in our Python code. For simplicity, we will use Python code running on a Jupyter Notebook (Google Colab). Download the code zip file and install the necessary packages Let's create 4 classes to assist us in this task, and they are: Util class : responsible for handling preliminary tasks and Elasticsearch index maintenance. Dog class : responsible for storing the attributes of our little dogs. DogRepository class : responsible for data persistence tasks. DogService class : it will be our service layer. 
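Before looking at each class in detail, here is an illustrative sketch of how the Dog class could look (a sketch only: it assumes the sentence-transformers CLIP model 'clip-ViT-B-32' for the image embeddings, and the notebook's exact model and attribute names may differ):

from PIL import Image
from sentence_transformers import SentenceTransformer

# CLIP-style model that embeds images into a vector space we can search.
model = SentenceTransformer("clip-ViT-B-32")

class Dog:
    def __init__(self, dog_id, image_path, breed, owner_name):
        self.dog_id = dog_id
        self.image_path = image_path
        self.breed = breed
        self.owner_name = owner_name
        self.image_embedding = None

    def generate_embedding(self):
        # Turn the dog's photo into a vector that Elasticsearch can index.
        self.image_embedding = model.encode(Image.open(self.image_path)).tolist()

    def to_dict(self):
        # Shape the object the way the repository layer will send it to Elasticsearch.
        return {
            "dog_id": self.dog_id,
            "image_path": self.image_path,
            "breed": self.breed,
            "owner_name": self.owner_name,
            "image_embedding": self.image_embedding,
        }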
Util class The Util class provides utility methods for managing the Elasticsearch index, such as creating and deleting the index. Methods: create_index() : Creates a new index in Elasticsearch. delete_index() : Deletes an existing index from Elasticsearch. Dog class The Dog class represents a dog and its attributes, such as ID, image path, breed, owner name, and image embeddings. Attributes dog_id : The dog's ID. image_path : The path to the dog's image. breed : The dog's breed. owner_name : The dog's owner's name. image_embedding : The dog's image embedding. Methods __init__() : Initializes a new Dog object. generate_embedding() : Generates the dog's image embedding. to_dict() : Converts the Dog object to a dictionary. DogRepository Class The DogRepository class provides methods for persisting and retrieving dog data from Elasticsearch. Methods insert() : Inserts a new dog into Elasticsearch. bulk_insert() : Inserts multiple dogs into Elasticsearch in bulk. search_by_image() : Searches for similar dogs by image. DogService Class The DogService class provides business logic for managing dog data, such as inserting and searching for dogs. Methods insert_dog() : Inserts a new dog into Elasticsearch. search_dogs_by_image() : Searches for similar dogs by image. The classes presented above provide a solid foundation for building a dog data management system. The Util class provides utility methods for managing the Elasticsearch index. The Dog class represents the attributes of a dog. The DogRepository class offers methods for persisting and retrieving dog data from Elasticsearch. The DogService class provides the business logic for efficient dog data management. The main code We'll basically have 2 main flows or phases in our code: Register the Dogs with basic information and image. Perform a search using a new image to find the Dog in the vector database. Phase 01: Registering the Puppy To store the information about Luigi and the other company's little dogs, we'll use the Dog class. For this purpose, let's code the sequence: Start registering the puppies Output Registering Luigi Registering all the others puppies Visualizing the new dogs Output Phase 02: Finding the lost dog Now that we have all the little dogs registered, let's perform a search. Our developer took this picture of the lost puppy. Output Let's see if we find the owner of this cute little puppy? Get the results Let's see what we found... Output Voilà!! We found it!!! But who will be the owner and their name? Output Luigi Jack Russel/Rat Terrier Ully Happy end We found Luigi !!! Let's notify Ully. Output In no time, Ully and Luigi were reunited. The little puppy wagged his tail with pure delight, and Ully hugged him close, promising to never let him out of her sight again. They had been through a whirlwind of emotions, but they were together now, and that was all that mattered. And so, with hearts full of love and joy, Ully and Luigi lived happily ever after. Conclusion In this blog post, we have explored how to use vector search to find a lost puppy using Elasticsearch. We have demonstrated how to generate image embeddings for dogs, index them in Elasticsearch, and then search for similar dogs using a query image. This technique can be used to find lost pets, as well as to identify other objects of interest in images. Vector search is a powerful tool that can be used for a variety of applications. 
It is particularly well-suited for tasks that require searching for similar objects based on their appearance, such as image retrieval and object recognition. We hope that this blog post has been informative and that you will find the techniques we have discussed to be useful in your own projects. Resources Elasticsearch Guide Elasticsearch Python client Hugging Face - Sentence Transformers What is vector search? | Elastic Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Case study: Finding your puppy with image search The architecture Download the code zip file and install the necessary packages Util class Dog class Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Implementing image search: vector search via image processing in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/implementing-image-search-with-elasticsearch", + "meta_description": "Learn how to implement image search with an example. This blog covers how to use vector search through image processing in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Chunking large documents via ingest pipelines plus nested vectors equals easy passage search Learn how to chunk large documents using ingest pipelines and nested vectors in Elasticsearch for easy passage search in vector search. 
Vector Database How To MH By: Michael Heldebrant On November 15, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search is a powerful way to search data based on meaning rather than exact or inexact token matching techniques. However the text embedding models that power vector search can only process short passages of text on the order of several sentences rather than BM25 based techniques that can work on arbitrarily large amounts of text. Combining large documents seamlessly with vector search is now possible with Elasticsearch. How does it work at a high level? The combination of Elasticsearch features such as ingest pipelines, the flexibility of a script processor and new support for nested documents with dense_vectors allows for a straightforward way to at ingest time chunk large documents into small enough passages that can then be processed by text embedding models to generate all the vectors needed to represent the full meaning of the large documents. Ingest your document data as you would normally, and add to your ingest pipeline a script processor to break the large text data into an array of sentence or other types of chunks followed by a for_each processor to run an inference processor on each chunk. Mappings for the index are defined such that the array of chunks is set up as a nested object with a dense_vector mapping as a subobject which will then properly index each of the vectors and make them searchable. How to chunk large documents via ingest pipelines & nested vectors Load a text embedding model The first thing you will need is a model to create the text embeddings out of the chunks, you can use whatever you would like, but this example will run end to end on the all-distilroberta-v1 model. With an Elastic Cloud cluster created or another Elasticsearch cluster ready, we can upload the text embedding model using the eland library. Mappings example Next step is to prepare the mappings to handle the array of sentences and vector objects that will be created during the ingest pipeline. For this particular text embedding model the dimensions are 384 and dot_product similarity will be used for nearest neighbor calculations: Ingest pipeline examples The last preparation step is to define an ingest pipeline to break up the body_content field into chunks of text stored in the passages field. This pipeline has two processors, the first script processor breaks up the body_content field into an array of sentences stored in the passages field via a regular expression. For further research read up on regular expression advanced features such as negative lookbehind and positive lookbehind to understand how it tries to properly split on sentence boundaries, not split on Mr. or Mrs. or Ms., and keep the punctuation with the sentence. It also tries to concatenate the sentence chunks back together as long as the total string length is under the parameter passed to the script. 
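As a much-simplified sketch of that first, script-processor step (it splits naively on '.' instead of the more careful lookbehind/lookahead pattern described above, and the pipeline and field names are illustrative; shown here through the elasticsearch Python client):

from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

es.ingest.put_pipeline(
    id="chunk_passages",
    description="Split body_content into sentence-sized passages",
    processors=[
        {
            "script": {
                "source": """
                  ctx.passages = new ArrayList();
                  for (String sentence : ctx.body_content.splitOnToken('.')) {
                    if (sentence.trim().length() > 0) {
                      Map passage = new HashMap();
                      passage.put('text', sentence.trim());
                      ctx.passages.add(passage);
                    }
                  }
                """
            }
        }
    ],
)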
The next for_each processor runs the text embedding model on each sentence via an inference processor: Add some documents Now we can add documents with large amounts of text in body_content and automatically have them chunked, and each chunk text embedded into vectors by the model: Search those documents To search the data and return the chunk that matched the query best, you use inner_hits with the knn clause to return just that best matching chunk of the document in the hits output from the query: Will return the best document and the relevant portion of the larger document text: Review The approach used here shows the power of leveraging the different capabilities of Elasticsearch to solve a larger problem. Ingest pipelines allow you to preprocess your documents before indexing, and while there are many processors that do specific targeted tasks, sometimes you need the power of a scripting language to be able to do things like break up text into an array of sentences. Because you can access the document before it is indexed, you have the ability to remake the data in nearly any fashion you can imagine, as long as all the information is within the document itself. The for_each processor allows us to wrap something that may run zero to N times without knowing in advance how many times it needs to execute. In this case, we are using it to run the inference processor over however many sentences we extract, to make vectors. The mappings of the index are prepared to handle the array of text-and-vector objects that did not exist in the original document, using a nested object which indexes the data in a way that lets us properly search the document. Using knn with nested support for vectors allows the use of inner_hits to present the best scoring portion of the document, which can substitute for what would usually be done via highlighting in a BM25 query. Conclusion This hopefully shows how Elasticsearch can do what it does best: just bring your data, and Elasticsearch will make it searchable for you. Take your skills to the next level and learn how to implement a recursive chunking strategy by watching this video. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to How does it work at a high level?
How to chunk large documents via ingest pipelines & nested vectors Load a text embedding model Mappings example Ingest pipeline examples Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Chunking large documents via ingest pipelines plus nested vectors equals easy passage search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/chunking-via-ingest-pipelines", + "meta_description": "Learn how to chunk large documents using ingest pipelines and nested vectors in Elasticsearch for easy passage search in vector search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Steps to improve search relevance In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval. Generative AI GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey On July 13, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Since 8.0 and the release of third-party natural language processing (NLP) models for text embeddings, users of the Elastic Stack have access to a wide variety of models to embed their text documents and perform query-based information retrieval using vector search. Given all these components and their parameters, and depending on the text corpus you want to search in, it can be overwhelming to choose which settings will give the best search relevance. In this series of blog posts, we will introduce a number of tests we ran using various publicly available data sets and information retrieval techniques that are available in the Elastic Stack. We’ll then provide recommendations of the best techniques to use depending on the setup. To kick off this series of blogs, we want to set the stage by describing the problem we are addressing and describe some methods we will dig further into in subsequent blogs. Background and terminology BM25: A sparse, unsupervised model for lexical search The classic way documents are ranked for relevance by Elasticsearch according to a text query uses the Lucene implementation of the Okapi BM25 model. 
Although a few hyperparameters of this model were fine-tuned to optimize the results in most scenarios, this technique is considered unsupervised as labeled queries and documents are not required to use it: it’s very likely that the model will perform reasonably well on any corpus of text, without relying on annotated data. BM25 is known to be a strong baseline in zero-shot retrieval settings. Under the hood, this kind of model builds a matrix of term frequencies (how many times a term appears in each document) and inverse document frequencies (inverse of how many documents contain each term). It then scores each query term for each document that was indexed based on those frequencies. Because each document typically contains a small fraction of all words used in the corpus, the matrix contains a lot of zeros. This is why this type of representation is called sparse . Also, this model sums the relevance score of each individual term within a query for a document, without taking into account any semantic knowledge (synonyms, context, etc.). This is called lexical search (as opposed to semantic search). Its shortcoming is the so-called vocabulary mismatch problem, that query vocabulary is slightly different to the document vocabulary. This motivates other scoring models that try to incorporate semantic knowledge to avoid this problem. Dense models: A dense, supervised model for semantic search More recently, transformer-based models have allowed for a dense, context aware representation of text, addressing the principal shortcomings mentioned above. To build such models, the following steps are required: 1. Pre-training We first need to train a neural network to understand the basic syntax of natural language. Using a huge corpus of text, the model learns semantic knowledge by training on unsupervised tasks (like Masked Word Prediction or Next Sentence Prediction). BERT is probably the best known example of these models — it was trained on Wikipedia (2.5B words) and BookCorpus (800M words) using Masked Word Prediction. This is called pre-training . The model learns vector representations of language tokens, which can be adapted for other tasks with much less training. Note that at this step, the model wouldn’t perform well on downstream NLP tasks. This step is very expensive, but many such foundational models exist that can be used off the shelf. 2. Task-specific training Now that the model has built a representation of natural language, it’ll train much more effectively on a specific task such as Dense Passage Retrieval (DPR) that allows Question Answering. To do so, we must slightly adapt the model’s architecture and then train it on a large number of instances of the task, which, for DPR, consists in matching a relevant passage taken from a relevant document. So this requires a labeled data set, that is, a collection of triplets : A query: \"What is gold formed in?\" A document or passage taken from a document: \"The core of large stars, especially during a nova\" Optionally, a score of degree of relevance for this (query, document) pair (If no score is given, we assume that the score is binary, and that all the other documents can be considered as irrelevant for the given query.) A very popular and publicly available data set to perform such a training for DPR is the MS MARCO data set. This data set was created using queries and top results from Microsoft’s Bing search engine. 
As such, the queries and documents it contains fall in the general knowledge linguistic domain, as opposed to specific linguistic domain (think about research papers or language used in law). This notion of linguistic domain is important, as the semantic knowledge learned by those models is giving them an important advantage “in-domain”: when BERT came out, it improved previous state of the art models on this MS MARCO data set by a huge margin. 3. Domain-specific training Depending on how different your data is from the data set used for task-specific training, you might need to train your model using a domain specific labeled data set. This step is also referred to as fine tuning for domain adaptation or domain-adaptation. The good news is that you don’t need as large a data set as was required for the previous steps — a few thousands or tens of thousands of instances of the tasks can be enough. The bad news is that these query-document pairs need to be built by domain experts, so it’s usually a costly option. The domain adaptation is roughly similar to the task-specific training. Having introduced these various techniques, we will measure how they perform on a wide variety of data sets. This sort of general purpose information retrieval task is of particular interest for us. We want to provide tools and guidance for a range of users, including those who don’t want to train models themselves in order to gain some of the benefits they bring to search. In the next blog post of this series, we will describe the methodology and benchmark suite we will be using. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Background and terminology BM25: A sparse, unsupervised model for lexical search Dense models: A dense, supervised model for semantic search Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack: Steps to improve search relevance - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/improving-information-retrieval-elastic-stack-search-relevance", + "meta_description": "In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Understanding sparse vector embeddings with trained ML models Learn about sparse vector embeddings, understand what they do/mean, and how to implement semantic search with them. Vector Database Search Relevance How To DS By: Dai Sugimori On February 24, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch provides a semantic search feature that allows users to query in natural language and retrieve relevant information. To achieve this, target documents and queries must first be transformed into vector representations through an embedding process, which is handled by a trained Machine Learning (ML) model running either inside or outside Elasticsearch. Since choosing a good machine learning model is not easy for most search users, Elastic introduced a house-made ML model called ELSER (Elastic Learned Sparse EncodeR). It is bundled with Elasticsearch, so they can use it out-of-the-box under a platinum license (refer to the subscription page ). It has been a widely used model for implementing semantic search. Unlike many other models that generate \"dense vectors,\" ELSER produces \"sparse vectors,\" which represent embeddings differently. Although sparse vector models work in many use cases, only dense vector models could be uploaded to Elasticsearch until Elasticsearch and Eland 8.16. However, starting from version 8.17, you can now upload sparse vector models as well using Eland’s eland_import_hub_model CLI. This means you can generate sparse vector embeddings using models from Hugging Face, not just ELSER. In this article, I’d like to recap what sparse vectors are compared to dense ones as well as introduce how to upload them from outside for use in Elasticsearch. What is the sparse vector? What is the difference between dense and sparse vectors? Let's start with dense vectors, which are more commonly used for search today. Dense vectors When text is embedded as a dense vector, it looks like this: Key characteristics of dense vectors: The vector has a fixed dimension. Each element is a numeric value (float by default). Every element represents some feature, but their meanings are not easily interpretable by humans. Most elements have nonzero values. 
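As a small illustration of the characteristics listed above, here is a sketch of what such an embedding could look like in Python. The values are made up for readability; a real model typically produces hundreds or thousands of dimensions.

```python
# A made-up dense embedding for a short piece of text: a fixed-length list
# of floats in which most elements are nonzero and no single dimension has
# an obvious human-readable meaning.
dense_vector = [0.12, -0.53, 0.88, 0.07, -0.21, 0.64]

print(len(dense_vector))                                # fixed dimension
print(all(isinstance(v, float) for v in dense_vector))  # numeric (float) values
print(sum(1 for v in dense_vector if v != 0.0))         # mostly nonzero elements
```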
The ML models, especially those based on transformers, will produce geometrically similar vectors if the meanings of the input text are similar. The similarity is calculated by some different functions such as cosine, l2_norm, etc. For example, if we embed the words \"cat\" and \"kitten\", their vectors would be close to each other in the vector space because they share similar semantic meaning. In contrast, the vector for \"car\" would be farther away, as it represents a completely different concept. Elastic has many interesting articles about vector search. Refer to them if you are keen to learn more: A quick introduction to vector search - Elasticsearch Labs Navigating an Elastic vector database - Elasticsearch Labs Vector similarity techniques and scoring - Elasticsearch Labs Sparse vectors In contrast, sparse vectors have a different structure. Instead of assigning a value to every dimension, they mostly consist of zeros, with only a few nonzero values. A sparse vector representation looks like this: {\"f1\":1.2,\"f2\":0.3,… } Key characteristics of sparse vectors: Most values in the vector are zero. Instead of a fixed-length array, they are often stored as key-value pairs, where only nonzero values are recorded. The feature which has zero value never appears in the sparse vector representation. The representation is more interpretable, as each key (feature) often corresponds to a meaningful term or concept. BM25 and sparse vector representation BM25, a well-known ranking function for lexical search, uses a sparse vector representation of text known as a term vector . In this representation, each term (known as token or word) in the document is assigned a weight based on its frequency and importance within the corpus (a set of documents). This approach allows efficient retrieval by matching query terms to document terms. Lexical vs. semantic search BM25 is an example of lexical search , where matching is based on exact terms found in the document. It relies on a sparse vector representation derived from the vocabulary of the corpus. On the other hand, semantic search goes beyond exact term matching. It uses vector embeddings to capture the meaning of words and phrases, enabling retrieval of relevant documents even if they don't contain the exact query terms. In addition, Elasticsearch can do more. You can combine these two searches in one query as a hybrid search. Refer to the links below to learn more about it. When hybrid search truly shines - Elasticsearch Labs Hybrid search with multiple embeddings: A fun and furry search for cats! - Elasticsearch Labs Hybrid Search: Combined Full-Text and kNN Results - Elasticsearch Labs Hybrid Search: Combined Full-Text and ELSER Results - Elasticsearch Labs Tutorial: hybrid search with semantic_text | Elasticsearch Guide | Elastic Dense and sparse vectors in semantic search Semantic search can leverage both dense and sparse vectors: Dense vector models (e.g., BERT-based models) encode semantic meaning into fixed-length vectors, enabling similarity search based on vector distances. Sparse vector models (e.g., ELSER) capture semantic meaning while preserving interpretability, often leveraging term-based weights. With Elasticsearch 8.17, you can now upload and use both dense and sparse vector models, giving you more flexibility in implementing semantic search tailored to your needs. Why sparse vectors? Sparse vectors offer several advantages in Elasticsearch, making them a powerful alternative to dense vectors in certain scenarios. 
For example, dense vector (knn) search requires the model to be trained on a good enough corpus from the domain that users are working in. But it is not always easy to find such a model that suits your use case, and fine-tuning a model is even harder for most users. In this case, sparse vector models can help you. Let’s learn why. Good for zero-shot learning Sparse vectors, especially those generated by models like ELSER, can generalize well to new domains without requiring extensive fine-tuning. Unlike dense vector models that often need domain-specific training, sparse vectors rely on term-based representations, making them more effective for zero-shot retrieval—where the model can handle queries it hasn’t explicitly been trained on. Resource efficiency Sparse vectors are inherently more resource-efficient than dense vectors. Since they contain mostly zeros and only store nonzero values as key-value pairs, they require less memory and storage. Sparse vectors in Elasticsearch Elasticsearch initially supported sparse vector search using the rank_features query. However, with recent advancements, sparse vector search is now natively available through the sparse_vector query, providing better integration with Elasticsearch’s machine learning models. Integration with ML models The sparse_vector query is designed to work seamlessly with trained models running on Elasticsearch ML nodes. This allows users to generate sparse vector embeddings dynamically and retrieve relevant documents using efficient similarity search. Leveraging Lucene’s inverted index One of the key benefits of sparse vectors in Elasticsearch is that they leverage Lucene’s inverted index —the same core technology that powers Elasticsearch’s fast and scalable search. Resource efficiency: Since Lucene is optimized for term-based indexing, sparse vectors benefit from efficient storage and retrieval. Maturity: Elasticsearch has a well-established and highly optimized indexing system, making sparse vector search a natural fit within its architecture. By utilizing Lucene’s indexing capabilities, Elasticsearch ensures that sparse vector search remains fast, scalable, and resource-efficient , making it a strong choice for real-world search applications. Implementing sparse embedding with a preferred model from Hugging Face Starting from Elasticsearch 8.17, you can use any sparse vector model from Hugging Face as long as it employs a supported tokenization method. This allows greater flexibility in implementing semantic search with models tailored to your specific needs. Elasticsearch currently supports the following tokenization methods for sparse and dense vector embeddings: bert – For BERT-style models deberta_v2 – For DeBERTa v2 and v3-style models mpnet – For MPNet-style models roberta – For RoBERTa-style and BART-style models xlm_roberta – For XLM-RoBERTa-style models bert_ja – For BERT-style models trained specifically for Japanese For a full list of supported tokenization methods, refer to the official documentation: Elasticsearch Reference: PUT Trained Models If your model’s tokenization method is supported, you can select it even if it’s for non-English languages! Below are some examples of available sparse models on Hugging Face: naver/splade-v3-distilbert hotchpotch/japanese-splade-v2 aken12/splade-japanese-v3 Steps to use a sparse vector model from Hugging Face We already have a good article about sparse vector search with ELSER. 
Most of the steps are the same, but if you’d like to use the sparse embedding model from Hugging Face, you need to upload it to Elasticsearch beforehand with Eland. Here is the step-by-step guide to use the external sparse model on Elasticsearch for semantic search. 1. Find a sparse vector model Browse Hugging Face ( huggingface.co ) for a sparse embedding model that fits your use case. Ensure that the model uses one of the supported tokenization methods listed above. Let’s select the “ naver/splade-v3-distilbert ” model as an example of the sparse embedding model. Note: Elastic’s ELSER model is heavily inspired by Naver’s SPLADE model. Visit their website to learn more about SPLADE. 2. Upload the model to Elasticsearch You need to install Eland, a Python client and toolkit for DataFrames and machine learning in Elasticsearch. Note that you need Eland 8.17.0 or later for uploading sparse vector models. Once it is installed on your computer, use Eland’s CLI tool ( eland_import_hub_model ) to import the model into Elasticsearch. Alternatively, you can do the same with Docker if you don’t want to install Eland locally. The key point here is that you need to set text_expansion as the task type for the sparse vector embeddings, unlike text_embedding for the dense vector embeddings. (JFYI, there is a discussion about the task name.) 3. Define index mapping with sparse_vector Create an index that has a sparse_vector field. It should be noted that the sparse_vector field type was formerly known as rank_features field type. Although there is no functional difference between them, you should use sparse_vector field type for the clarity of its usage. Note: Elastic recently introduced a semantic_text field. It is super useful and easy-to-implement semantic search. Refer to this article for the details. You can use semantic_text field for the same purpose, but to stay focused on the embedding part, let’s use sparse_vector field for now. 4. Create ingest pipeline with inference processor The text information needs to be embedded into the sparse vector before it is indexed. That can be done by the ingest pipeline with the inference processor. Create the ingest pipeline with an inference processor which refers to the model you have uploaded before. 5. Ingest the data with the pipeline Ingest text data into an index with the “sparse-test-pipeline” we created so that the content will be automatically embedded into the sparse vector representation. Once it is done, let’s check how it is indexed. It will return like this: As you can see, the input text \" Elasticsearch provides a semantic search feature that allows users to query in natural language and retrieve relevant information. \" is embedded as: As you can see, the input text doesn’t directly mention most of the words listed in the response, but they look semantically related. It means, the model knows that the concepts of these words are related to the input text based on the corpus the model was trained on. Therefore, the quality of these sparse embeddings depends on the ML model you configured. Search with a semantic query Now you can perform semantic search against the “sparse-test” index with sparse_vector query. I’ve ingested some Elastic’s blog content into sparse-test index, so let’s test it out. The response was: As you can see, the original content is embedded into the sparse vector representation, so you can easily understand how the model determines the meanings of those texts. 
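To tie steps 3–5 together, here is a rough sketch using the Elasticsearch Python client, purely for illustration (the original post does not prescribe a client). The field names, API key, and the exact model and inference IDs are assumptions: Eland derives the trained model ID from the Hugging Face name, and the search below assumes that ID can be referenced for query-time inference; check your cluster (and, if needed, create an inference endpoint) before reusing this as-is.

```python
from elasticsearch import Elasticsearch

# Connection details are placeholders; adjust them to your deployment.
es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Assumed trained model ID after the Eland upload; confirm it with
# GET _ml/trained_models in your own cluster.
MODEL_ID = "naver__splade-v3-distilbert"

# Step 3: index mapping with a sparse_vector field ("content" and
# "content_embedding" are illustrative field names).
es.indices.create(
    index="sparse-test",
    mappings={
        "properties": {
            "content": {"type": "text"},
            "content_embedding": {"type": "sparse_vector"},
        }
    },
)

# Step 4: ingest pipeline with an inference processor that expands the text
# into token/weight pairs at index time.
es.ingest.put_pipeline(
    id="sparse-test-pipeline",
    processors=[
        {
            "inference": {
                "model_id": MODEL_ID,
                "input_output": [
                    {"input_field": "content", "output_field": "content_embedding"}
                ],
            }
        }
    ],
)

# Step 5: ingest a document through the pipeline, then run a sparse_vector query.
es.index(
    index="sparse-test",
    pipeline="sparse-test-pipeline",
    document={"content": "Elasticsearch provides a semantic search feature ..."},
    refresh=True,  # make the document searchable immediately for this demo
)

response = es.search(
    index="sparse-test",
    query={
        "sparse_vector": {
            "field": "content_embedding",
            # Assumes the model can be referenced here for query-time inference;
            # otherwise, expose it through an inference endpoint first.
            "inference_id": MODEL_ID,
            "query": "How do I search in natural language?",
        }
    },
)
print(response["hits"]["hits"])
```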
The first result doesn’t contain most words that can be found in the query text, but still, the relevance score is high because the sparse vector representations are similar. For better precision, you can also try hybrid search with RRF so that you can combine lexical and semantic search within one query. Refer to the official tutorial to learn more. Conclusion Sparse vectors provide a powerful and efficient way to enhance search capabilities in Elasticsearch. Unlike dense vectors, sparse vectors offer key advantages such as better zero-shot performance and resource efficiency . They integrate seamlessly with Elasticsearch’s machine learning capabilities and leverage Lucene’s mature and optimized inverted index , making them a practical choice for many applications. Starting with Elasticsearch 8.17, users now have greater flexibility in choosing between dense and sparse vector models based on their specific needs. Whether you're looking for interpretable representations , scalable search performance , or efficient memory usage , sparse vectors provide a compelling option for modern search applications. As Elasticsearch continues to evolve, sparse vector search is set to play an increasingly important role in the future of information retrieval. Now, you can take advantage of ELSER and also other Hugging Face models to explore new possibilities in semantic search. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to What is the sparse vector? Dense vectors Sparse vectors BM25 and sparse vector representation Lexical vs. semantic search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Understanding sparse vector embeddings with trained ML models - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/sparse-vector-embedding", + "meta_description": "Learn about sparse vector embeddings, understand what they do/mean, and how to implement semantic search with them." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to create custom connectors for Elasticsearch Learn how to create custom connectors for Elasticsearch to simplify your data ingestion process. Ingestion Python How To JB By: Jedr Blaszyk On October 4, 2023 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Elasticsearch has a library of ingestion tools to bring data from several sources. However, sometimes your data sources might not be compatible with Elastic’s existing ingestion tools. In this case, you may need to create a custom connector to connect your data with Elasticsearch. There are several reasons to use Elastic connectors for your apps. For example, you may want to: Bring data from custom or legacy applications to Elasticsearch Introduce a semantic search for your organizational data Extract textual content from files like PDFs, MS Office documents and more Use Kibana UI to manage your data sources (including configuration, filtering rules, setting up periodic sync schedule rules) You want to deploy Elastic connectors on your own infrastructure (some of Elastic-supported connectors are available as native connectors in the Elastic Cloud) Framework to create custom connectors If creating your own connector is the solution to your requirements, the connectors framework will help you create one. We created the framework to enable the creation of custom connectors and help users connect unique data sources to Elasticsearch. Code for connectors is available on GitHub , and we have documentation that can help you get started. The framework is designed to be simple and performant. It is meant to be developer-friendly, hence it is open-code and highly customizable. The connectors you create can be self-managed on your own infrastructure. The goal is to enable developers to integrate their own data sources very easily with Elasticsearch. What you need to know before you use connectors framework The framework is written in async-python There are several courses to learn async-python. In case you want a recommendation, we thought this LinkedIn learning course was really good, but it requires a subscription. A free alternative we liked was this one . Why did we choose async python? Ingestion is IO bound (not CPU bound) so async programming is the optimal approach when building a connector from a resource-utilization perspective. In I/O bound applications, the majority of the time is spent waiting for external resources, such as reading from files, making network requests, or querying databases. During these waiting periods, traditional synchronous code would block the entire program, leading to inefficient resource utilization. Any other pre-requisites? This is not a pre-requisite. 
It’s definitely worth going through the Connectors Developer's Guide before you get started! Hope you find this useful. Using connectors framework to create a custom connector Getting started is easy. In terminology related to the framework, we refer to a custom connector as a source. We implement a new source by creating a new class, and the responsibility of this class is to send documents to Elasticsearch from the custom data source. As an optional way to get started, users can also check out this example of a directory source . This is a good but basic example that can help you figure out how you can write a custom connector. Outline of steps Once you know which custom data source you want to create a connector for, here’s an outline of steps to write a new source : add a module or a directory in connectors/sources declare your dependencies in requirements.txt . Make sure you pin these dependencies implement a class that implements methods described in connectors.source.BaseDataSource (optional, when contributing to repo) add a unit test in connectors/sources/tests with +90% coverage declare your connector connectors/config.py in the sources section That’s it. We’re done! Now you should be able to run the connector What you need to know before writing your custom connector To enable Elasticsearch users to ingest data and build a search experience on top of that data, we provide a lightweight Connector Protocol . This protocol allows users to easily ingest data, use Enterprise Search features to manipulate that data, and create a search experience while providing them with a seamless user experience in Kibana. To be compatible with Enterprise Search and take full advantage of the connector features available in Kibana, a connector must adhere to the protocol. What you need to know about connectors protocol This documentation page provides a good overview of the protocol. Here’s what you need to know: All communication between connectors and other parts of the system happen asynchronously through an Elasticsearch index Connectors communicate their status to Elasticsearch and Kibana so that users can provide it with configuration and diagnose any issues This allows for simple, developer-friendly deployment of connectors. connectors service is stateless, and doesn’t care where your Elastic deployment runs, as long as it can connect to it over the network it works well. The service is also fault-tolerant, and it can resume operation on a different host after a restart or a failure. Once it reestablishes a connection with Elasticsearch, it will continue its normal operation. Under the hood, the protocol uses Elasticsearch indices to track connector state .elastic-connectors and .elastic-connectors-sync-jobs (described in the docs linked above) Where custom connectors are hosted The connector itself is not tied to Elasticsearch and it can be hosted in your own environment If you have an Elasticsearch deployment, regardless of whether it is self-managed or lives in Elastic Cloud: You, as a developer/company can write a customized connector for your data source Manage the connector on your own infrastructure and configure the connector service for your needs As long as the connector can discover Elasticsearch over the network it is able to index the data You, as the admin can control the connector through Kibana Example: Google Drive connector using connectors framework We wrote a simple connector for Google Drive using the connectors framework. 
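Before getting into the Google Drive specifics below, here is a minimal, hypothetical sketch of the overall shape of such a source class. The method names follow the BaseDataSource contract described in this post, but the class, configuration fields, and bodies are illustrative placeholders rather than the actual Google Drive implementation, and the exact signatures may differ between framework versions.

```python
# Hypothetical skeleton of a custom source, assuming the connectors framework
# is installed; it would live under connectors/sources and be declared in
# connectors/config.py.
from connectors.source import BaseDataSource


class MyCustomDataSource(BaseDataSource):
    """Sends documents from a custom data source to Elasticsearch."""

    name = "My Custom Source"          # shown in Kibana
    service_type = "my_custom_source"  # referenced in connectors/config.py

    @classmethod
    def get_default_configuration(cls):
        # Configuration fields rendered by the Kibana UI; fields flagged as
        # sensitive are masked.
        return {
            "api_key": {
                "label": "API key",
                "order": 1,
                "sensitive": True,
                "type": "str",
            },
        }

    async def ping(self):
        # Lightweight health check against the remote source.
        ...

    async def get_docs(self, filtering=None):
        # Yield (document, lazy_download) tuples. The framework awaits the
        # coroutine to download attachments for content extraction; None is
        # fine when there is nothing to download.
        for doc in []:  # replace with real pagination over the remote source
            yield doc, None
```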
We implemented a new source by creating a new class whose responsibility is to send documents to Elasticsearch from the targeted source. Note: This tutorial is compatible with Elastic stack version 8.10 . For later versions, always check the connectors release notes for updates and refer to the Github repository . We start with a GoogleDriveDataSource class with expected method signatures of BaseDataSource to configure the data source, check its availability (pinging), and retrieve documents. In order to make this connector functional we need to implement those methods. This GoogleDriveDataSource class is a starting point for writing Google Drive source. By following these steps, you will implement the logic needed to sync data with Google Drive: We need to add this file in connectors/sources Set your new connector name and service_type e.g. Google Drive as name and google_drive as service type To get your connector sync data from the source, you need to implement: get_default_configuration - This function should return a collection of RichConfigurableFields . These fields allow you to configure the connector from the Kibana UI. This includes passing authentication details, credentials, and other source-specific settings. Kibana smartly renders these configurations. For example, if you flag a field with \"sensitive\": True Kibana will mask it for security. ping - A simple call to the data source that verifies its status, think of it as a health check. get_docs - This method needs to implement the logic to actually fetch the data from the source. This function should return an async iterator that returns a tuple containing: ( document , lazy_download ), where: document - is a JSON representation of an item in the remote source. (like name, location, table, author, size, etc) lazy_download - is a coroutine to download the object/attachments for content extraction handled by the framework (like text extraction from a PDF document) There are other abstract methods in BaseDataSource class. Note that these methods don’t need to be implemented, if you only want to support content syncing (e.g. fetching all data from google drive). They refer to other connector functionalities such as: Document level security ( get_access_control , access_control_query ) Advanced filtering rules ( advanced_rules_validators ) Incremental syncs ( get_docs_incrementally ) Other functionalities may be added in the future How we approached writing the official Elasticsearch Google Drive connector Start by implementing the methods expected by the BaseDataSource class We needed to implement the methods get_default_configuration , ping and get_docs to have the connector synchronize the data. So let’s dive deeper into the implementation. The first consideration is: How to “talk” to Google Drive to get data? Google provides an official python client , but it is synchronous, so it’s likely to be slow for syncing content. We think a better option is the aiogoogle library, which offers full client functionality written in async python. This might not be intuitive at first, but it is really important to use async operations for performance. So, here in this example, we opted not to use the official google library as it doesn't support async mode. If you use synchronous or blocking code within an asynchronous framework, it can have a significant impact on performance. The core of any async framework is the event loop. 
The event loop allows for the concurrent execution of asynchronous tasks by continuously polling for completed tasks and scheduling new ones. If a blocking operation is introduced, it would halt the loop's execution, preventing it from managing other tasks. This essentially negates the concurrency benefits provided by the asynchronous architecture. The next concern is the connector authentication We authenticate the Google Drive connector as a service account . More information about authentication can be found in these connector docs pages . Service account can authenticate using keys We pass the authentication key to the service account through the Kibana UI in Elasticsearch Let’s look at the get_default_configuration implementation that allows an end user to pass a credential key that will be stored in the index for authentication during syncs: Next, let’s implement a simple ping method We will make a simple call to google drive api, e.g. /about endpoint. For this step, let's consider a simplified representation of the GoogleDriveClient . Our primary goal here is to guide you through connector creation, so we're not focusing on implementation details of the Google Drive client. However, a minimal client code is essential for the connector's operation, so we will rely on pseudo-code for the GoogleDriveClient class representation. Async iterator to return files from google drive for content extraction The next step is to write get_docs async iterator that will return the files from google drive and coroutines for downloading them for content extraction. From personal experience, it is often simpler to start implementing get_docs as a simple stand-alone python script to get this working and fetch some data. Once the get_docs code is working, we can move it to the data source class. Let’s look at api docs, we can: Use files/list endpoint to iterate over docs in drive with pagination Use files/get and files/export for downloading the files (or exporting google docs to a specific file format) So what is happening in this bit of code? list_files paginates over files in drive. prepare_files formats the file metadata to expected schema get_content is a coroutine that downloads the file and Base64 encodes its content (compatible format for content extraction) Some code details have been omitted for brevity. For a complete implementation, see the actual connector implementation on GitHub . Let’s run the connector! To integrate your custom connector into the framework, you'll need to register its implementation. Do this by adding an entry for your custom connector in the sources section in connectors/config.py . For the Google Drive example, the addition would appear as: Now in the Kibana interface: Go to Search -> Indices -> Create a new index -> Use a Connector Select Customized connector (when using a custom connector) Configure your connector. Generate the Elasticsearch API key and connector ID, and put these details in config.yml as instructed, and start your connector. At this point, your connector should be detected by Kibana! Schedule a recurring data sync or just click “Sync” to start a full sync. A connector can be configured to use Elasticsearch’s ingestion pipelines to perform transformations on data before storing it in an index. A common use case is document enrichment with machine learning . 
For example, you can: analyze text fields using a Text embedding model that will generate a dense vector representation of your data run text classification for sentiment analysis extract key information from text with Named Entity Recognition (NER) Once your sync finishes, your data will be available in a search-optimized Elasticsearch index. At this point, you can dive into building search experiences or delve into analytics. Do you want to create and contribute a new connector? If you create a custom connector for a source that may help the Elasticsearch community, consider contributing it. Here are the promotion path guidelines to get a customized connector to become an Elastic-supported connector. Acceptance criteria for contributing connectors Also, before you start spending some time developing a connector, you should create an issue and reach out to get some initial feedback on the connector and what libraries it will use. Once your connector idea has some initial feedback, ensure your project meets a few acceptance criteria: add a module or a directory in connectors/sources implement a class that implements all methods described in connectors.source.BaseDataSource add a unit test in connectors/sources/tests with +90% coverage declare your connector in connectors/config.py in the sources section declare your dependencies in requirements.txt . Make sure you pin these dependencies for each dependency you are adding, including indirect dependencies, list all the licences and provide the list in your patch. make sure you use an async lib for your source. If not possible, make sure you don't block the loop when possible, provide a docker image that runs the backend service, so we can test the connector. If you can't provide a docker image, provide the credentials needed to run against an online service. the test backend needs to return more than 10k documents due to 10k being the default size limit for Elasticsearch pagination. Having more than 10k documents returned from the test backend will help test the connector Supporting tools to test your connector We also have some supporting tools that profile the connector code and run performance tests. You can find those resources here: Perf8 - Performance library and dashboard, to profile the quality of python code to assess resource utilization and detect blocking calls E-2-E functional tests that make use of perf8 library to profile each connector Wrap up We hope this blog and the example were useful for you. Here's the complete list of available native connectors and connector clients for Elasticsearch. If you don't find your data source listed, perhaps create a custom connector? Here are some useful resources relevant to this article: connectors GitHub repository and documentation page Async Python learning course New custom connector community guidelines Licensing details for Elastic's connector-framework (search for Connector Framework at this link ) If you don't have an Elastic account, you can always spin up a trial account to get started! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Framework to create custom connectors What you need to know before you use connectors framework The framework is written in async-python Why did we choose async python? Any other pre-requisites? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to create custom connectors for Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/how-to-create-customized-connectors-for-elasticsearch", + "meta_description": "Learn how to create custom connectors for Elasticsearch to simplify your data ingestion process." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Designing for large scale vector search with Elasticsearch Explore the cost, performance and benchmarking for running large-scale vector search in Elasticsearch, with a focus on high-fidelity dense vector search. Vector Database JF By: Jim Ferenczi On June 12, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Part 1: High-fidelity dense vector search Introduction When designing for a vector search experience, the sheer number of available options can feel overwhelming. Initially, managing a small number of vectors is straightforward, but as applications scale, this can quickly become a bottleneck. In this blog post series, we’ll explore the cost and performance of running large-scale vector search with Elasticsearch across various datasets and use cases. We begin the series with one of the largest publicly available vector datasets: the Cohere/msmarco-v2-embed-english-v3 . This dataset includes 138 million passages extracted from web pages in the MSMARCO-passage-v2 collection , embedded into 1024 dimensions using Cohere's latest embed-english-v3 model . 
For this experiment, we defined a reproducible track that you can run on your own Elastic deployment to help you benchmark your own high fidelity dense vector search experience. It is tailored for real-time search use cases where the latency of a single search request must be low (<100ms). It uses Rally , our open source tool, to benchmark across Elasticsearch versions. In this post, we use our default automatic quantization for floating point vectors . This reduces the RAM cost of running vector searches by 75% without compromising retrieval quality. We also provide insights into the impact of merging and quantization when indexing billions of dimensions. We hope this track serves as a useful baseline, especially if you don’t have vectors specific to your use case at hand. Notes on embeddings Picking the right model for your needs is outside the scope of this blog post but in the next sections we discuss different techniques to compress the original size of your vectors. Matryoshka Representation Learning (MRL) By storing the most important information in earlier dimensions, new methods like Matryoshka embeddings can shrink dimensions while keeping good search accuracy. With this technique, certain models can be halved in size and still maintain 90% of their NDCG@10 on MTEB retrieval benchmarks. However, not all models are compatible. If your chosen model isn't trained for Matryoshka reduction or if its dimensionality is already at its minimum, you'll have to manage dimensionality directly in the vector database. Fortunately, the latest models from mixedbread or OpenAI come with built-in support for MRL. For this experiment we choose to focus on a use case where the dimensionality is fixed (1024 dimensions), playing with the dimensionality of other models will be the topic for another time. Embedding quantization learning Model developers are now commonly offering models with various trade-offs to address the expense of high-dimensional vectors. Rather than solely focusing on dimensionality reduction, these models achieve compression by adjusting the precision of each dimension. Typically, embedding models are trained to generate dimensions using 32-bit floating points. However, training them to produce dimensions with reduced precision helps minimize errors. Developers usually release models optimized for well-known precisions that directly align with native types in programming languages. For example, int8 represents a signed integer ranging from -127 to 127, while uint8 denotes an unsigned integer ranging from 0 to 255. Binary, the simplest form, represents a bit (0 or 1) and corresponds to the smallest possible unit per dimension. Implementing quantization during training allows for fine-tuning the model weights to minimize the impact of compression on retrieval performance. However, delving into the specifics of training such models is beyond the scope of this blog. In the following section, we will introduce a method for applying automatic quantization if the chosen model lacks this feature. Adaptive embedding quantization In cases where models lack quantization-aware embeddings, Elasticsearch employs an adaptive quantization scheme that defaults to quantizing floating points to int8. This generic int8 quantization typically results in negligible performance loss. The benefit of this quantization lies in its adaptability to data drift . It utilizes a dynamic scheme where quantization boundaries can be recalculated from time to time to accommodate any shifts in the data. 
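As a rough illustration of the idea behind this kind of scalar quantization, here is a simplified sketch rather than Elasticsearch's actual implementation: each float dimension is mapped onto the int8 range using boundaries derived from the data, and those boundaries can be recomputed later if the data drifts.

```python
import numpy as np


def compute_boundaries(vectors: np.ndarray) -> tuple[float, float]:
    """Derive quantization boundaries from observed data. A real system can
    recompute these periodically to follow data drift; here we simply use
    the observed minimum and maximum."""
    return float(vectors.min()), float(vectors.max())


def quantize_int8(vector: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Map float dimensions onto signed 8-bit integers in [-127, 127]."""
    scaled = (vector - lo) / (hi - lo)  # normalize to [0, 1]
    return np.round(scaled * 254.0 - 127.0).astype(np.int8)


# 4 bytes per float32 dimension versus 1 byte per int8 dimension: this is
# where the roughly 75% reduction in vector memory comes from.
rng = np.random.default_rng(0)
corpus = rng.normal(size=(1000, 1024)).astype(np.float32)
lo, hi = compute_boundaries(corpus)
quantized = quantize_int8(corpus[0], lo, hi)
print(corpus[0].nbytes, "bytes as float32 ->", quantized.nbytes, "bytes as int8")
```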
Large scale benchmark Back-of-the-envelope estimation With 138.3 million documents and 1024-dimensional vectors, the raw size of the MSMARCO-v2 dataset to store the original float vectors exceeds 520GB. Using brute force to search the entire dataset would take hours on a single node. Fortunately, Elasticsearch offers a data structure called HNSW (Hierarchical Navigable Small World Graph), designed to accelerate nearest neighbor search . This structure allows for fast approximate nearest neighbor searches but requires every vector to be in memory. Loading these vectors from disk is prohibitively expensive, so we must ensure the system has enough memory to keep them all in memory. With 1024 dimensions at 4 bytes each, each vector requires 4 kilobytes of memory. Additionally, we need to account for the memory required to load the Hierarchical Navigable Small World (HNSW) graph into memory. With the default setting of 32 neighbors per node in the graph, an extra 128 bytes (4 bytes per neighbor) of memory per vector is necessary to store the graph, which is equivalent to approximately 3% of the memory cost of storing the vector dimensions. Ensuring sufficient memory to accommodate these requirements is crucial for optimal performance. On Elastic Cloud , our vector search-optimized profile reserves 25% of the total node memory for the JVM (Java Virtual Machine), leaving 75% of the memory on each data node available for the system page cache where vectors are loaded. For a node with 60GB of RAM, this equates to 45GB of page cache available for vectors. The vector search optimized profile is available on all Cloud Solution Providers (CSP) AWS, Azure and GCP . To accommodate the 520GB of memory required, we would need 12 nodes, each with 60GB of RAM, totaling 720GB. At the time of this blog this setup can be deployed in our Cloud environment for a total cost of $14.44 per hour on AWS: (please note that the price will vary for Azure and GCP environments): By leveraging auto-quantization to bytes, we can reduce the memory requirement to 130gb, which is just a quarter of the original size. Applying the same 25/75 memory allocation rule, we can allocate a total of 180 gb of memory on Elastic Cloud. At the time of this blog this optimized setup results in a total cost of $3.60 per hour on Elastic Cloud (please note that the price will vary for Azure and GCP environments): Start a free trial on Elastic Cloud and simply select the new Vector Search optimized profile to get started. In this post, we'll explore this cost-effective quantization using the benchmark we created to experiment with large-scale vector search performance. By doing so, we aim to demonstrate how you can achieve significant cost savings while maintaining high search accuracy and efficiency. Benchmark configuration The msmarco-v2-vector rally track defines the default mapping that will be used. It includes one dense vector field with 1024 dimensions, indexed with auto int8 quantization, and a doc_id field of type keyword to uniquely identify each passage. For this experiment, we tested with two configurations: Default : This serves as the baseline, using the track on Elasticsearch with default options. Aggressive Merge : This configuration provides a comparison point with different trade-offs. As previously explained , each shard in Elasticsearch is composed of segments. A segment is an immutable division of data that contains the necessary structures to directly lookup and search the data. 
Document indexing involves creating segments in memory, which are periodically flushed to disk. To manage the number of segments, a background process merges segments to keep the total number under a certain budget. This merge strategy is crucial for vector search since HNSW graphs are independent within each segment. Each dense vector field search involves finding the nearest neighbors in every segment, making the total cost dependent on the number of segments. By default, Elasticsearch merges segments of approximately equal size, adhering to a tiered strategy controlled by the number of segments allowed per tier. The default value for this setting is 10, meaning each level should have no more than 10 segments of similar size. For example, if the first level contains segments of 50MB, the second level will have segments of 500MB, the third level 5GB, and so on. The aggressive merge configuration adjusts the default settings to be more assertive: It sets the segments per tier to 5, enabling more aggressive merges. It increases the maximum merged segment size from 5GB to 25GB to maximize the number of vectors in a single segment. It sets the floor segment size to 1GB, artificially starting the first level at 1GB. With this configuration, we expect faster searches at the expense of slower indexing. For this experiment, we kept the default settings for m , ef_construction , and confidence_interval options of the HNSW graph in both configurations. Experimenting with these indexing parameters will be the subject of a separate blog. In this first part, we chose to focus on varying the merge and search parameters. When running benchmarks, it's crucial to separate the load driver, which is responsible for sending documents and queries, from the evaluated system (Elasticsearch deployment). Loading and querying hundreds of millions of dense vectors require additional resources that would interfere with the searching and indexing capabilities of the evaluated system if run together. To minimize latency between the system and the load driver, it's recommended to run the load driver in the same region of the Cloud provider as the Elastic deployment, ideally in the same availability zone. For this benchmark, we provisioned an im4gn.4xlarge node on AWS with 16 CPUs, 64GB of memory, and 7.5TB of disk in the same region as the Elastic deployment. This node is responsible for sending queries and documents to Elasticsearch. By isolating the load driver in this manner, we ensure accurate measurement of Elasticsearch's performance without the interference of additional resource demands. We ran the entire benchmarks with the following configuration: The initial_indexing_bulk_indexing_clients value of 12 indicates that we will ingest data from the load driver using 12 clients. With a total of 23.9 vCPUs in the Elasticsearch data nodes, using more clients to send data increases parallelism and enables us to fully utilize all available resources in the deployment. For search operations, the standalone_search_clients and parallel_indexing_search_clients values of 8 mean that we will use 8 clients to query Elasticsearch in parallel from the load driver. The optimal number of clients depends on multiple factors; in this experiment, we selected the number of clients to maximize CPU usage across all Elasticsearch data nodes. To compare the results, we ran a second benchmark on the same deployment, but this time we set the parameter aggressive_merge to true. 
This effectively changes the merge strategy to be more aggressive, allowing us to evaluate the impact of this configuration on search performance and indexing speed. Indexing performance In Rally, a challenge is configured with a list of scheduled operations to execute and report. Each operation is responsible for performing an action against the cluster and reporting the results. For our new track, we defined the first operation as initial-documents-indexing , which involves bulk indexing the entire corpus. This is followed by wait-until-merges-finish-after-index , which waits for background merges to complete at the end of the bulk loading process. This operation does not use force merge; it simply waits for the natural merge process to finish before starting the search evaluation. Below, we report the results of these operations of the track , they correspond to the initial loading of the dataset in Elasticsearch. The search operations are reported in the next section. With Elasticsearch 8.14.0, the initial indexing of the 138M vectors took less than 5 hours, achieving an average rate of 8,000 documents per second. Please note that the bottleneck is typically the generation of the embeddings, which is not reported here. Waiting for the merges to finish at the end added only 2 extra minutes: Total Indexing performance (8.14.0 default int8 HNSW configuration) For comparison, the same experiment conducted on Elasticsearch 8.13.4 required almost 6 hours for ingestion and an additional 2 hours to wait for merges: Total Indexing performance (8.13.4 default int8 HNSW configuration) Elasticsearch 8.14.0 marks the first release to leverage native code for vector search . A native Elasticsearch codec is employed during merges to accelerate similarities between int8 vectors, leading to a significant reduction in overall indexing time. We're currently exploring further optimizations by utilizing this custom codec for searches, so stay tuned for updates! The aggressive merge run completed in less than 6 hours, averaging 7,000 documents per second. However, it required nearly an hour to wait for merges to finish at the end. This represents a 40% decrease in speed compared to the run with the default merge strategy: Total Indexing performance (8.14.0 aggressive merge int8 HNSW configuration) This additional work performed by the aggressive merge configuration can be summarized in the two charts below. The aggressive merge configuration merges 2.7 times more documents to create larger and fewer segments. The default merge configuration reports nearly 300 million documents merged from the 138 million documents indexed. This means each document is merged an average of 2.2 times. Total number of merged documents per node (8.14.0 default int8 HNSW configuration) Total number of merged documents per node (8.14.0 aggressive merge int8 HNSW configuration) In the next section we’ll analyze the impact of these configurations on the search performance. Search evaluation For search operations, we aim to capture two key metrics: the maximum query throughput and the level of accuracy for approximate nearest neighbor searches. To achieve this, the standalone-search-knn-* operations evaluate the maximum search throughput using various combinations of approximate search parameters. This operation involves executing 10,000 queries from the training set using parallel_indexing_search_clients in parallel as rapidly as possible. 
These operations are designed to utilize all available CPUs on the node and are performed after all indexing and merging tasks are complete. To assess the accuracy of each combination, the knn-recall-* operations compute the associated recall and Normalized Discounted Cumulative Gain (nDCG). The nDCG is calculated from the 76 queries published in msmarco-passage-v2/trec-dl-2022/judged , using the 386,000 qrels annotations. All nDCG values range from 0.0 to 1.0, with 1.0 indicating a perfect ranking. Due to the size of the dataset, generating ground truth results to compute recall is extremely costly. Therefore, we limit the recall report to the 76 queries in the test set, for which we computed the ground truth results offline using brute force methods. The search configuration consists of three parameters: k : The number of passages to return. num_candidates : The size of the queue used to limit the search on the nearest neighbor graph. num_rescore : The number of passages to rescore using the full fidelity vectors. Using automatic quantization, rescoring slightly more than k vectors with the original float vectors can significantly boost recall. The operations are named according to these three parameters. For example, knn-10-100-20 means k=10, num_candidates=100, and num_rescore=20 . If the last number is omitted, as in knn-10-100 , then num_rescore defaults to 0. See the track.py file for more information on how we create the search requests. The chart below illustrates the expected Queries Per Second (QPS) at different recall levels. For instance, the default configuration (the orange series) can achieve 50 QPS with an expected recall of 0.922. Recall versus Queries Per Second (Elasticsearch 8.14.0) The aggressive merge configuration is 2 to 3 times more efficient for the same level of recall. This efficiency is expected since the search is conducted on larger and fewer segments as demonstrated in the previous section. The full results for the default configuration are presented in the table below: Queries per second, latencies (in milliseconds), recall and NDCG@10 with different parameters combination (8.14 default int8 HNSW configuration) The %best column represents the difference between the actual NDCG@10 for this configuration and the best possible NDCG@10, determined using the ground truth nearest neighbors computed offline with brute force. For instance, we observe that the knn-10-20-20 configuration, despite having a recall@10 of 67.4%, achieves 90% of the best possible NDCG for this dataset. Note that this is just a point result and results may vary with other models and/or datasets. The table below shows the full results for the aggressive merge configuration: Queries per second, latencies (in milliseconds), recall and NDCG@10 with different parameters combination (8.14 aggressive merge int8 HNSW configuration) Using the knn-10-500-20 search configuration, the aggressive merge setup can achieve > 90% recall at 150 QPS. Conclusion In this post, we described a new rally track designed to benchmark large-scale vector search on Elasticsearch. We explored various trade-offs involved in running an approximate nearest neighbor search and demonstrated how in Elasticsearch 8.14 we've reduced the cost by 75% while increasing index speed by 50% for a realistic large scale vector search workload. Our ongoing efforts focus on optimization and identifying opportunities to enhance our vector search capabilities. 
Stay tuned for the next installment of this series, where we will delve deeper into the cost and efficiency of vector search use cases, specifically examining the potential of int4 and binary compression techniques. By continually refining our approach and releasing tools for testing performance at scale, we aim to push the boundaries of what is possible with Elasticsearch, ensuring it remains a powerful and cost-effective solution for large-scale vector search. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Part 1: High-fidelity dense vector search Introduction Notes on embeddings Matryoshka Representation Learning (MRL) Embedding quantization learning Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Designing for large scale vector search with Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-vector-large-scale-part1", + "meta_description": "Explore the cost, performance and benchmarking for running large-scale vector search in Elasticsearch, with a focus on high-fidelity dense vector search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to migrate your Ruby app from OpenSearch to Elasticsearch A guide to migrate a Ruby codebase from the OpenSearch client to the Elasticsearch client. Integrations Ruby How To FB By: Fernando Briano On December 13, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. 
Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The OpenSearch Ruby client was forked from the Elasticsearch Ruby Client in version 7.x , so the codebases are relatively similar. This means when migrating a Ruby codebase from OpenSearch to Elasticsearch, the code from the respective client libraries will look very familiar. In this blog post, I'm going to show an example Ruby app that uses OpenSearch and the steps to migrate this code to Elasticsearch. Both clients are released under the popular Apache License 2.0, so they're open source and free software. Elasticsearch's license was recently updated and the core of Elasticsearch and Kibana are published under the OSI approved Open Source license AGPL since version 8.16. Considering Elasticsearch version when migrating Ruby app One consideration when migrating is which version of Elasticsearch is going to be used. We recommend using the latest stable release, which at the time of writing this is 8.17.0 . The Elasticsearch Ruby Client minor versions follow the Elasticsearch minor versions. So for Elasticsearch 8.17.x , you can use version 8.17.x of the Ruby gem. OpenSearch was forked from Elasticsearch 7.10.2. So the APIs may have changed and different features could be used on either. But that's out of scope for this post, and I'm only going to look into the most common operations in an example app. For Ruby on Rails, you can use the official Elasticsearch client, or the Rails integration libraries . We recommend migrating to the latest stable version of Elasticsearch and client respectively. The elasticsearch-rails gem version 8.0.0 support Rails 6.1 , 7.0 and 7.1 and Elasticsearch 8.x . The code For this example, I followed the steps to install OpenSearch from a tarball . After downloading and extracting the tarball, I needed to set an initial admin password which I'm going to use later to instantiate the client. I created a directory with a Gemfile that looks like this: After running bundle install , the gem is installed for my project. This installed opensearch-ruby version 3.4.0 and the version of OpenSearch I'm running is 2.18.0 . I wrote the code in an example_code.rb file in the same directory. The initial code in this file is the instantiation of an OpenSearch client: The transport option ssl: { verify: false} parameter is being passed as per the user guide to make things easier for testing. In production, this should be set up depending on the deployment of OpenSearch. Since version 2.12.0 of OpenSearch, the OPENSEARCH_INITIAL_ADMIN_PASSWORD environment variable must be set to a strong password when running the install script. Following the steps to install OpenSearch from a tarball, I exported the variable in my console and now it's available for my Ruby script. A simple API to make sure the client is connecting to OpenSearch is using the cluster.health API: And indeed it works: I tested some of the common examples we have on the Elasticsearch Ruby client documentation, and they work as expected: Migrating Ruby app to Elasticsearch The first step is to add elasticsearch-ruby in the Gemfile. After running bundle install , the Elasticsearch Ruby client gem will be installed. If you want to test your code before fully migrating, you can initially leave the opensearch-ruby gem there. The next important step is going to be the client instantiation. 
This is going to depend on how you're running Elasticsearch. To keep a similar approach for these examples, I am following the steps in Download Elasticsearch and running it locally. When running bin/elasticsearch , Elasticsearch will start with security features automatically configured. Make sure you copy the password for the elastic user (but you can reset it by running bin/elasticsearch-reset-password -u elastic ). If you're following this example, make sure you stop OpenSearch before starting Elasticsearch, since they run on the same port. At the beginning of example_code.rb , I commented out the OpenSearch client instantiation and added the instantiation for an Elasticsearch client: As you can see, the code is almost identical in this testing scenario. It will differ according to the deployment of Elasticsearch and how you decide to connect and authenticate with it. The same applies here as in OpenSearch regarding security, the option to not verify ssl is just for testing purposes and should not be used in production. Once the client is set up, I run the code again with: bundle exec ruby example_code.rb . And everything just works! Debugging migration issues Depending on the APIs your application is using, there is a possibility that you receive an error when running your code against Elasticsearch if the APIs from OpenSearch diverge. The REST APIs documentation is an essential reference for detailed information on how to use the APIs. Make sure to check the documentation for the version of Elasticsearch that you're using. You can also refer to the Elasticsearch::API reference. Some errors you may encounter from Elasticsearch could be: ArgumentError: Required argument '' missing - This is a Client error and it will be raised when a request is missing a required parameter. Elastic::Transport::Transport::Errors::BadRequest: [400] {\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"request [/example/_doc] contains unrecognized parameter: [test]\"}]... This error comes from Elasticsearch and it means the client code is using a parameter that Elasticsearch doesn't recognize for the API being used. The Elasticsearch client will raise errors from Elasticsearch with the detailed error message sent by the server. So for unsupported parameters or endpoints even, the error should inform you what is different. Conclusion As we demonstrated with this example code, the migration of a Ruby app from OpenSearch to Elasticsearch is not too complex from the Ruby side of things. You need to be aware of the versioning and any potential divergent APIs between the search engines. But for the most common actions, the main change when migrating clients is in the instantiation. They're both similar in that respect, but the way the host and credentials are defined varies in relation to how the Stack is being deployed. Once the client is set up, and you verify it's connecting to Elasticsearch, you can replace the OpenSearch client seamlessly with the Elasticsearch client. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Considering Elasticsearch version when migrating Ruby app The code Migrating Ruby app to Elasticsearch Debugging migration issues Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to migrate your Ruby app from OpenSearch to Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ruby-opensearch-elasticsearch-migration", + "meta_description": "Learn how to migrate your Ruby app from OpenSearch to Elasticsearch. This blog includes step-by-step instructions, debugging tips, and best practices." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving text expansion performance using token pruning Learn about token pruning and how it boosts the performance of text expansion queries by making them more efficient without sacrificing recall. Vector Database How To KD By: Kathleen DeRusso On April 2, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This blog talks about token pruning, an exciting enhancement to ELSER performance released with Elasticsearch 8.13.0! The strategy behind token pruning We've already talked in great detail about lexical and semantic search in Elasticsearch and text similarity search with vector fields . These articles offer great, in-depth explanations of how vector search works. We've also talked in the past about reducing retrieval costs by optimizing retrieval with ELSER v2 . While Elasticsearch is limited to 512 tokens per inference field ELSER can still produce a large number of unique tokens for multi-term queries. This results in a very large disjunction query, and will return many more documents than an individual keyword search would - in fact, queries with a large number of resulting queries may match most or all of the documents in an index! 
Now, let's take a more detailed look into an example using ELSER v2. Using the infer API we can view the predicted values for the phrase \"Is Pluto a planet?\" This returns the following inference results: These are the inference results that would be sent as input into a text expansion search. When we run a text expansion query, these terms eventually get joined together in one large weighted boolean query, such as: Speed it up by removing tokens Given the large number of tokens produced by ELSER text expansion, the quickest way to realize a performance improvement is to reduce the number of tokens that make it into that final boolean query. This reduces the total work that Elasticsearch invests when performing the search. We can do this by identifying non-significant tokens produced by the text expansion and removing them from the final query. Non-significant tokens can be defined as tokens that meet both of the following criteria: The weight/score is so low that the token is likely not very relevant to the original term The token appears much more frequently than most tokens, indicating that it is a very common word and may not benefit the overall search results much. We started with some default rules to identify non-significant tokens, based on internal experimentation using ELSER v2: Frequency : More than 5x more frequent than the average token frequency for all tokens in that field Score : Less than 40% of the best scoring token Missing : If we see documents with a frequency of 0, that means that it never shows up at all and can be safely pruned If you're using text expansion with a model other than ELSER, you may need to adjust these values in order to return optimal results. Both the token frequency threshold and weight threshold must show the token is non-significant in order for the token to be pruned. This lets us ensure we keep frequent tokens that are very high scoring or very infrequent tokens that may not have as high of a score. Performance improvements with token pruning We benchmarked these changes using the MS Marco Passage Ranking benchmark . Through this benchmarking, we observed that enabling token pruning with the default values described above resulted in a 3-4x improvement in 99th pctile latency and above! Relevance impact of token pruning Once we measured a real performance improvement, we wanted to validate that relevance was still reasonable. We used a small dataset against the MS Marco passage ranking dataset. We did observe an impact on relevance when pruning the tokens; however, when we added the pruned tokens back in a rescore block the relevance was close to the original non-pruned results with only a marginal increase in latency. The rescore, adding in the tokens that were previously pruned, queries the pruned tokens only against the documents that were returned from the previous query. Then it updates the score including the dimensions that were previously left behind. Using a sample of 44 queries with judgments against the MS Marco Passage Ranking dataset: Top K Rescore Window Size Avg rescored recall vs control Control NDCG@K Pruned NDCG@K Rescored NDCG@K 10 10 0.956 0.653 0.657 0.657 10 100 1 0.653 0.657 0.653 10 1000 1 0.653 0.657 0.653 100 100 0.953 0.51 0.372 0.514 100 1000 1 0.51 0.372 0.51 Now, this is only one dataset - but it's encouraging to see this even at smaller scale! How to use: Pruning configuration Pruning configuration will launch in our next release as an experimental feature. 
It's an optional, opt-in feature so if you perform text expansion queries without specifying pruning, there will be no change to how text expansion queries are formulated - and no change in performance. We have some examples of how to use the new pruning configuration in our text expansion query documentation . Here's an example text expansion query with both the pruning configuration and rescore: Note that the rescore query sets only_score_pruned_tokens to false, so it only adds those tokens that were originally pruned back into the rescore algorithm. This feature was released as a technical preview in 8.13.0. You can try it out in Cloud today! Be sure to head over to our discuss forums and let us know what you think.", + "title": "Improving text expansion performance using token pruning - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/text-expansion-pruning", + "meta_description": "Learn about token pruning and how it boosts the performance of text expansion queries by making them more efficient without sacrificing recall." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using hybrid search for gopher hunting with Elasticsearch and Go Learn how to achieve hybrid search by combining keyword and vector search using Elasticsearch and the Elasticsearch Go client. Vector Database How To CR LS By: Carly Richmond and Laurent Saint-Félix On November 2, 2023 Part of Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. In the previous parts of this series, it was demonstrated how to use the Elasticsearch Go client for traditional keyword search and vector search . This third part covers hybrid search. We'll share examples of how you can combine both vector search and keyword search using Elasticsearch and the Elasticsearch Go client . Prerequisites Just like part one in this series, the following prerequisites are required for this example: Installation of Go version 1.21 or later Create your own Go repo using the recommended structure and package management covered in the Go documentation Creating your own Elasticsearch cluster, populated with a set of rodent-based pages , including for our friendly Gopher , from Wikipedia: Connecting to Elasticsearch As a reminder, in our examples, we will make use of the Typed API offered by the Go client. Establishing a secure connection for any query requires configuring the client using either: Cloud ID and API key if making use of Elastic Cloud Cluster URL, username, password and the certificate Connecting to our cluster located on Elastic Cloud would look like this: The client connection can then be used for searching, as demonstrated in the subsequent sections. Manual boosting for hybrid search When combining any set of search algorithms, the traditional approach has been to manually configure constants to boost each query type. Specifically, a factor is specified for each query, and the combined results set is compared to the expected set to determine the recall of the query. Then we repeat for several sets of factors and pick the one closest to our desired state. For example, combining a single text search query boosted by a factor of 0.8 with a knn query with a lower factor of 0.2 can be done by specifying the Boost field in both query types, as shown in the below example: The factor specified in the Boost option for each query is added to the document score. By increasing the score of our match query by a larger factor than the knn query, results from the keyword query are more heavily weighted. The challenge of manual boosting, particularly if you're not a search expert, is that it requires tuning to figure out the factors that will lead to the desired result set. It's simply a case of trying out random values to see what gets you closer to your desired result set. Reciprocal Rank Fusion in hybrid search & Go client Reciprocal Rank Fusion , or RRF, was released under technical preview for hybrid search in Elasticsearch 8.9. It aims to reduce the learning curve associated with tuning and reduce the amount of time experimenting with factors to optimize the result set. With RRF, the document score is recalculated by blending the scores by the below algorithm: The advantage of using RRF is that we can make use of the sensible default values within Elasticsearch. The ranking constant k defaults to 60 . 
To provide a tradeoff between the relevancy of returned documents and the query performance when searching over large data sets, the size of the result set for each considered query is limited to the value of window_size , which defaults to 100 as outlined in the documentation . k and window_size can also be configured within the Rrf configuration within the Rank method in the Go client, as per the below example: Conclusion Here we've discussed how to combine vector and keyword search in Elasticsearch using the Elasticsearch Go client . Check out the GitHub repo for all the code in this series, and if you haven't already, check out part 1 and part 2 . Happy gopher hunting! Resources Elasticsearch Guide Elasticsearch Go client What is vector search? | Elastic Reciprocal Rank Fusion", + "title": "Using hybrid search for gopher hunting with Elasticsearch and Go - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/hybrid-search-with-the-elasticsearch-go-client", + "meta_description": "Learn how to achieve hybrid search by combining keyword and vector search using Elasticsearch and the Elasticsearch Go client." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Spotify Wrapped part 2: Diving deeper into the data We will dive deeper into your Spotify data than ever before and explore connections you didn't even know existed. How To PK By: Philipp Kahr On February 25, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part of this series, written by Iulia Feroli, we talked about how to get your Spotify Wrapped data and visualize it in Kibana. In part 2, we're diving deeper into the data to see what else we can find out. To do this we're going to leverage a bit of a different approach and use Spotify to Elasticsearch to index the data into Elasticsearch. This tool is a bit more advanced and requires a bit more setup, but it is worth it. The data is more structured and we can ask more complex questions. What is different from the first approach? In the first blog we used the Spotify export directly and didn't perform any normalisation tasks, or any other data processing. This time we will use the same data but we will perform some data processing to make the data more usable. This will allow us to answer much more complex questions such as: What is the average duration of a song in my top 100? What is the average popularity of a song in my top 100? What is the median listening duration to a song? What is my most skipped tracked? When do I like to skip tracks? Am I listening to a particular hour of the day more than others? Am I listening to a particular day of the week more than others? Is a month of particular interest? What is the artist with the longest listening time? Spotify wrapped is a fun experience every year showing you what you listened to this year. It does not give you year over the year changes and thus you might miss some artists that were once in your top 10, but now have vanished. Data processing There is a large difference in the way we process the data in the first and the second post. If you want to keep working with the data from the first post, you will need to account for some field name changes, as well as need to revert to ES|QL to do certain extractions like hour of day on the fly. Nonetheless, you all should be able to follow this post. The data processing is done in the Spotify to Elasticsearch repository involves asking the Spotify API for the duration of the song, popularity and also renames and enhances some fields. For example the artist field in the Spotify export itself is just a String and does not represent features or multi-artist tracks Dashboarding I created a dashboard in Kibana to visualize the data. The dashboard is available here and you can import it into your Kibana instance. The dashboard is quite extensive and answers many of the above questions. Let's get into some of the questions and how to answer them together! What is the average duration of a song in my top 100? To answer this question we can use Lens, or ES|QL. Let's explore all three options. Let's phrase this question correctly in an Elasticsearch manner. We want to find the top 100 songs and then calculate the average duration of all of those songs combined. 
In Elasticsearch terms that would be two aggregations: Figure out the top 100 songs Calculate the average duration of those 100 songs. Lens In Lens this is rather simple, create a new Lens, switch to a table and drag and drop the title field into the table. Then click on the title field and set the size to 100, as well as set accuracy mode. Then drag and drop the duration field into the table and use last value , because we really only need the last value of each of the songs duration. The same song will only have one duration. In the bottom of this last value aggregation is a dropdown for a summary row, select average and it will show it to you. ES|QL ES|QL is a pretty fresh language compared to DSL & aggregations, but it is very powerful and easy to use. To answer the same question in ES|QL you would write the following query: Let me take you step through step of this ES|QL query: from spotify-history - This is the index pattern we are using. stats duration=max(duration), count=count() by title - This is the first aggregation, we are calculating the maximum duration of each song and the count of each song. We use max instead of last value as used in the Lens, that is because ES|QL right now does not have a first or last. sort count desc - We sort the songs by the count of each song, so the most listened to song is on top. limit 100 - We limit the result to the top 100 songs. stats Average duration of the songs=avg(duration) - We calculate the average duration of the songs. Is a month of particular interest to me? To answer this question we can use Lens with the help of runtime field and ES|QL. What do we notice straight away, that there is no field in the data that denotes the month directly, instead we need to calculate it from the @timestamp field. There are multiple ways to do this: Use a runtime field, to power the Lens ES|QL I personally think that ES|QL is the neater and quicker solution. That's it, nothing fancy needed to do, we can leverage the DATE_EXTRACT function to extract the month from the @timestamp field and then we can aggregate on it. Using the ES|QL visualisation we can drop that onto the dashboard. What is my listening duration per artist per year? The idea behind that is to see if an artist is just a one-time thing or if there is a reoccurrence. If I remember correctly, Spotify only shows you the top 5 artists in the yearly wrapped. Maybe your number 6 artist stays the same all the time, or they heavily change after the 10th position? One of the simplest representation of this is a percentage bar chart. We can use Lens for this. Follow the steps along: Drag and drop the listened_to_ms field. This field represents how long you listened to a song in milliseconds. Now per default Lens will create a median aggregation, we do not want that, alter that to a sum . In the top select percentage instead of stacked for the bar chart type. For the breakdown select artist and say top 10. In the Advanced dropdown don't forget to select accuracy mode . Now every color block represents how much you listened to this single artist. Depending on your timepicker the bars might represent values from days, to weeks, to months, to years. If you want a weekly breakdown, select the @timestamp and set the mininum interval to year . Now what we can tell in my case is that Fred Again.. is the artist I listened to most, nearly 12% of my total listening time was consumed by Fred Again.. . We also see that Fred Again.. dropped a bit in 2024, but Jamie XX grew largely. 
If we compare just the size of the bars. We can also tell that whilst Billie Eilish is constantly being played in 2024 the bar widthend. This means that I listened to Billie Eilish more in 2024 than in 2023. What about the top tracks per artist per listening time versus overall listening time? That's a mouthfull of a question. Let me try to explain what I want to say with that. Spotify tells you about the top song from a single artist, or your overall 5 top songs. Well, that's definitely interesting, but what about the breakdown of an artist? Is all my time consumed just by a single song that I play over and over again, or is that evenly distributed? Create a new lens and select Treemap as type. For the metric , same as before: select sum and use listened_to_ms as the field. For the group by we need two values. The first one is artist and then add a second one with title . The intermediate result looks like this: Let's change that to top 100 artists and deselect the other in the advanced dropdown, as well as enable accuracy mode. For title change that to top 10 and enable accuracy mode. The final result looks like this: What does this tell us now exactly? Without looking at any time component, we can tell that over all my listening history with Spotify, I spent 5.67% listening to Fred Again.. . In particularly I spent 1.21% of that time, listening to Delilah (pull me out of this) . It is interesting to see, if there is a single song that occupies an artist, or if there are other songs as well. The treemap itself is a nice form to represent such data distributions. Do I listen on a particular hour and day? Well, that we can answer super simple with a Lens visualisation leveraging the Heat Map . Create a new Lens, select Heat Map . For the Horizontal Axis select dayOfWeek field and set it to Top 7 instead of Top 3. For the Vertical Axis select the hourOfDay and for Cell Value just a simple Count of records . Now this will produce this panel: There are a couple of annoying things around this Lens, that just disturb me when interpreting. Let's try and clean it up a bit. First of all, I don't care about the legend too much, use the symbol in the top with the triangle, square, circle and disable it. Now the 2nd part that is annoying is the sorting of the days. It's Monday, Wednesday, Thursday, or anything else, depending on the values you have. The hourOfDay is correctly sorted. The way to sort the days is a funny hack and that is called to use Filters instead of Top Values . Click on dayOfWeek and select Filters , it should now look like this: Now just start typing the days. One filter for each day. \"dayOfWeek\" : Monday and give it the label Monday and rinse and repeat. One caveat in all of this though is, that Spotify provides the data in UTC+0 without any timezone information. Sure, they also provide the IP address and the country where you listened to and we could infere the timezone information from that, but that can be wonky and for countries like the U.S. that has multiple timezones, it can be too much of a hassle. This is important because Elasticsearch and Kibana have timezone support and by providing the correct timezone in the @timestamp field, Kibana would automatically adjust the time to your browser time. It should look like this when finalized, and we can tell that I am a very active listener during working hours and less so on Saturdays and Sundays. Conclusion In this blog we dove a bit deeper into the intricacies that the Spotify data offers. 
We showed a few simple and quick ways to get some visualizations up and running. It is simply amazing to have this much control over your own listening history. Stay tuned for more follow-up blog posts!", + "title": "Spotify Wrapped part 2: Diving deeper into the data - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/spotify-wrapped-part-02", + "meta_description": "We will dive deeper into your Spotify data than ever before and explore connections you didn't even know existed." + }, + { + "text": "Detecting relationships in data: Spotify Wrapped, part 4 Graphs are a powerful tool for detecting relationships in data. In this blog, we'll explore the relationships between artists and your music taste. How To PK By: Philipp Kahr On April 1, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part , we talked about how to get your Spotify Wrapped data and how to visualize it. 
In the second part , we talked about how to process the data and how to visualize it. In the third part , we explored anomaly detection and how it helps us find interesting listening behaviour. In this part, we will talk about how to find relationships in your music. What is a relationship? In the context of music, a relationship can be many things. It can be the relationship between artists, genres, songs, or even the relationship between the time of day and the music you listen to. In this blog post, we will focus on the relationship between artists. How can we explore relationships? There is a simple solution to this and this is called Kibana Graph. Make sure that you have followed along for the data import in the second blog post, otherwise this won't work. What does Kibana Graph actually do under the hood? Let's assume we have the following documents, each row represents a new document. Now, Kibana Graph will compute all term co-occurrences to build the connections between each node. Therefore, we would expect a graph to look like this: This is a very simple example, but it shows the basic idea of how Kibana Graph works. It takes all the documents and creates a graph from them. There is some terminology involved here: the circles with the artist are known as nodes or vertices and the connection between those circles is known as an edge . There are also a few tuning options that we can use: Significant links Is a feature that helps clear out noisy terms from the dataset. This can be useful when the dataset is very noisy and documents have a large number of terms. This setting is expensive as Kibana Graph has to perform frequency checks on the Elasticsearch side for each request in order to compute the terms score, so it is recommended to turn it off if not strictly required. Certainity By default, this is set to a value of 3 , meaning the link between two artists has to appear at least 3 times in the dataset to be considered a link. I often reduce this value to 0 for other use cases, but for music, this value might be alright since I don't want to see one-time flukes in my graph. Instead, I want to see a relationship between songs I listened to more often. Turning down this value to 0 (or any other value) will increase the potential number of edges. For example, when listening once to a song that features the artists Jamie xx and Fred Again.. , this is enough for the relationship to show up with a value 0 . In contrast, setting it to something higher like 3 means I need to listen to the song at least 3-4 times to see a connection between those two artists. Sample Size The graph doesn't read all documents from the index. Instead, it relies on a sample approach to create the graph. This is done to keep the performance of the graph high. We can change this number to whatever we think is representative of our dataset. However, don't forget to adjust the timeout value as you increase the sample size. Timeout This is easy to explain. It refers to how long Elasticsearch has time to report back. Using Kibana Graph With the fundamentals explained, go to Kibana and click on the Graph app. You will need to select a data view, which is the Spotify History . When prompted to select a field, use the artist field. By default, that should turn out violet with a musical 16th note on it. I had to adjust my sample size to 5000 to get a good starting graph. We can tell that we have multiple artists that are connected to each other. This allows us to select one of those artists and press the + sign. 
There are already some clusters forming, which is important. Those standalone artists are not interesting to us. There is an all button in the right panel. Select it and press the + again. This will now explode your graph and pull in the additional artists. If we continue this process, increase the sample size and start exploding the graph more and more. Depending on your listening style, you should either get a lot of little islands or a few big clusters. In the next picture, we see one big cluster in the middle that interconnects Rudimental , Fred again.. , and Jamie XX . This makes sense, as all of them belong to the same genre which heavily features the same artists. At the same time, we have some tinier islands around. Kraftklub is a German band and is connected to mostly all of the German music I listen to, like Casper and Blond . There are some isolated vertices such as Harry Styles . Let's dig into why Harry Styles is alone. Does he not feature anyone? How does he fit into my listening behavior when all of my other listened-to music is more or less connected based on the featuring of artists? Go to Discover and perform the following: In the search bar, write artist: \"Harry Styles\" . This filters down to all documents that have Harry Styles in the name. We can simply click on the field artist in the field picker on the left side and see that there is only 1 value. Even though I listened to Harry Styles 2535 times, he has never featured another artist (or at least, according to Spotify data, it is not listed as such). Compare that to e.g. Jamie XX and we can see the difference. Conclusion In this blog, we explored relationships and how easy it is to leverage Kibana Graph and Elasticsearch's graph capabilities. Stay tuned for more parts in this blog series! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is a relationship? How can we explore relationships? Using Kibana Graph Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Detecting relationships in data: Spotify Wrapped, part 4 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/find-relationships-in-data", + "meta_description": "Learn how to find relationships in data with an example. We'll leverage Kibana Graph & Elasticsearch's graph capabilities to explore data relationships." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to implement image similarity search in Elasticsearch Searching through images to find the right one has always been challenging. With similarity image search, you can create a more intuitive search experience. Learn how to implement image search in Elastic. Generative AI RO By: Radovan Ondas On June 20, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Learn how to implement similarity image search in Elastic in just a few steps. Start setting up the application environment, then import the NLP model, and finally complete generating embeddings for your set of images. Get an overview of image similarity search with Elastic >> How to set up your environment The first step is setting up the environment for your application. General requirements include: Git Python 3.9 Docker Hundreds of images It is important to use hundreds of images to ensure the best results. Go to the working folder and check out the repository code created. Then navigate to the repository folder. Because you will be using Python to run the code, you need to make sure all requirements are met and the environment is ready. Now create the virtual environment and install all the dependencies. Elasticsearch cluster and embedding model Log in to your account to spin up an Elasticsearch cluster. Set up a small cluster with: One HOT node with 2GB of memory One ML (Machine learning) node with 4GB of memory (The size of this node is important as the NLP model you will import into Elasticsearch consumes ~1.5GB of memory.) After your deployment is ready, go to Kibana and check the capacity of your machine learning nodes. You will see one machine learning node in the view. There is no model loaded at the moment. Upload the CLIP embedding model from OpenAI using the Eland library. Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch and is able to handle both text and images. You'll use this model to generate embeddings from the text input and query for matching images. Find more details in the documentation of the Eland library. For the next step, you will need the Elasticsearch endpoint. You can get it from the Elasticsearch cloud console in the deployment detail section. Using the endpoint URL, execute the following command in the root directory of the repository. 
The Eland client will connect to the Elasticsearch cluster and upload the model into the machine learning node. You refer to your actual cluster URL with the –url parameter, for example, below refers to ‘image-search.es.europe-west1.gcp.cloud.es.io’ as cluster URL. Enter the Eland import command. The output will be similar to the following: The upload might take a few minutes depending on your connection. When finished, check the list of Trained models on the machine learning Kibana page: Menu -> Analytics -> Machine Learning -> Model management ->Trained models. Verify that the NLP Clip model is in the state ‘started’. If you receive a message on the screen — ML job and trained model synchronization required — click on the link to synchronize models. How to create image embeddings After setting up the Elasticsearch cluster and importing the embedding model, you need to vectorize your image data and create image embeddings for every single image in your data set. To create image embeddings, use a simple Python script. You can find the script here: create-image-embeddings.py . The script will traverse the directory of your images and generate individual image embeddings. It will create the document with the name and relative path and save it into an Elasticsearch index ‘ my-image-embeddings ’ using the supplied mapping . Put all your images (photos) into the folder ‘ app%2Fstatic%2Fimages ’. Use a directory structure with subfolders to keep the images organized. Once all images are ready, execute the script with a few parameters. It is crucial to have at least a few hundred images to achieve reasonable results. Having too few images will not give the expected results, as the space you will be searching in will be very small and distances to search vectors will be very similar. In the folder image_embeddings, run the script and use your values for the variables. Depending on the number of images, their size, your CPU, and your network connection, this task will take some time. Experiment with a small number of images before you try to process the full data set. After the script completes, you can verify if the index my-image-embeddings exists and has corresponding documents using the Kibana dev tools. Looking at the documents, you will see very similar JSON objects (like the example). You will see the image name, image id, and the relative path inside the images folder. This path is used in the frontend application to properly display the image when searching. The most important part of the JSON document is the ‘ image_embedding ’ that contains the dense vector produced by the CLIP model. This vector is used when the application is searching for an image or a similar image. Use the Flask application to search images Now that your environment is all set up, you can take the next step and actually search images using natural language and find similar images, using the Flask application that we provide as a proof of concept. The web application has a simple UI that makes image search simple. You can access the prototype Flask application in this GitHub repo . The application in the background performs two tasks. After you input the search string into the search box, the text will be vectorized using the machine learning _infer endpoint. Then, the query with your dense vector is executed against the index my-image-embeddings with the vectors. You can see those two queries in the example. The first API call uses the _infer endpoint, and the result is a dense vector. 
In the second task, search query, we will utilize the dense vector and get images sorted by score. To get the Flask application up and running, navigate to the root folder of the repository and configure the .env file. The values in the configuration file are used to connect to the Elasticsearch cluster. You need to insert values for the following variables. These are the same values used in the image embedding generation. ES_HOST='URL:PORT' ES_USER='elastic' ES_PWD='password' When ready, run the flask application in the main folder and wait until it starts. If the application starts, you will see an output similar to the below, which at the end indicates which URL you need to visit to access the application. Congrats! Your application should now be up and running and accessible on http:%2F%2F127.0.0.1:5001 via the internet browser. Navigate to the image search tab and input the text that describes your image best. Try to use a non-keyword or descriptive text. In the example below, the text entered was “ endless route to the top .” The results are shown from our data set. If a user likes one particular image in the result set, simply click the button next to it, and similar images will display. Users can do this endless times and build their own path through the image data set. The search also works by simply uploading an image. The application will convert the image into a vector and search for a similar image in the data set. To do this, navigate to the third tab Similar Image , upload an image from the disk, and hit Search . Because the NLP ( sentence-transformers%2Fclip-ViT-B-32-multilingual-v1 ) model we are using in Elasticsearch is multilingual and supports inference in many languages, try to search for the images in your own language. Then verify the results by using English text as well. It’s important to note that the models used are generic models, which are pretty accurate but the results you get will vary depending on the use case or other factors. If you need higher accuracy, you will have to adapt a generic model or develop your own model — the CLIP model is just intended as a starting point. Code summary You can find the complete code in the GitHub repository . You may be inspecting the code in routes.py , which implements the main logic of the application. Besides the obvious route definition, you should focus on methods that define the _infer and _search endpoints ( infer_trained_model and knn_search_images ). The code that generates image embeddings is located in create-image-embeddings.py file. Summary Now that you have the Flask app set up, you can search your own set of images with ease! Elastic provides native integration of vector search within the platform, which avoids communication with external processes. You get the flexibility to develop and employ custom embedding models that you may have developed using PyTorch. Semantic image search delivers the following benefits of other traditional approaches to image search: Higher accuracy: Vector similarity captures context and associations without relying on textual meta descriptions of the images. Enhanced user experience: Describe what you’re looking for, or provide a sample image, compared to guessing which keywords may be relevant. Categorization of image databases: Don’t worry about cataloging your images — similarity search finds relevant images in a pile of images without having to organize them. 
If your use case relies more on text data, you can learn more about implementing semantic search and applying natural language processing to text in previous blogs. For text data, a combination of vector similarities with traditional keyword scoring presents the best of both worlds. Ready to get started? Sign up for a hands-on vector search workshop at our virtual event hub and engage with the community in our online discussion forum .", + "title": "How to implement image similarity search in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/implement-image-similarity-search-elastic", + "meta_description": "Searching through images to find the right one has always been challenging. With similarity image search, you can create a more intuitive search experience. Learn how to implement image search in Elastic." + }, + { + "text": "Evaluating search relevance Blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes. 
Part1 ML Research Python July 16, 2024 Evaluating search relevance part 1 - The BEIR benchmark Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes. TP TV By: Thanos Papaoikonomou and Thomas Veasey Part2 ML Research Python September 19, 2024 Evaluating search relevance part 2 - Phi-3 as relevance judge Using the Phi-3 language model as a search relevance judge, with tips & techniques to improve the agreement with human-generated annotation. TP TV By: Thanos Papaoikonomou and Thomas Veasey Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Evaluating search relevance - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/evaluating-search-relevance", + "meta_description": "Blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Making Lucene faster with vectorization and FFI/madvise Discover how modern Java features, including vectorization and FFI/madvise, are speeding up Lucene's performance. Lucene CH By: Chris Hegarty On April 17, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Over in Lucene-land we've been eagerly adopting features of new Java versions. These features bring Lucene closer to both the JVM and the underlying hardware, which improves performance and stability. This keeps Lucene modern and competitive. The next major release of Lucene, Lucene 10, will require a minimum of Java 21. Let's take a look at why we decided to do this and how it will benefit Lucene. Foreign memory For efficiency reasons, indices and their various supporting structures are stored outside of the Java heap - they are stored on-disk and mapped into the process' virtual address space. Until recently, the way to do this in Java is with direct byte buffers, which is exactly what Lucene has been doing. Direct byte buffers have some inherent limitations. For example, they can address a maximum of 2GB, requiring more structures and code to span over larger sizes. However, most significant is the lack of deterministic closure, which we workaround by calling Unsafe::invokeCleaner to effectively close the buffer and release the memory. This is, as the name suggests, inherently an unsafe operation. 
Lucene adds safeguards around this, but, by definition, there is still a miniscule risk of failure if memory were to be accessed after it was released. More recently Java has added MemorySegment , which overcomes the limitations that we encounter with direct byte buffers. We now have safe deterministic closure and can address memory far beyond that of previous limits. While Lucene 9.x already has optional support for a mapped directory implementation backed by memory segments, the upcoming Lucene 10 drops support for byte buffers. All this means that Lucene 10 only operates with memory segments, so is finally operating in a safe model. Foreign function Different workloads; search or indexing, or different types of data, say, doc values or vector embeddings, have different access patterns. As we've seen, because of the way Lucene maps its index data, interaction with the operating system page cache is crucial to performance. Over the years a lot of effort and consideration has been given to optimizations around memory usage and page cache. First through native JNI code that calls madvise directly, and later with a directory implementation that uses direct I/O. However, while good at the time, both these solutions are a little less than ideal. The former requires platform specific builds and artifacts, and the latter leverages an optional JDK-specific API . For these reasons, neither solution is part of Lucene core, but instead lives in the further afield misc module. Mike McCandless has a good blog about this, from 2010! On modern Java we can now use the Panama Foreign Function Interface (FFI) to call library functions native on the system. We use this, directly in Lucene core , to call posix_madvise from the Standard C library - all from Java, and without the need for any JNI code or non-standard features. With this we can now advise the system about the type of memory access patterns we intend to use. Vectorization Parallelism and concurrency, while distinct, often translate to \"splitting a task so that it can be performed more quickly\", or \"doing more tasks at once\". Lucene is continually looking at new algorithms and striving to implement existing ones in more performant and efficient ways. One area that is now more straightforward to us in Java is data level parallelism - the use of SIMD (Single Instruction Multiple Data) vector instructions to boost performance. Lucene is using the latest JDK Vector API to implement vector distance computations that result in efficient hardware specific SIMD instructions. These instructions, when run on supporting hardware, can perform floating point dot product computations 8 times faster than the equivalent scalar code. This blog contains more specific information on this particular optimization. With the move to Java 21 minimum, it is a lot more straightforward to see how we can use the JDK Vector API in more places. We're even experimenting with the possibility of calling customized SIMD implementations with FFI, since the overhead of the native call is now quite minimal. Conclusion While the latest Lucene 9.x releases are able to benefit from many of the recent Java features, the requirement to run on versions of Java as early as Java 11 means that we're reaching a level of complexity with 9.x that, while maybe still ok today, is not where we want to be in the future. The upcoming Lucene 10 will be closer to the JVM and the hardware than ever before. 
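As a conceptual aside, the effect of advising the operating system about access patterns can be illustrated from Python with mmap.madvise. This is only a rough analogue of what Lucene now does natively through the FFI call to posix_madvise, and the file name below is just a placeholder.

```python
import mmap

# Map a file into memory and advise the kernel about how it will be accessed.
# "example.index" is a placeholder path, not a real Lucene file.
with open("example.index", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, prot=mmap.PROT_READ)
    # Vector data is read in a random-access pattern during graph search,
    # so discouraging aggressive read-ahead can help on some systems.
    if hasattr(mmap, "MADV_RANDOM"):
        mm.madvise(mmap.MADV_RANDOM)
    # A sequential scan (e.g. during a merge) would prefer MADV_SEQUENTIAL instead.
    mm.close()
```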
By requiring a minimum of Java 21, we are able to drop the older direct byte buffer directory implementation, reliably advise the system about memory access patterns through posix_madvise , and continue our efforts around levering hardware accelerated instructions. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Foreign memory Foreign function Vectorization Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Making Lucene faster with vectorization and FFI/madvise - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/lucene-and-java-moving-forward-together", + "meta_description": "Discover how modern Java features, including vectorization and FFI/madvise, are speeding up Lucene's performance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to make your own Spotify Wrapped in Kibana Based on the downloadable Spotify personal history, we'll make a custom version of \"Spotify Wrapped\" with the top artists, songs, and trends over the year How To IF By: Iulia Feroli On January 14, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
I am probably not the only one who was a little disapointed by the Spotify Wrapped this year (and the internet seems to agree). Looking back at our yearly musical history has become a highly anticipated moment of the year for heavy Spotify users. However, at the end of the day all \"Wrapped\" is, is a data analytics problem, with a great PR team. So perhaps the mantle must fall on fellow data analysts to attempt to solve this problem in a more satisfying way. With the back-to-work and brand-new-year motivation fueling us - let's see if we can do any better. (Spoiler alert: we definitely can!) Getting started with your custom Spotify Wrapped The best part about this exercise is that it's fully replicable. Spotify allows users to download their own historical streaming data via this link , which you can request out of your account settings. If you want to generate your own version of this dashboard - request your own data to run this example through! Please note, that this could take a few days to a few weeks but should take no longer than 30 days. You will need to confirm that you would like this data and a copy will be sent to your email directly. Alternatively, you can try it out first on the sample data I've provided with a reduced sub-section from my own data. Once this has been generated we can dive into years worth of data and start building our own fun dashboards. Check out this notebook for the code examples. These were built and run using Elasticsearch 8.15 and Python 3.11. To get started with this notebook be sure to first clone the repository and download the required packages: Historical data will be generated as a list of JSON documents pre-formatted and chunked by Spotify, where each json represents an action. In most cases such an action means a song that you've listened to with some additional metadata such as length of time in milliseconds, artist information, as well as device properties. Naturally, if you have any experience with Elastic, the first thought looking at this data would be that this data practically screams \"add me to an index and search me!\". So we will do just that. Building an Elasticsearch Index As you can see in the same notebook once you've connected to your preferred Elasticsearch client (in my case Python, but any language client could be used), it takes a few simple lines of code to send the json documents into a new index: Even the mapping is handled automatically by Elastic due to the high quality and consistency of the data as prepared by Spotify, so you do not need to define it when indexing. One key element to pay attention to is noticing fields like \"Artist Name\" are seen as keywords which will allow us to run more complex aggregations for our dashboards. Wrapping queries to explore your listening habits With the index fully populated you can explore the data through code to run a few simple test queries. For example, my top artist has been Hozier for quite a few years now, so I start with the simplest possible term query to check my data: This gives me back 5653 hits - which means I've played more than five thousand Hozier songs since 2015 (as far as my data goes back). Seems pretty accurate. You can run the same test, or query any of the other fields like album name or song title with a simple text match query. The next steps in the notebook are to build more complex queries, like the most anticipated question - is my top artist list in Wrapped accurate? 
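Up to this point, the code amounts to something like the following sketch: bulk-index the exported JSON history, then run the simple artist term query. The file, index, and field names are assumptions based on the examples above; the notebook linked earlier is the reference implementation.

```python
import json
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("https://URL:PORT", basic_auth=("elastic", "password"))

# The Spotify export is a list of JSON documents, one per listening event.
# The file name below is illustrative; yours will differ.
with open("Streaming_History_Audio_2024.json") as f:
    docs = json.load(f)

# Send the documents into a new index; the mapping is inferred automatically.
helpers.bulk(es, ({"_index": "spotify-history", "_source": d} for d in docs))

# The simplest possible term query: count how often one artist was played.
# Assumes the artist name field is mapped as a keyword, as described above.
resp = es.search(
    index="spotify-history",
    query={"term": {"artist_name": "Hozier"}},
    size=0,
    track_total_hits=True,
)
print(resp["hits"]["total"]["value"], "plays")
```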
You can calculate this by either number of hits (how many times songs have been played) or perhaps more accurately, by summing up the total number of milliseconds of playtime by artist bucket. You can read more about aggregations in Elasticsearch here. Building Spotify Wrapped dashboards After these few examples you should have a good understanding of the Elasticsearch mechanics you can use to drill into this data. However, to both save time and make the insights more consumable (and pretty) you can also build a lot of these insights directly in a Kibana dashboard. In my case, I've run these in my cloud Elastic cluster but this can also be run locally. You first need to build a data view from the index and then you can directly build visualizations by dragging the data fields and choosing view types. The best part is - you can really make it your own by choosing the visualization type, date range you want to explore, which fields to showcase, etc. Just pick a visual and drag the needed fields into the graph. For most examples we will use ts (the time field) as a horizontal axis, and a combination of record counts or aggregations over the song_name or artist_name fields on the vertical axis. Within a few hours I built my own Spotify Wrapped - Iulia's Version, going deeper than ever before. Let's take a look. Starting with the \"classic\" wrapped insights - I've first built the top artist and song rank. Here's an example of how one of these graphs is built: Looking at the points of interest in this graph if you want to recreate it: make sure to select the correct time interval for your data to cover 2024 in 1 choose to show the top values of the artist name field, and exclude the other bucket to make your visualization neat in 2 map this against the count of records to rank the artists based on how many times they appear in the data (equivalent to time the songs were played) in 3 From here, I've gone even further by adding more metadata like time or location and looking at how these trends have changed throughout the year. Here you can see the listening time over the year (in weekly buckets), the locations I've been listening from while traveling, and how my top artists have varied month by month (including a sighting of brat summer). Some more tricks worth noting for these graphs: when you work with the playing time instead of just count of records, you should choose to aggregate all the instances of a song or artist being played by using the sum function. This is the kibana equivalent to the aggs operator we were using in the code in the first notebook examples. you can additionally convert the milliseconds into minutes or hours for neater visualisation *you can layer as many field breakdowns as you want, like for example adding the top 3 artists name on top of the monthly aggregations. Comparing my final dashboard to my actual Wrapped - it seems the results were close enough, but maybe not entirely accurate. It seems this year the top song choices are a little off from the way I calculate my ranking in this example. It could be that Spotify used a different formula to build this ranking, which makes it a bit harder to interpret. That's one of the benefits of building this dashboard from scratch - you have full transparency on the type of aggregations and scoring used for your insights. Finally, to really drive the point of the full customization benefit - I've had my colleague Elisheva also send me her own 2024 data. 
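(For those following along in code rather than Kibana, the playtime-per-artist ranking described earlier boils down to a terms aggregation with a sum sub-aggregation. A quick sketch, with index and field names assumed from the examples above:)

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://URL:PORT", basic_auth=("elastic", "password"))

# Bucket by artist, sum the milliseconds played, keep the top 5 buckets.
resp = es.search(
    index="spotify-history",
    size=0,
    aggs={
        "top_artists": {
            "terms": {"field": "artist_name", "size": 5},
            "aggs": {"total_ms": {"sum": {"field": "ms_played"}}},
        }
    },
)
for bucket in resp["aggregations"]["top_artists"]["buckets"]:
    hours = bucket["total_ms"]["value"] / 1000 / 60 / 60
    print(f"{bucket['key']}: {hours:.1f} hours")
```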
Here's another dashboard example, this time with a few more swifty insights: This time I've highlighted the albums breakdown since it gives some cool \"Eras\" insights' and added the \"hours of playtime per album per month\" - from which you can really pinpoint when Tortured Poets came out as an extra fun treat. Make your own Spotify Wrapped Just having the data stored in an index makes this a really fun and simple Elasticsearch use case, really showcasing some of the coolest features like aggregations or custom visualizations - and I hope to inspire you to try out your very own search engine and personal dashboard! Explore other parts of this series: Spotify Wrapped part 2: Diving deeper into the data Spotify Wrapped part 3: Anomaly detection population jobs Spotify Wrapped part 4: Detecting relationships in data Spotify Wrapped part 5: Finding your best music friend with vectors Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Getting started with your custom Spotify Wrapped Building an Elasticsearch Index Wrapping queries to explore your listening habits Building Spotify Wrapped dashboards Make your own Spotify Wrapped Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to make your own Spotify Wrapped in Kibana - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/spotify-wrapped-create-in-kibana", + "meta_description": "Here's how you can make your own Spofity Wrapped in Kibana with the top artists, songs, and trends over the year." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data from AWS S3 into Elastic Cloud - Part 1 : Elastic Serverless Forwarder Learn how to ingest data from AWS S3 using Elastic Serverless Forwarder (ESF). Ingestion How To HL By: Hemendra Singh Lodhi On October 2, 2024 Part of Series How to ingest data from AWS S3 into Elastic Cloud Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. This is the first installment in a multi-part blog series exploring different options for ingesting data from AWS S3 into Elastic Cloud . Elasticsearch offers several options for ingesting data from AWS S3 buckets, allowing customers to select the most suitable method based on their specific needs and architectural strategy. These are the key options for ingesting data from AWS S3: Elastic Serverless Forwarder (ESF) - our focus in this blog Elastic Agent - part 2 Elastic S3 Native Connector - part 3 Data ingestion options comparison Features ESF Elastic Agent S3 Connector Logs ✅ ✅ ✅[[^1]] Metrics ❌ ✅ ✅[[^2]] Cost Medium-Lambda,SQS Low-EC2,SQS Low-Elastic Enterprise Search Scaling Auto - Unlimited EC2 instance size Enterprise Search Node size Operation Low - Monitor Lambda function High - Manage Agents Low PrivateLink ✅ ✅ NA (Pull from S3) Primary Use Case Logs Logs & Metrics Content & Search Note1: ESF doesn't support metrics collection due to AWS limitation on services that can trigger Lambda function and you can't invoke Lambda using subscription filter on CloudWatch metrics. However, taking cost consideration into account it is possible to store metrics in S3 and via SQS trigger ingest into Elastic. Note2: [[^1]][[^2]]Although S3 connector can pull logs and metrics from S3 bucket, it is most suited for ingesting content, files, images and other data types In this blog we will focus on how to ingest data from AWS S3 using Elastic Serverless Forwarder(ESF). In the next parts, we will explore Elastic Agent and Elastic S3 Native Connector methods. Let's begin. Follow these steps to launch the Elastic Cloud deployment: Elastic Cloud Create an account if not created already and create an Elastic deployment in AWS . Once the deployment is created, note the Elasticsearch endpoint. This can be found in the Elastic Cloud console under -> Manage -> Deployments . Elastic Serverless Forwarder The Elastic Serverless Forwarder is an AWS Lambda function that forward logs such as VPC Flow logs, WAF, Cloud Trail etc. from AWS environment to Elastic. It can be used to send data to Elastic Cloud as well as self-managed deployment. Features of Elastic Serverless Forwarder Support multiple inputs S3 (via SQS event notification) Kinesis Data Streams CloudWatch Logs subscription filters SQS message payload At least once delivery using \"continuing queue\" and \"replay queue\" (created automatically by serverless forwarder) Support data transfer over PrivateLink which allows data transfer within the AWS Virtual Private Cloud (or VPC) and not on public network. 
Lambda function is an AWS Serverless compute managed service with automatic scaling in response to code execution request Function execution time is optimised with optimal memory size allocated as required Pay as you go pricing, only pay for compute time during Lambda function execution and for SQS event notification Data flow: I ngesting data from AWS S3 with Elastic Serverless Forwarder We will use S3 input with SQS notification to send VPC flow logs to Elastic Cloud: VPC flow log is configured to write to S3 bucket Once log is written to S3 bucket, S3 event notification (S3:ObjectCreated) is sent to SQS SQS event notification containing event metadata triggers the Lambda function which read the logs from the bucket Continuing queue is created when forwarder is deployed and ensures at least once delivery. Forwarder keeps track of last event sent and helps in processing pending events when forwarder function exceed runtime of 15 min (Lambda max default) Replay queue is also created when forwarder is deployed and handles log ingestion exceptions. Forwarder keeps track of failed events and writes them to the replay queue for later ingestion. For e.g. in my testing, I put the wrong Elastic API key, causing authentication failure, which filled up the replay queue. You can enable the replay queue as a trigger for the ESF lambda function to consume the messages from the S3 bucket again. It is important to address the delivery failure first; otherwise message will accumulate in the replay queue. You can set this trigger permanently but may need to remove/re-enable depending on the message failure issue. To enable the trigger go to SQS -> elastic-serverless-forwarder-replay-queue- -> under Lambda triggers -> Configure Lambda function trigger -> Select the ESF lamnda function Setting up Elastic Serverless Forwarder for AWS S3 data ingestion Create S3 Bucket s3-vpc-flow-logs-elastic to store VPC flow logs AWS Console -> S3 -> Create bucket. You may leave other settings as default or change as per the requirements: Copy the bucket ARN, required to configure flow logs in next step: Enable VPC Flow logs and send to S3 bucket s3-vpc-flow-logs-elastic AWS Console -> VPC -> Select VPC -> Flow logs. Leave other settings as is or change as per the requirements: Provide name of the flow logs, select what filters to apply, aggregation interval and destination for the flow log storage: Once done, it will look like below with S3 as the destination. Going forward all the flow traffic through this VPC will be stored in the bucket s3-vpc-flow-logs-elastic : Create SQS queue Note 1: Create SQS queue in same region as S3 bucket Note 2: Set the visiblity timeout of 910 second which is 10 sec more than AWS Lambda function max runtime of 900 sec. AWS Console -> Amazon SQS -> Create queue Provide queue name and update visiblity timeout to 910 sec. Lambda function runs for max 900 sec (15min) and setting a higher value for visibility timeout allows consumer Elastic Serverless Forwarder(ESF) to process and delete the message from the queue: Update the SQS Access Policy (Advance) to allow S3 bucket to send notification to SQS queue. Replace account-id with your AWS account ID. Keep other options as default. Here, we are specifying S3 to send message to SQS queue (ARN) from the S3 bucket: More details on permission requirement (IAM user) for AWS integration is available here . Copy the SQS ARN, in queue setting under Details : Enable VPC flow log event notification in S3 bucket AWS Console > S3. 
Select bucket s3-vpc-flow-logs-elastic -> Properties and Create event notification Provide name and on what event type you want to trigger SQS. We have selected object create when any object is added to the bucket: Select destination as SQS queue and choose sqs-vpc-flow-logs-elastic-serverless-forwarder : Once saved, configuration will look like below: Create another S3 bucket to store configuration file for Elastic Serverless Forwarder: Create a file named config.yaml and update with below configuration. Full set of options here : input type : s3-sqs . We are using S3 with SQS notification option output : elasticsearch_url : elasticsearch endpoint from Elastic Cloud deployment Create section above api_key : Create Elasticsearch API key (User API key) using instruction here es_datastream_name : forwarder supports automatic routing of aws.cloudtrail, aws.cloudwatch_logs, aws.elb_logs, aws.firewall_logs, aws.vpcflow, and aws.waf logs . For other log types you can set it to the naming convention required. Leave other options as default. Upload the config.yaml in s3 bucket s3-vpc-flow-logs-serverless-forwarder-config : Install AWS integration assets Elastic integrations comes pre-packaged with assets that simplify collection, parsing , indexing and visualisation. The integrations uses data stream with specific naming convention for indices which is helpful in getting started. Forwarder can write to any other stream name too. Follow the steps to install Elastic AWS integration. Kibana -> Management -> Integrations, Search for AWS: Deploy the Elastic Serverless Forwarder There are several options available to deploy Elastic Serverless Forwarder from SAR (Serverless Application Repository): Using AWS Console Using AWS Cloudformation Using Terraform Deploy directly which provides more customisation options We will use AWS Console option to deploy ESF. Note : Only one deployment per region is allowed when using the AWS console directly. AWS Console -> Lambda -> Application -> Create Application , search for elastic-serverless-forwarder: Under Application settings provide the following details: Application name - elastic-serverless-forwarder ElasticServerlessForwarderS3Buckets - s3-vpc-flow-logs-elastic ElasticServerlessForwarderS3ConfigFile - s3://s3-vpc-flow-logs-serverless-forwarder-config/config.yaml ElasticServerlessForwarderS3SQSEvent - arn:aws:sqs:ap-southeast-2:xxxxxxxxxxx:sqs-vpc-flow-logs-elastic-serverless-forwarder On successful deployment, status of Lambda deployment should be Create Complete : Below are the SQS queues automatically created upon successful deployment of ESF: Once everything is set up correctly, published flow logs in S3 bucket s3-vpc-flow-logs-elastic will send notification to SQS and you will see the messages available in the queue sqs-vpc-flow-logs-elastic-serverless-forwarder to be consumed by ESF. In case of issues such as SQS message count keep on increasing then check the Lambda execution logs Lambda -> Application -> serverlessrepo-elastic-serverless-forwarder-ElasticServerlessForwarderApplication* -> Monitoring -> Cloudwatch Log Insights. Click on LogStream for detailed information: More on troubleshooting here . Validate VPC flow logs in Kibana Discover and Dashboard Kibana -> Discover . This will show VPC flow logs: Kibana -> Dashboards . Look for VPC Flow log Overview dashboard: More dashboards! As mentioned earlier, AWS integration provides pre-built dashboards in addition to other assets. 
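For reference, a minimal sketch of what the config.yaml described above might look like. The SQS ARN, Elasticsearch endpoint, and API key are placeholders, and the full set of options is in the ESF documentation linked earlier.

```yaml
# Minimal ESF configuration sketch; replace the placeholders with your own values.
inputs:
  - type: "s3-sqs"
    id: "arn:aws:sqs:ap-southeast-2:ACCOUNT_ID:sqs-vpc-flow-logs-elastic-serverless-forwarder"
    outputs:
      - type: "elasticsearch"
        args:
          elasticsearch_url: "https://YOUR_DEPLOYMENT.es.ap-southeast-2.aws.found.io:443"
          api_key: "ENCODED_API_KEY"
          # VPC flow logs are routed automatically when using this data stream name.
          es_datastream_name: "logs-aws.vpcflow-default"
```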
We can monitor involved AWS services in our setup using the Elastic agent ingestion method which we will cover in Part 2 of this series. This will help in tracking usage and help in optimisation. Conclusion Elasticsearch provides multiple options to sync data from AWS S3 into Elasticsearch deployments. In this walkthrough, we have demonstrated that it is relatively easy to implement Elastic Serverless Forwarder(ESF) ingestion options to ingest data from AWS S3 and leverage Elastic's industry-leading search & analytics capabilities. In Part 2 of this series , we'll dive into using Elastic Agent as another option for ingesting AWS S3 data. And in part 3 , we'll explain how to ingest data from AWS S3 using the Elastic S3 Native connector. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Data ingestion options comparison Elastic Cloud Elastic Serverless Forwarder Features of Elastic Serverless Forwarder Data flow: Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to ingest data from AWS S3 into Elastic Cloud - Part 1 : Elastic Serverless Forwarder - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ingest-aws-s3-data-elastic-cloud-elastic-serverless-forwarder", + "meta_description": "Learn how to ingest data from AWS S3 into Elastic Cloud using the Elastic Serverless Forwarder. Follow this guide to start the AWS S3 ingesting process." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Ruby Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Ruby +1 December 13, 2024 How to migrate your Ruby app from OpenSearch to Elasticsearch A guide to migrate a Ruby codebase from the OpenSearch client to the Elasticsearch client. FB By: Fernando Briano ES|QL Ruby +1 October 24, 2024 How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. FB By: Fernando Briano Ruby How To October 16, 2024 How to use Elasticsearch with popular Ruby tools Take a look at how to use Elasticsearch with some popular Ruby libraries. FB By: Fernando Briano Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Ruby - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/ruby-programming", + "meta_description": "Ruby articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Better Binary Quantization (BBQ) in Lucene and Elasticsearch How Better Binary Quantization (BBQ) works in Lucene and Elasticsearch. Lucene Vector Database BT By: Benjamin Trent On November 11, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Embedding models output float32 vectors, often too large for efficient processing and practical apps. Elasticsearch supports int8 scalar quantization to reduce vector size while preserving performance. Other methods reduce retrieval quality and are impractical for real world use. In Elasticsearch 8.16 and Lucene, we introduced Better Binary Quantization (BBQ), a new approach developed from insights drawn from a recent technique - dubbed “ RaBitQ ” - proposed by researchers from Nanyang Technological University, Singapore. BBQ is a leap forward in quantization for Lucene and Elasticsearch, reducing float32 dimensions to bits, delivering ~95% memory reduction while maintaining high ranking quality. BBQ outperforms traditional approaches like Product Quantization (PQ) in indexing speed (20-30x less quantization time), query speed (2-5x faster queries), with no additional loss in accuracy. In this blog, we will explore BBQ in Lucene and Elasticsearch, focusing on recall, efficient bitwise operations, and optimized storage for fast, accurate vector search. 
Note, there are differences in this implementation than the one proposed by the original RaBitQ authors. Mainly: Only a single centroid is used for simple integration with HNSW and faster indexing Because we don't randomly rotate the codebook we do not have the property that the estimator is unbiased over multiple invocations of the algorithm Rescoring is not dependent on the estimated quantization error Rescoring is not completed during graph index search and is instead reserved only after initial estimated vectors are calculated Dot product is fully implemented and supported. The original authors focused on Euclidean distance only. While support for dot product was hinted at, it was not fully considered, implemented, nor measured. Additionally, we support max-inner product, where the vector magnitude is important, so simple normalization just won't suffice. What does the \"better\" in Better Binary Quantization mean? In Elasticsearch 8.16 and in Lucene, we have introduced what we call \"Better Binary Quantization\". Naive binary quantization is exceptionally lossy and achieving adequate recall requires gathering 10x or 100x additional neighbors to rerank. This just doesn't cut it. In comes Better Binary Quantization! Here are some of the significant differences between Better Binary Quantization and naive binary quantization: All vectors are normalized around a centroid. This unlocks some nice properties in quantization. Multiple error correction values are stored. Some of these corrections are for the centroid normalization, some are for the quantization. Asymmetric quantization. Here, while the vectors themselves are stored as single bit values, queries are only quantized down to int4. This significantly increases search quality at no additional cost in storage. Bit-wise operations for fast search. The query vectors are quantized and transformed in such a way that allows for efficient bit-wise operations. Indexing with Better Binary Quantization (BBQ) Indexing is simple. Remember, Lucene builds individual read only segments. As vectors come in for a new segment the centroid is incrementally calculated. Then once the segment is flushed, each vector is normalized around the centroid and quantized. Here is a small example: v 1 = [ 0.56 , 0.85 , 0.53 , 0.25 , 0.46 , 0.01 , 0.63 , 0.73 ] c = [ 0.65 , 0.65 , 0.52 , 0.35 , 0.69 , 0.30 , 0.60 , 0.76 ] v c 1 ′ = v 1 − c = [ − 0.09 , 0.19 , 0.01 , − 0.10 , − 0.23 , − 0.38 , − 0.05 , − 0.03 ] b i n ( v c 1 ′ ) = { { 1 x > 0 0 o t h e r w i s e : x ∈ v c 1 ′ } b i n ( v c 1 ′ ) = [ 0 , 1 , 1 , 0 , 0 , 0 , 0 , 0 ] 0 b 00000110 = 6 v_{1} = [0.56, 0.85, 0.53, 0.25, 0.46, 0.01 , 0.63, 0.73] \\newline c = [0.65, 0.65, 0.52, 0.35, 0.69, 0.30, 0.60, 0.76] \\newline v_{c1}' = v_{1} - c = [-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03] \\newline bin(v_{c1}') = \\left\\{ \\begin{cases} 1 & x\\gt 0 \\\\ 0 & otherwise \\end{cases} : x \\in v_{c1}'\\right\\} \\newline bin(v_{c1}') = [0, 1, 1, 0, 0, 0, 0, 0] \\newline 0b00000110 = 6 v 1 ​ = [ 0.56 , 0.85 , 0.53 , 0.25 , 0.46 , 0.01 , 0.63 , 0.73 ] c = [ 0.65 , 0.65 , 0.52 , 0.35 , 0.69 , 0.30 , 0.60 , 0.76 ] v c 1 ′ ​ = v 1 ​ − c = [ − 0.09 , 0.19 , 0.01 , − 0.10 , − 0.23 , − 0.38 , − 0.05 , − 0.03 ] bin ( v c 1 ′ ​ ) = { { 1 0 ​ x > 0 o t h er w i se ​ : x ∈ v c 1 ′ ​ } bin ( v c 1 ′ ​ ) = [ 0 , 1 , 1 , 0 , 0 , 0 , 0 , 0 ] 0 b 00000110 = 6 When quantizing down to the bit level, 8 floating point values are transformed into a single 8bit byte. 
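A tiny Python sketch of this quantization step, using the example vector and centroid from above: center on the centroid, keep only the sign of each component, and pack the eight bits into a single byte. This is an illustration of the math, not Lucene's actual implementation.

```python
import numpy as np

# Example vector and centroid from the text.
v = np.array([0.56, 0.85, 0.53, 0.25, 0.46, 0.01, 0.63, 0.73])
c = np.array([0.65, 0.65, 0.52, 0.35, 0.69, 0.30, 0.60, 0.76])

# Center on the centroid: approximately [-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03]
centered = v - c

# Keep only the sign of each component: [0, 1, 1, 0, 0, 0, 0, 0]
bits = (centered > 0).astype(np.uint8)

# Pack 8 bits into one byte, least significant bit first: 0b00000110 == 6
packed = np.packbits(bits, bitorder="little")
print(bits, packed)  # [0 1 1 0 0 0 0 0] [6]
```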
Then, each of the bits are packed into a byte and stored in the segment along with any error correction values required for the vector similarity chosen. For each vector, bytes stored are dims/8 number of bytes and then any error correction values; 2 floating point values for Euclidean, or 3 for dot product. A quick side quest to talk about how we handle merging When segments are merged, we can take advantage of the previously calculated centroids. Simply doing a weighted average of the centroids and then re-quantizing the vectors around the new centroid. What gets tricky is ensuring HNSW graph quality and allowing the graph to be built with the quantized vectors. What's the point of quantizing if you still need all the memory to build the index?! In addition to appending vectors to the existing largest HNSW graph, we need to ensure vector scoring can take advantage of asymmetric quantization. HNSW has multiple scoring steps: one for the initial collection of neighbors, and another for ensuring only diverse neighbors are connected. In order to efficiently use asymmetric quantization, we create a temporary file of all vectors quantized as 4bit query vectors. So, as a vector is added to the graph we first: Get the already quantized query vector that is stored in the temporary file. Search the graph as normal using the already existing bit vectors. Once we have the neighbors, diversity and reverse-link scoring can be done with the previously int4 quantized values. After the merge is complete, the temporary file is removed leaving only the bit quantized vectors. The temporary file stores each query vector as an int4 byte array which takes dims/2 number of bytes, some floating point error correction values (3 for Euclidean, 4 for dot product), and a short value for the sum of the vector dimensions. Asymmetric quantization, the interesting bits I have mentioned asymmetric quantization and how we lay out the queries for graph building. But, how are the vectors actually transformed? How does it work? The \"asymmetric\" part is straight forward. We quantize the query vectors to a higher fidelity. So, doc values are bit quantized and query vectors are int4 quantized. What gets a bit more interesting is how these quantized vectors are transformed for fast queries. Taking our example vector from above, we can quantize it to int4 centered around the centroid. 
v c 1 ′ = v 1 − c = [ − 0.09 , 0.19 , 0.01 , − 0.10 , − 0.23 , − 0.38 , − 0.05 , − 0.03 ] m a x v c 1 ′ = m a x ( v c 1 ′ ) = 0.19 , m i n v c 1 ′ = m i n ( v c 1 ′ ) = − 0.38 Q ( x s ) = { ( x − m i n v c 1 ′ ) × 15 m a x v c 1 ′ − m i n v c 1 ′ : x ∈ x s } Q ( v c 1 ′ ) = { ( x − ( − 0.38 ) ) × 15 0.19 − ( − 0.38 ) : x ∈ v c 1 ′ } = { ( x + 0.38 ) × 26.32 : x ∈ v c 1 ′ } = [ 8 , 15 , 10 , 7 , 4 , 0 , 9 , 9 ] v_{c1}' = v_{1} - c = [-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03] \\newline max_{v_{c1}'}=max(v_{c1}')=0.19,min_{v_{c1}'}=min(v_{c1}')=-0.38 \\newline Q(x_{s}) = \\{(x-min_{v_{c1}'}) \\times \\frac{15}{max_{v_{c1}'} - min_{v_{c1}'}} : x \\in x_{s} \\} \\newline Q(v_{c1}') = \\{(x-(-0.38)) \\times \\frac{15}{0.19 -(-0.38)} : x \\in v_{c1}' \\} \\newline = \\{(x + 0.38) \\times 26.32 : x \\in v_{c1}' \\} \\newline = [8, 15, 10, 7, 4, 0, 9, 9] v c 1 ′ ​ = v 1 ​ − c = [ − 0.09 , 0.19 , 0.01 , − 0.10 , − 0.23 , − 0.38 , − 0.05 , − 0.03 ] ma x v c 1 ′ ​ ​ = ma x ( v c 1 ′ ​ ) = 0.19 , mi n v c 1 ′ ​ ​ = min ( v c 1 ′ ​ ) = − 0.38 Q ( x s ​ ) = {( x − mi n v c 1 ′ ​ ​ ) × ma x v c 1 ′ ​ ​ − mi n v c 1 ′ ​ ​ 15 ​ : x ∈ x s ​ } Q ( v c 1 ′ ​ ) = {( x − ( − 0.38 )) × 0.19 − ( − 0.38 ) 15 ​ : x ∈ v c 1 ′ ​ } = {( x + 0.38 ) × 26.32 : x ∈ v c 1 ′ ​ } = [ 8 , 15 , 10 , 7 , 4 , 0 , 9 , 9 ] With the quantized vector in hand, this is where the fun begins. So we can translate the vector comparisons to a bitwise dot product, the bits are shifted. Its probably better to just visualize what is happening: Here, each int4 quantized value has its relative positional bits shifted to a single byte. Note how all the first bits are packed together first, then the second bits, and so on. But how does this actually translate to dot product? Remember, dot product is the sum of the component products. For the above example, let's write this fully out: b i n ( v c 1 ′ ) ⋅ Q ( v c 1 ′ ) = [ 0 , 1 , 1 , 0 , 0 , 0 , 0 , 0 ] ⋅ [ 8 , 15 , 10 , 7 , 4 , 0 , 9 , 9 ] = [ 0 × 8 + 1 × 15 + 1 × 10 + 0 × 7 + 0 × 4 + 0 × 0 + 0 × 9 + 0 × 9 ] = 15 + 10 = 25 bin(v_{c1}') \\cdot Q(v_{c1}') = [0, 1, 1, 0, 0, 0, 0, 0] \\cdot [8, 15, 10, 7, 4, 0, 9, 9] \\newline = [0 \\times 8 + 1 \\times 15 + 1 \\times 10 + 0 \\times 7 + 0 \\times 4 + 0 \\times 0 + 0 \\times 9 + 0 \\times 9] \\newline = 15 + 10 = 25 bin ( v c 1 ′ ​ ) ⋅ Q ( v c 1 ′ ​ ) = [ 0 , 1 , 1 , 0 , 0 , 0 , 0 , 0 ] ⋅ [ 8 , 15 , 10 , 7 , 4 , 0 , 9 , 9 ] = [ 0 × 8 + 1 × 15 + 1 × 10 + 0 × 7 + 0 × 4 + 0 × 0 + 0 × 9 + 0 × 9 ] = 15 + 10 = 25 We can see that its simply the summation of the query components where the stored vector bits are 1. And since all numbers are just bits, when expressed using a binary expansion, we can move things around a bit to take advantage of bitwise operations. The bits that will be flipped after the & \\& & will be the individual bits of the numbers that contribute to the dot product. In this case 15 and 10. 
Remember our originally stored vector s t o r e d V e c B i t s = b i n ( v c 1 ′ ) = [ 0 , 1 , 1 , 0 , 0 , 0 , 0 , 0 ] We rotate the combine the bits resulting in s t o r e d V e c t B i t s = 0 b 11000000 The query vector, int4 quantized Q ( v c 1 ′ ) = [ 8 , 15 , 10 , 7 , 4 , 0 , 9 , 9 ] The binary values of each dimension b i t s ( Q ( v c 1 ′ ) ) = [ 0 b 1 0 0 0 , 0 b 1 1 1 1 , 0 b 1 0 1 0 , 0 b 0 1 1 1 , 0 b 0 1 0 0 , 0 b 0 0 0 0 , 0 b 1 0 0 1 , 0 b 1 0 0 1 ] We shift the bits and align as shown in above visualization q V e c B i t s = a l i g n ( b i t s ( Q ( v c 1 ′ ) ) ) = [ 0 b 11001010 , 0 b 00001110 , 0 b 00011010 , 0 b 11000111 ] q V e c B i t s & s t o r e d V e c t B i t s = { q V e c B i t s & b i t s : b i t s ∈ s t o r e d V e c t B i t s } = [ 0 b 00000010 , 0 b 00000110 , 0000000010 , 0 b 0000110 ] \\text{Remember our originally stored vector} storedVecBits = bin(v_{c1}') = [0, 1, 1, 0, 0, 0, 0, 0] \\newline \\text{We rotate the combine the bits resulting in} \\newline storedVectBits = 0b11000000 \\newline \\text{The query vector, int4 quantized} \\newline Q(v_{c1}') = [8, 15, 10, 7, 4, 0, 9, 9] \\newline \\text{The binary values of each dimension} \\newline bits(Q(v_{c1}')) = [0b\\textcolor{lime}{1}\\textcolor{red}{0}\\textcolor{cyan}{0}\\textcolor{orange}{0}, 0b\\textcolor{lime}{1}\\textcolor{red}{1}\\textcolor{cyan}{1}\\textcolor{orange}{1}, 0b\\textcolor{lime}{1}\\textcolor{red}{0}\\textcolor{cyan}{1}\\textcolor{orange}{0}, 0b\\textcolor{lime}{0}\\textcolor{red}{1}\\textcolor{cyan}{1}\\textcolor{orange}{1}, 0b\\textcolor{lime}{0}\\textcolor{red}{1}\\textcolor{cyan}{0}\\textcolor{orange}{0}, 0b\\textcolor{lime}{0}\\textcolor{red}{0}\\textcolor{cyan}{0}\\textcolor{orange}{0}, 0b\\textcolor{lime}{1}\\textcolor{red}{0}\\textcolor{cyan}{0}\\textcolor{orange}{1}, 0b\\textcolor{lime}{1}\\textcolor{red}{0}\\textcolor{cyan}{0}\\textcolor{orange}{1}] \\newline \\text{We shift the bits and align as shown in above visualization} \\newline qVecBits = align(bits(Q(v_{c1}'))) = [0b\\textcolor{orange}{11001010}, 0b\\textcolor{cyan}{00001110}, 0b\\textcolor{red}{00011010}, 0b\\textcolor{lime}{11000111}] \\newline qVecBits \\, \\& \\, storedVectBits = \\{qVecBits \\, \\& \\, bits : bits \\in storedVectBits\\} \\newline = [0b00000010, 0b00000110, 0000000010, 0b0000110] Remember our originally stored vector s t ore d V ec B i t s = bin ( v c 1 ′ ​ ) = [ 0 , 1 , 1 , 0 , 0 , 0 , 0 , 0 ] We rotate the combine the bits resulting in s t ore d V ec tB i t s = 0 b 11000000 The query vector, int4 quantized Q ( v c 1 ′ ​ ) = [ 8 , 15 , 10 , 7 , 4 , 0 , 9 , 9 ] The binary values of each dimension bi t s ( Q ( v c 1 ′ ​ )) = [ 0 b 1 0 0 0 , 0 b 1 1 1 1 , 0 b 1 0 1 0 , 0 b 0 1 1 1 , 0 b 0 1 0 0 , 0 b 0 0 0 0 , 0 b 1 0 0 1 , 0 b 1 0 0 1 ] We shift the bits and align as shown in above visualization q V ec B i t s = a l i g n ( bi t s ( Q ( v c 1 ′ ​ ))) = [ 0 b 11001010 , 0 b 00001110 , 0 b 00011010 , 0 b 11000111 ] q V ec B i t s & s t ore d V ec tB i t s = { q V ec B i t s & bi t s : bi t s ∈ s t ore d V ec tB i t s } = [ 0 b 00000010 , 0 b 00000110 , 0000000010 , 0 b 0000110 ] Now we can count the bits, shift and sum back together. We can see that all the bits that are left over are the positional bits for 15 and 10. 
= ( b i t C o u n t ( 0 b 00000010 ) < < 0 ) + ( b i t C o u n t ( 0 b 00000110 ) < < 1 ) + ( b i t C o u n t ( 0 b 00000010 ) < < 2 ) + ( b i t C o u n t ( 0 b 0000110 ) < < 3 ) = ( 1 < < 0 ) + ( 2 < < 1 ) + ( 1 < < 2 ) + ( 2 < < 3 ) = 25 = (bitCount(0b00000010) << 0) + (bitCount(0b00000110) << 1) + (bitCount(0b00000010) << 2) + (bitCount(0b0000110) << 3) \\newline = (1 << 0) + (2 << 1) + (1 << 2) + (2 << 3) \\newline = 25 = ( bi tC o u n t ( 0 b 00000010 ) << 0 ) + ( bi tC o u n t ( 0 b 00000110 ) << 1 ) + ( bi tC o u n t ( 0 b 00000010 ) << 2 ) + ( bi tC o u n t ( 0 b 0000110 ) << 3 ) = ( 1 << 0 ) + ( 2 << 1 ) + ( 1 << 2 ) + ( 2 << 3 ) = 25 Same answer as summing the dimensions directly. Here is the example but simplifed java code: Testing with BBQ: Alright, show me the numbers: We have done extensive testing with BBQ both in Lucene and Elasticsearch directly. Here are some of the results: Lucene benchmarking The benchmarking here is done over three datasets: E5-small , CohereV3 , and CohereV2 . Here, each element indicates recall@100 with oversampling by [1, 1.5, 2, 3, 4, 5]. E5-small This is 500k vectors for E5-small built from the quora dataset. quantization Index Time Force Merge time Mem Required bbq 161.84 42.37 57.6MB 4 bit 215.16 59.98 123.2MB 7 bit 267.13 89.99 219.6MB raw 249.26 77.81 793.5MB It's sort of mind blowing that we get recall of 74% with only a single bit precision. Since the number of dimensions are fewer, the BBQ distance calculation isn't that much faster than our optimized int4. CohereV3 This is 1M 1024 dimensional vectors, using the CohereV3 model. quantization Index Time Force Merge time Mem Required bbq 338.97 342.61 208MB 4 bit 398.71 480.78 578MB 7 bit 437.63 744.12 1094MB raw 408.75 798.11 4162MB Here, 1bit quantization and HNSW gets above 90% recall with only 3x oversampling. CohereV2 This is 1M 768 dimensional vectors, using the CohereV2 model and max inner product similarity. quantization Index Time Force Merge time Mem Required bbq 395.18 411.67 175.9MB 4 bit 463.43 573.63 439.7MB 7 bit 500.59 820.53 833.9MB raw 493.44 792.04 3132.8MB It's really interesting to see how much BBQ and int4 are in lock-step with this benchmark. Its neat that BBQ can get such high recall with inner-product similarity with only 3x oversampling. Larger scale Elasticsearch benchmarking As referenced in our larger scale vector search blog we have a rally track for larger scale vector search benchmarking. This data set has 138M floating point vectors of 1024 dimensions. Without any quantization, this would require around 535 GB of memory with HNSW. With better-binary-quantization, the estimate drops to around 19GB. For this test, we used a single 64GB node in Elastic cloud with the following track parameters: Important note, if you want to replicate, it will take significant time to download all the data and requires over 4TB of disk space. The reason for all the additional disk space is that this dataset also contains text fields, and you need diskspace for both the compressed files and their inflated size. The parameters are as follows: k is the number of neighbors to search for num_candidates is the number of candidates used to explore per shard in HNSW rerank is the number of candidates to rerank, so we will gather that many values per shard, collect the total rerank size and then rescore the top k values with the raw float32 vectors. For indexing time, it took around 12 hours. 
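(Circling back to the bit-wise dot product walked through above, and since the simplified Java snippet referenced there is not reproduced here: the following Python sketch performs the same asymmetric computation on the toy vectors from the worked example. It is an illustration of the math rather than Lucene's actual code.)

```python
# 1-bit doc vector and int4 query vector from the worked example.
stored_bits = [0, 1, 1, 0, 0, 0, 0, 0]
query_int4 = [8, 15, 10, 7, 4, 0, 9, 9]

# Pack the doc bits into a single byte: bit i corresponds to dimension i -> 0b00000110.
stored_byte = sum(bit << i for i, bit in enumerate(stored_bits))

# Pack each bit-plane of the int4 query into its own byte
# (all first bits together, then all second bits, and so on).
query_planes = [
    sum(((q >> plane) & 1) << i for i, q in enumerate(query_int4))
    for plane in range(4)
]

# AND against the doc bits, popcount, shift by the plane index, and sum.
dot = sum(
    bin(plane_byte & stored_byte).count("1") << plane
    for plane, plane_byte in enumerate(query_planes)
)
print(dot)  # 25, the same answer as summing the query components where the doc bit is 1
```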
And instead of showing all the results, here are three interesting ones: k-num_candidates-rerank Avg Nodes Visited % Of Best NDGC Recall Single Query Latency Multi-Client QPS knn-recall-10-100-50 36,079.801 90% 70% 18ms 451.596 knn-recall-10-20 15,915.211 78% 45% 9ms 1,134.649 knn-recall-10-1000-200 115,598.117 97% 90% 42.534ms 167.806 This shows the importance of balancing recall, oversampling, reranking and latency. Obviously, each needs to be tuned for your specific use case, but considering this was impossible before and now we have 138M vectors in a single node, it's pretty cool. Conclusion Thank you for taking a bit of time and reading about Better Binary Quantization. Originally being from Alabama and now living in South Carolina, BBQ already held a special place in my life. Now, I have more reason to love BBQ! We will release this as tech-preview in 8.16, or in serverless right now. To use this, just set your dense_vector.index_type as bbq_hnsw or bbq_flat in Elasticsearch. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What does the \"better\" in Better Binary Quantization mean? Indexing with Better Binary Quantization (BBQ) A quick side quest to talk about how we handle merging Asymmetric quantization, the interesting bits Testing with BBQ: Alright, show me the numbers: Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Better Binary Quantization (BBQ) in Lucene and Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/better-binary-quantization-lucene-elasticsearch", + "meta_description": "Learn about Elastic's BBQ (Better Binary Quantization) and how it works in Elasticsearch and Lucene." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Adding passage vector search to Lucene Here's how to add passage vectors to Lucene, the benefits of doing so and how existing Lucene structures can be used to create an efficient retrieval experience. Vector Database Lucene BT By: Benjamin Trent On August 24, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Vector search is a powerful tool in the information retrieval tool box. Using vectors alongside lexical search like BM25 is quickly becoming commonplace. But there are still a few pain points within vector search that need to be addressed. A major one is text embedding models and handling larger text input. Where lexical search like BM25 is already designed for long documents, text embedding models are not. All embedding models have limitations on the number of tokens they can embed. So, for longer text input it must be chunked into passages shorter than the model’s limit. Now instead of having one document with all its metadata, you have multiple passages and embeddings. And if you want to preserve your metadata, it must be added to every new document. Figure 1: Now instead of having a single piece of metadata indicating the first chapter of Little Women, you have to index that information data for every sentence. A way to address this is with Lucene's “join” functionality. This is an integral part of Elasticsearch’s nested field type. It makes it possible to have a top-level document with multiple nested documents, allowing you to search over nested documents and join back against their parent documents. This sounds perfect for multiple passages and vectors belonging to a single top-level document! This is all awesome! But, wait, Elasticsearch doesn’t support vectors in nested fields. Why not, and what needs to change? The (kNN) problem with parents and children The key issue is how Lucene can join back to the parent documents when searching child vector passages. Like with kNN pre-filtering versus post-filtering , when the joining occurs determines the result quality and quantity. If a user searches for the top four nearest parent documents (not passages) to a query vector , they usually expect four documents. But what if they are searching over child vector passages and all four of the nearest vectors are from the same parent document? This would end up returning just one parent document, which would be surprising. This same kind of issue occurs with post-filtering. Figure 2: Documents 3, 5, 10 are parent docs. 1, 2 belong to 3; 4 to 5; 6, 7, 8, 9 to 10. Let us search with query vector A, and the four nearest passage vectors are 6, 7, 8, 9. With “post-joining,” you only end up retrieving parent document 10. Figure 3: Vector “A” matching nearest all the children of 10. What can we do about this problem? One answer could be, “Just increase the number of vectors returned!” However, at scale, this is untenable. 
What if every parent has at least 100 children and you want the top 1,000 nearest neighbors? That means you have to search for at least 100,000 children! This gets out of hand quickly. So, what’s another solution? Pre-joining to the rescue The solution to the “post-joining” problem is “pre-joining.” Recently added changes to Lucene enable joining against the parent document while searching the HNSW graph! Like with kNN pre-filtering , this ensures that when asked to find the k nearest neighbors of a query vector, we can return not the k nearest passages as represented by dense vectors, but k nearest documents , as represented by their child passages that are most similar to the query vector. What does this actually look like in practice? Let’s assume we are searching the same nested documents as before: Figure 4: Documents 3, 5, 10 are parent docs. 1,2 belong to 3; 4 to 5; 6, 7, 8, 9 to 10. As we search and score documents, instead of tracking children, we track the parent documents and update their scores. Figure 5 shows a simple flow. For each child document visited, we get its score and then track it by its parent document ID. This way, as we search and score the vectors we only gather the parent IDs. This ensures diversification of results with no added complexity to the HNSW algorithm using already existing and powerful tools within Lucene. All this with only a single additional bit of memory required per vector stored. Figure 5: As we search the vectors, we score and collect the associated parent document. Only updating the score if it is more competitive than the previous. But, how is this efficient? Glad you asked! There are certain restrictions that provide some really nice short cuts. As you can tell from the previous examples, all parent document IDs are larger than child IDs. Additionally, parent documents do not contain vectors themselves, meaning children and parents are purely disjoint sets . This affords some nice optimizations via bit sets . A bit set provides an exceptionally fast structure for “tell me the next bit that is set.” For any child document, we can ask the bit set, “Hey, what's the number that is larger than me that is in the set?” Since the sets are disjoint, we know the next bit that is set is the parent document ID. Conclusion In this post, we explored both the challenges of supporting dense document retrieval at scale and our proposed solution using nested fields and joins in Lucene. This work in Lucene paves the way to more naturally storing and searching dense vectors of passages from long text in documents and an overall improvement in document modeling for vector search in Elasticsearch . This is a very exciting step forward for vector search in Elasticsearch! If you want to chat about this or anything else related to vector search in Elasticsearch, come join us in our Discuss forum . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. In this blog post, we may have used or referred to third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. 
Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch, ESRE, Elasticsearch Relevance Engine and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to The (kNN) problem with parents and children Pre-joining to the rescue Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Adding passage vector search to Lucene - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/adding-passage-vector-search-to-lucene", + "meta_description": "Here's how to add passage vectors to Lucene, the benefits of doing so and how existing Lucene structures can be used to create an efficient retrieval experience." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Apache Lucene 9.9, the fastest Lucene release ever Lucene 9.9 brings major speedups to query evaluation. Here are the performance improvements observed in nightly benchmarks & optimization resources. 
Lucene AG By: Adrien Grand On December 7, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Apache Lucene development has always been vibrant, but the last few months have seen an especially high number of optimizations to query evaluation. There isn't one optimization that can be singled out, it's rather a combination of many improvements around mechanical sympathy and improved algorithms. What is especially interesting here is that these optimizations do not only benefit some very specific cases, they translate into actual speedups in Lucene's nightly benchmarks , which aim at tracking the performance of queries that are representative of the real world. Just hover on annotations to see where a speedup (or slowdown sometimes!) is coming from. By the way, special thanks to Mike McCandless for maintaining Lucene's nightly benchmarks on his own time and hardware for almost 13 years now! Key speedup benchmarks in Lucene Here are some speedups that nightly benchmarks observed between Lucene 9.6 (May 2023) and Lucene 9.9 (December 2023): AndHighHigh : 35% faster AndHighMed : 15% faster OrHighHigh : 60% faster OrHighMed : 38% faster CountAndHighHigh : 15% faster CountAndHighMed : 11% faster CountOrHighHigh : 145% faster CountOrHighMed : 155% faster TermDTSort : 24% faster TermTitleSort : 290% faster (not a typo!) TermMonthSort : 7% faster DayOfYearSort : 25% faster VectorSearch : 5% faster Lucene optimization resources In case you are curious about these changes, here are resources that describe some of the optimizations that we applied: Bringing speedups to top-k queries with many and/or high-frequency terms (annotation FK) More skipping with block-max MAXSCORE (annotation FU) Accelerating vector search with SIMD instructions Vector similarity computations FMA-style Lucene 9.9 was just released and is expected to be integrated into Elasticsearch 8.12, which should get released soon. Stay tuned! Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Key speedup benchmarks in Lucene Lucene optimization resources Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Apache Lucene 9.9, the fastest Lucene release ever - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/apache-lucene-9.9-search-speedups", + "meta_description": "Lucene 9.9 brings major speedups to query evaluation. Here are the performance improvements observed in nightly benchmarks & optimization resources." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series The Spotify Wrapped series Here's how to create your own Spotify Wrapped in Kibana and dive deep into your data. Part1 How To January 14, 2025 How to make your own Spotify Wrapped in Kibana Based on the downloadable Spotify personal history, we'll make a custom version of \"Spotify Wrapped\" with the top artists, songs, and trends over the year IF By: Iulia Feroli Part2 How To February 25, 2025 Spotify Wrapped part 2: Diving deeper into the data We will dive deeper into your Spotify data than ever before and explore connections you didn't even know existed. PK By: Philipp Kahr Part3 How To March 24, 2025 Anomaly detection population jobs: Spotify Wrapped, part 3 Anomaly detection can be a daunting task at first, but in this blog, we'll dive into it and figure out how the different jobs can help us find unusual patterns in our Spotify Wrapped data. PK By: Philipp Kahr Part4 How To April 1, 2025 Detecting relationships in data: Spotify Wrapped, part 4 Graphs are a powerful tool for detecting relationships in data. In this blog, we'll explore the relationships between artists and your music taste. PK By: Philipp Kahr Part5 How To April 10, 2025 Finding your best music friend with vectors: Spotify Wrapped, part 5 Understanding vectors has never been easier. Handcrafting vectors and figuring out various techniques to find your music friend in a heavily biased dataset. PK VB By: Philipp Kahr and Vincent Bosc Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "The Spotify Wrapped series - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/spotify-wrapped-kibana", + "meta_description": "Here's how to create your own Spotify Wrapped in Kibana and dive deep into your data." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Elasticsearch geospatial search This series covers how to use the new geospatial search features in ES|QL, including how to ingest geospatial data and how to use it in ES|QL queries. Part1 Python How To August 12, 2024 Elasticsearch geospatial search with ES|QL Geospatial search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. CT By: Craig Taverner Part2 How To October 25, 2024 Ingest geospatial data into Elasticsearch with Kibana for use in ES|QL How to use Kibana and the csv ingest processor to ingest geospatial data into Elasticsearch for use with search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. But to use these features, we need Geospatial data. CT By: Craig Taverner Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch geospatial search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/elasticsearch-geospatial-search", + "meta_description": "This series covers how to use the new geospatial search features in ES|QL, including how to ingest geospatial data and how to use it in ES|QL queries." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to choose between exact and approximate kNN search in Elasticsearch Learn more about exact and approximate kNN search in Elasticsearch, and when to use each one. Vector Database How To CD By: Carlos Delgado On May 20, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. What is kNN? Semantic search is a powerful tool for relevance ranking. It allows you to go beyond using just keywords, but consider the actual meaning of your documents and queries. Semantic search is based on vector search . In vector search, the documents we want to search have vector embeddings calculated for them. These embeddings are calculated using machine learning models, and returned as vectors that are stored alongside our document data. When a query is performed, the same machine learning model is used to calculate the embeddings for the query text. 
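To make that flow concrete, here is a rough sketch (not code from this article; the model name, the product texts and the query are placeholder assumptions) of embedding a few documents and a query with the same model and ranking the documents by cosine similarity:

```python
# Minimal sketch: embed documents and a query with the *same* model,
# then rank the documents by cosine similarity. All names and texts below
# are placeholders, not taken from the article.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

docs = [
    "Comfortable rocking chair for your living room",
    "Weather-resistant lounge set for a large terrace",
    "Plush toy house with tiny furniture",
]
query = "a comfortable chair for the balcony"

doc_vectors = model.encode(docs)    # one dense vector per document
query_vector = model.encode(query)  # the same model embeds the query

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# kNN here is simply: score every document against the query and keep the top k
ranked = sorted(
    ((cosine(query_vector, vec), text) for vec, text in zip(doc_vectors, docs)),
    reverse=True,
)
for score, text in ranked[:2]:  # k = 2
    print(f"{score:.3f}  {text}")
```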
Semantic search consists of finding the closest results to the query, by comparing the query embeddings to the document embeddings. kNN, or k nearest neighbors , is a technique for obtaining the top k closest results to a specific embedding. There are two main approaches for calculating kNN for a query using embeddings: Exact and approximate. This post will help you: Understand what exact and approximate kNN search is How to prepare your index for these approaches How to decide which approach is best for your use case Exact kNN: Search everything One approach for calculating the closer results would be comparing all the existing document embeddings with the one for the query. This would ensure that we're getting the closest matches possible, as we will be comparing all of them. Our search results would be as accurate as possible, as we're considering our whole document corpus and comparing all our document embeddings with the query embeddings. Of course, comparing against all the documents has a downside: it takes time. We will be calculating the embeddings similarity one by one, using a similarity function, over all the documents. This also means that we will be scaling linearly - having twice the number of documents will potentially take twice as long. Exact search can be done on vector fields using script_score with a vector function for calculating similarity between vectors. Approximate kNN: A good estimate A different approach is to use an approximation instead of considering all the documents. For providing an efficient approximation to kNN, Elasticsearch and Lucene use Hierarchical Navigation Small Worlds HNSW . HNSW is a graph data structure that maintains links between elements that are close together, at different layers. Each layer contains elements that are connected, and are also connected to elements of the layer below it. Each layer contains more elements, with the bottom layer containing all the elements. Figure 1 - An example of a HNSW graph. The top layer contains the initial nodes to start the search. These initial nodes serve as entry points to the lower layers, each containing more nodes. The lower layer contains all nodes. Think of it as driving; there are highways, roads and streets. When driving on a highway you will see some exit signs that describe some high-level areas (like a town or a neighborhood). Then you get to a road that has directions for specific streets. Once you get to a street, you can reach a specific address and the ones that are in the same neighborhood. HNSW is similar to that, as it creates different levels of vector embeddings. It calculates the highway that is closer to the initial query, and chooses the exits that look more promising to keep looking for the closer addresses to the one we're looking for. This is great in terms of performance, as it doesn't have to consider all documents, but uses this multi-level approach to quickly find an approximation of the closer ones. But, it's an approximation. Not all nodes are interconnected, meaning that it's possible to overlook results that are closer to specific nodes as they might not be connected. The interconnection of the nodes depends on how the HNSW structure is created. How good HNSW is depends on several factors: How it is constructed. The HNSW construction process will consider a number of candidates to track as the closer ones to a specific node. 
Increasing the number of candidates to consider will produce a more precise structure, at the cost of spending more time creating it at indexing time. The ef_construction parameter in the dense vector index_options is used for this. How many candidates we're considering when searching. When looking for the closer results, the process will keep track of a number of candidates. The bigger this number is, the more precise it will be, and the slower our search will be. The num_candidates in kNN parameters controls this behavior. How many segments we're searching. Each segment has a HNSW graph that needs to be searched for, and its results combined with the other segment graphs. Having fewer segments will mean searching fewer graphs (so it'll be faster), but will have a less diverse set of results (so it will be less precise). Overall, HNSW offers a good tradeoff between performance and recall, and allows fine tuning both at indexing and query side. Searching using HNSW can be done using the kNN search section in most of the situations. Using a kNN query is also possible for more advanced use cases, like: Combining kNN with other queries (as part of bool queries, or pinned queries) Using function_score to fine tune the scoring Improve aggregations and field collapse diversity You can check about kNN query and the differences to the kNN search section in this post . We'll dive into when you'll want to use this method versus the others below. Indexing for exact and approximate search dense_vector field type There are two main indexing types for dense_vector fields you can choose from for storing your embeddings: flat types (including flat and int8_flat ) store the raw vectors, without adding HNSW data structures. dense_vectors that use flat indexing type will always use exact kNN - the kNN query will actually perform an exact query instead of an approximate one. HNSW types (including hnsw and int8_hnsw ) create the HNSW data structure, allowing approximate kNN search to be used. Does it mean that you can't use exact kNN with HNSW field types? Not really! You can use both exact kNN via a script_score query , or use approximate kNN via the kNN section and the kNN query . This allows more flexibility depending on your search use case. Using HNSW field types means that the HNSW graph structure needs to be built, and that takes time, memory and disk space. If you'll just be using exact search, you can use the flat vector field type. This ensures that your embeddings are indexed optimally and use less space. Remember to always avoid storing your embeddings in _source in any case, to reduce your storage needs. Quantization Using quantization , either flat (int8_flat) or HNSW (int8_hnsw) types of indexing will help you reduce your embeddings size, so you will be able to use less memory and disk storage for holding your embeddings information. As search performance relies on your embeddings fitting as much as possible in memory, you should always look for ways of reducing your data if possible. Using quantization is a trade off between memory and recall. How do I choose between exact and approximate kNN search? There's no one-size-fits-all answer here. You need to consider a number of factors, and experiment, in order to get to the optimal balance between performance and accuracy: Data size Searching everything is not something you should avoid at all costs. Depending on your data size (in terms of number of documents and embedding dimensions), it might make sense to do an exact kNN search. 
As a rule of thumb, having less than 10 thousand documents to search could be an indication that exact search should be used. Keep in mind that the number of documents to search can be filtered in advance, so the effective number of documents to search can be restricted by the filters applied. Approximate search scales better in terms of the number of documents, so it should be the way to go if you're having a big number of documents to search, or expect it to increase significantly. Figure 2 - an example run of exact and approximate kNN using 768 dimensions vectors from the so_vector rally track . The example demonstrates the linear runtime of exact kNN vs the logarithmic runtime of HNSW searches. Filtering Filtering is important, as it reduces the number of documents to consider for searching. This needs to be taken into account when deciding on using exact vs approximate. You can use query filters for reducing the number of documents to consider, both for exact and approximate search. However, approximate search takes a different approach for filtering. When doing approximate searches using HNSW, query filters will be applied after the top k results have been retrieved. That's why using a query filter along with a kNN query is referred to as a post-filter for kNN . Figure 3 - Post-filtering in kNN search. The problem with using post-filters in kNN is that the filter is being applied after we have gathered the top k results. This means that we can end up with less than k results, as we need to remove the elements that don't pass the filter from the top k results that we've already retrieved from the HNSW graph. Fortunately, there is another approach to use with kNN, and is specifying a filter in the kNN query itself. This filter applies to the graph elements as the results are being gathered traversing the HNSW graph, instead of applying it afterwards. This ensures that the top k elements are returned, as the graph will be traversed - skipping the elements that don't pass the filter - until we get the top k elements. Figure 4 - Pre-filtering in kNN search. This specific kNN query filter is called a kNN pre-filter to specify that it is applied before retrieving results, as opposed to applying it afterwards. That's why, in the context of using a kNN query, regular query filters are referred to as post-filters. Using a kNN pre-filter affects approximate search performance, as we need to consider more elements while searching in the HNSW graph - we will be discarding the elements that don't pass the filter, so we need to look for more elements on each search to retrieve the same number of results. Future enhancements for kNN There are some improvements that are coming soon that will help with exact and approximate kNN. Elasticsearch will add the possibility to upgrade a dense_vector type from flat to HNSW. This means that you'll be able to start using a flat vector type for exact kNN, and eventually start using HNSW when you need the scale. Your segments will be searched transparently for you when using approximate kNN, and will be transformed automatically to HNSW when they are merged together. A new exact kNN query will be added so a simple query will be used to do exact kNN for both flat and HNSW fields, instead of relying on a script score query. This will make exact kNN more straightforward. Conclusion So, should you use approximate or exact kNN on your documents? Check the following: How many documents? 
Less than 10 thousand (after applying filters) would probably be a good case for exact. Are your searches using filters? This impacts how many documents will be searched. If you need to use approximate kNN, remember to use kNN pre-filters to get more results at the expense of performance. You can compare the performance of both approaches by indexing using a HNSW dense_vector, and comparing kNN search against a script_score for doing exact kNN. This allows comparing both methods using the same field type (just remember to change your dense_vector field type to flat in case you decide to go for exact search!) Happy searching! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to What is kNN? Exact kNN: Search everything Approximate kNN: A good estimate Indexing for exact and approximate search dense_vector field type Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to choose between exact and approximate kNN search in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/knn-exact-vs-approximate-search", + "meta_description": "Learn more about exact and approximate kNN search in Elasticsearch, and when to use each one." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog From ES|QL to PHP objects Learn how to execute and manage ES|QL queries in PHP. Follow this guide to map ES|QL results to a PHP object or custom class. ES|QL PHP How To EZ By: Enrico Zimuel On April 8, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! 
Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Starting from elasticsearch-php v8.13.0 you can execute ES|QL queries and map the result to a PHP object of stdClass or a custom class. ES|QL ES|QL is a new Elasticsearch Query Language introduced in Elasticsearch 8.11.0. Right now, it is available in technical preview. It provides a powerful way to filter, transform, and analyze data stored in Elasticsearch. It makes use of \"pipes\" ( | ) to manipulate and transform data in a step-by-step fashion. This approach allows users to compose a series of operations, where the output of one operation becomes the input for the next, enabling complex data transformations and analysis. For instance, the following query returns the first 3 documents (rows) of the sample_data index: Use case: ES|QL features in the official PHP client To illustrate the ES|QL features developed in the official PHP client, we stored in Elasticsearch a CSV file of 81,828 books (54.4 MB) including the following information: We extracted this list from the public available Amazon Books Reviews dataset . We created a books index with the following Elasticsearch mappings: The rating value is the average of the ranking reviews taken from the Books_rating.csv file of 2.9 GB. Here you can find the PHP script that we used to bulk import all the books in Elasticsearch. The bulk operation took 7 sec and 28 MB RAM using PHP 8.2.17. With the proposed mapping the index size in Elasticsearch is about 62 MB. Map ES|QL results to a PHP object or custom class We can execute ES|QL query in PHP using the esql()->query() endpoint. The result of this query is a a table data structure. This is expressed in JSON using the columns and values fields. In the columns field we have the name and type definition. Here is an example of ES|QL query to retrieve the top-10 books written by Stephen King ordered by the user ranking reviews: The JSON result from Elasticsearch looks as follows: In this example we have 6 properties (author, description, publisher, rating, title, year) related to a book and 10 results, all books by Stephen King. A list of all the supported types in ES|QL is reported here . The $result response object can be accessed as an array, a string or as an object (see here for more information). Using the object interface, we can access the values using properties and indexes. For instance, $result->values[0][4] returns the title (4) of the first book (0) in the list, $result->values[1][3] returns the rank score (3) of the second book (1), etc. Remember, the index of an array in PHP starts from zero. This interface can be good enough for some use cases but most of the time we would like to have an array of objects as result. To map the result into an array of objects we can use the new mapTo() feature of elasticsearch-php. This function is available directly in the Elasticsearch response object . That means you can access it as follows: If you have a custom Book class, you can map the result using it, as follows: If your class has other properties in addition to the ones included in the ES|QL result, this will work as well. The mapTo() function will use only the properties returned as columns of the ES|QL result. You can download all the examples reported in this article here . 
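For readers working in Python rather than PHP, roughly the same columns/values-to-objects mapping can be done by hand. The sketch below is an analogue built on assumptions, not part of the article: the connection details are placeholders, the ES|QL query mirrors the Stephen King example above, and the Book dataclass simply lists the six fields returned there.

```python
# Rough Python analogue of the mapTo() idea described above: run an ES|QL
# query and turn the columns/values table into a list of objects.
# Connection details and the Book dataclass are illustrative assumptions.
from dataclasses import dataclass
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection details

@dataclass
class Book:
    author: str
    description: str
    publisher: str
    rating: float
    title: str
    year: int

resp = es.perform_request(
    "POST",
    "/_query",
    headers={"accept": "application/json", "content-type": "application/json"},
    body={
        "query": 'FROM books | WHERE author == "Stephen King" | SORT rating DESC | LIMIT 10'
    },
)

# ES|QL returns a table: a list of column definitions and a list of value rows.
columns = [col["name"] for col in resp["columns"]]
books = [Book(**dict(zip(columns, row))) for row in resp["values"]]

for book in books:
    print(book.rating, book.title)
```

Because each row in values follows the order of columns, zipping the two is enough to rebuild named records from the tabular response.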
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL Use case: ES|QL features in the official PHP client Map ES|QL results to a PHP object or custom class Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "From ES|QL to PHP objects - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-php-map-object-class", + "meta_description": "Learn how to execute and manage ES|QL queries in PHP. Follow this guide to map ES|QL results to a PHP object or custom class." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Perform vector search in Elasticsearch with the Elasticsearch Go client Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example. Vector Database How To CR LS By: Carly Richmond and Laurent Saint-Félix On November 1, 2023 Part of Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Building software in any programming language, including Go, is committing to a lifetime of learning. Through her university and working career, Carly has dabbled in many programming languages and technologies, including the latest and greatest implementations of vector search. But that wasn't enough! So recently Carly started playing with Go, too. 
Just like animals, programming languages, and your friendly author, search has undergone an evolution of different practices that can be difficult to decide between for your own search use case. In this blog, we'll share an overview of vector search along with examples of each approach using Elasticsearch and the Elasticsearch Go client . These examples will show you how to find gophers and determine what they eat using vector search in Elasticsearch and Go. Prerequisites To follow with this example, ensure the following prerequisites are met: Installation of Go version 1.21 or later Creation of your own Go repo with the Creation of your own Elasticsearch cluster, populated with a set of rodent-based pages, including for our friendly Gopher , from Wikipedia: Connecting to Elasticsearch In our examples, we shall make use of the Typed API offered by the Go client. Establishing a secure connection for any query requires configuring the client using either: Cloud ID and API key if making use of Elastic Cloud. Cluster URL, username, password and the certificate. Connecting to our cluster located on Elastic Cloud would look like this: The client connection can then be used for vector search, as shown in subsequent sections. Vector search Vector search attempts to solve this problem by converting the search problem into a mathematical comparison using vectors. The document embedding process has an additional stage of converting the document using a model into a dense vector representation, or simply a stream of numbers. The advantage of this approach is that you can search non-text documents such as images and audio by translating them into a vector alongside a query. In simple terms, vector search is a set of vector distance calculations. In the below illustration, the vector representation of our query Go Gopher is compared against the documents in the vector space, and the closest results (denoted by constant k ) are returned: Depending on the approach used to generate the embeddings for your documents, there are two different ways to find out what gophers eat. Approach 1: Bring your own model With a Platinum license, it's possible to generate the embeddings within Elasticsearch by uploading the model and using the inference API. There are six steps involved in setting up the model: Select a PyTorch model to upload from a model repository. For this example, we're using the sentence-transformers/msmarco-MiniLM-L-12-v3 from Hugging Face to generate the embeddings. Load the model into Elastic using the Eland Machine Learning client for Python using the credentials for our Elasticsearch cluster and task type text_embeddings . If you don't have Eland installed, you can run the import step using Docker , as shown below: Once uploaded, quickly test the model sentence-transformers__msmarco-minilm-l-12-v3 with a sample document to ensure the embeddings are generated as expected: Create an ingest pipeline containing an inference processor. 
This will allow the vector representation to be generated using the uploaded model: Create a new index containing the field text_embedding.predicted_value of type dense_vector to store the vector embeddings generated for each document: Reindex the documents using the newly created ingest pipeline to generate the text embeddings as the additional field text_embedding.predicted_value on each document: Now we can use the Knn option on the same search API using the new index vector-search-rodents , as shown in the below example: Converting the JSON result object via unmarshalling is done in the exact same way as the keyword search example. Constants K and NumCandidates allow us to configure the number of neighbor documents to return and the number of candidates to consider per shard. Note that increasing the number of candidates increases the accuracy of results but leads to a longer-running query as more comparisons are performed. When the code is executed using the query What do Gophers eat? , the results returned look similar to the below, highlighting that the Gopher article contains the information requested unlike the prior keyword search: Approach 2: Hugging Face inference API Another option is to generate these same embeddings outside of Elasticsearch and ingest them as part of your document. As this option does not make use of an Elasticsearch machine learning node, it can be done on the free tier. Hugging Face exposes a free-to-use, rate-limited inference API that, with an account and API token, can be used to generate the same embeddings manually for experimentation and prototyping to help you get started. It is not recommended for production use. Invoking your own models locally to generate embeddings or using the paid API can also be done using a similar approach. In the below function GetTextEmbeddingForQuery we use the inference API against our query string to generate the vector returned from a POST request to the endpoint: The resulting vector, of type []float32 is then passed as a QueryVector instead of using the QueryVectorBuilder option to leverage the model previously uploaded to Elastic. Note that the K and NumCandidates options remain the same irrespective of the two options and that the same results are generated as we are using the same model to generate the embeddings Conclusion Here we've discussed how to perform vector search in Elasticsearch using the Elasticsearch Go client . Check out the GitHub repo for all the code in this series. Follow on to part 3 to gain an overview of combining vector search with the keyword search capabilities covered in part one in Go. Until then, happy gopher hunting! Resources Elasticsearch Guide Elasticsearch Go client What is vector search? | Elastic Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Connecting to Elasticsearch Vector search Approach 1: Bring your own model Approach 2: Hugging Face inference API Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Perform vector search in Elasticsearch with the Elasticsearch Go client - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/perform-vector-search-with-the-elasticsearch-go-client", + "meta_description": "Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing LangChain4j to simplify LLM integration into Java applications LangChain4j (LangChain for Java) is a powerful toolset to build your RAG application in plain Java. Integrations Java How To DP By: David Pilato On September 23, 2024 Part of Series Introducing LangChain4j: Building RAG apps in plain Java Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The LangChain4j framework was created in 2023 with this target : The goal of LangChain4j is to simplify integrating LLMs into Java applications. LangChain4j is providing a standard way to: create embeddings (vectors) from a given content, let say a text for example store embeddings in an embedding store search for similar vectors in the embedding store discuss with LLMs use a chat memory to remember the context of a discussion with an LLM This list is not exhaustive and the LangChain4j community is always implementing new features. This post will cover the first main parts of the framework. Adding LangChain4j OpenAI to our project Like in all Java projects, it's just a matter of dependencies. Here we will be using Maven but the same could be achieved with any other dependency manager. 
As a first step to the project we want to build here, we will be using OpenAI so we just need to add the langchain4j-open-ai artifact: For the rest of the code we will be using either our own API key, which you can get by registering for an account with OpenAI , or the one provided by LangChain4j project for demo purposes only: We can now create an instance of our ChatLanguageModel: And finally we can ask a simple question and get back the answer: The given answer might be something like: If you'd like to run this code, please check out the Step1AiChatTest.java class. Providing more context with langchain4j Let's add the langchain4j artifact: This one is providing a toolset which can help us build a more advanced LLM integration to build our assistant. Here we will just create an Assistant interface which provides the chat method which will be calling automagically the ChatLanguageModel we defined earlier: We just have to ask LangChain4j AiServices class to build an instance for us: And then call the chat(String) method: This is having the same behavior as before. So why did we change the code? In the first place, it's more elegant but more than that, you can now give some instructions to the LLM using simple annotations: This is now giving: If you'd like to run this code, please check out the Step2AssistantTest.java class. Switching to another LLM: langchain4j-ollama We can use the great Ollama project . It helps to run a LLM locally on your machine. Let's add the langchain4j-ollama artifact: As we are running the sample code using tests, let's add Testcontainers to our project: We can now start/stop Docker containers: We \"just\" have to change the model object to become an OllamaChatModel instead of the OpenAiChatModel we used previously: Note that it could take some time to pull the image with its model, but after a while, you could get the answer: Better with memory If we ask multiple questions, by default the system won't remember the previous questions and answers. So if we ask after the first question \"When was he born?\", our application will answer: Which is nonsense. Instead, we should use Chat Memory : Running the same questions now gives a meaningful answer: Conclusion In the next post , we will discover how we can ask questions to our private dataset using Elasticsearch as the embedding store. That will give us a way to transform our application search to the next level. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Adding LangChain4j OpenAI to our project Providing more context with langchain4j Switching to another LLM: langchain4j-ollama Better with memory Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing LangChain4j to simplify LLM integration into Java applications - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/langchain4j-llm-integration-introduction", + "meta_description": "LangChain4j (LangChain for Java) is a powerful toolset to build your RAG app in plain Java. Here's how to add LangChain4j to your project and more." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog From ES|QL to Pandas dataframes in Python Learn how to export ES|QL queries as Pandas dataframes in Python through practical examples. Python ES|QL How To QP By: Quentin Pradet On March 11, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Update: When we published this article in March 2024, Elasticsearch did not yet support Apache Arrow streaming format. This is possible now, see \"From ES|QL to native Pandas dataframes in Python\" for more details. The Elasticsearch Query Language (ES|QL) provides a powerful way to filter, transform, and analyze data stored in Elasticsearch. Designed to be easy to learn and use, it is a perfect fit for data scientists familiar with Pandas and other dataframe-based libraries. Indeed, ES|QL queries produce tables with named columns, which is the definition of dataframes! This blog explains how to export ES|QL queries as Pandas dataframes in Python. ES|QL to Pandas dataframes in Python Importing test data First, let's import some test data. We will be using the employees sample data and mappings . The easiest way to load this dataset is to run those two Elasticsearch API requests in the Kibana Console . Converting dataset to a Pandas DataFrame object OK, with that out of the way, let's convert the full employees dataset to a Pandas DataFrame object using the ES|QL CSV export: Even though this dataset only contains 100 records, we use a LIMIT command to avoid ES|QL warning us about potentially missing records. This prints the following dataframe: This means you can now analyze the data with Pandas. 
But you can also continue massaging the data using ES|QL, which is particuarly useful when queries return more than 10,000 rows, the current maximum number of rows that ES|QL queries can return. Analyzing the data with Pandas In the next example, we're counting how many employees are speaking a given language by using STATS ... BY (not unlike GROUP BY in SQL). And then we sort the result with the languages column using SORT : Note that we've used the dtype parameter of pd.read_csv() here, which is useful when the type inferred by Pandas is not enough. The above code prints the following: 21 employees speak 5 languages, wow! Finally, suppose that end users of your code control the minimum number of languages spoken. You could format the query directly in Python, but it would allow an attacker to perform an ES|QL injection! Instead, use the built-in parameters support of the ES|QL REST API: which prints the following: Conclusion As you can see, ES|QL and Pandas play nicely together. However, CSV is not the ideal format as it requires explicit type declarations and doesn't handle well some of the more elaborate results that ES|QL can produce, such as nested arrays and objects. For this, we are working on adding native support for Apache Arrow dataframes in ES|QL, which will make all this transparent and bring significant performance improvements. Additional resources If you want to learn more about ES|QL, the ES|QL documentation is the best place to start. You can also check out this other Python example using Boston Celtics data . To know more about the Python Elasticsearch client itself, you can refer to the documentation , ask a question on Discuss with the language-clients tag or open a new issue if you found a bug or have a feature request. Thank you! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL to Pandas dataframes in Python Importing test data Converting dataset to a Pandas DataFrame object Analyzing the data with Pandas Conclusion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "From ES|QL to Pandas dataframes in Python - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-pandas-dataframes-python", + "meta_description": "Learn how to export ES|QL queries as Pandas dataframes in Python through practical examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Lexical and semantic search with Elasticsearch In this blog, we'll explore various approaches to retrieving information using Elasticsearch, focusing on lexical and semantic search. Vector Database Python PP By: Priscilla Parodi On October 3, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Search is the process of locating the most relevant information based on your search query or combined queries and relevant search results are documents that best match these queries. Although there are several challenges and methods associated with search, the ultimate goal remains the same, to find the best possible answer to your question . Considering this goal, in this blog post, we will explore different approaches to retrieving information using Elasticsearch, with a specific focus on text search: lexical and semantic search. Prerequisites To accomplish this, we will provide Python examples that demonstrate various search scenarios on a dataset generated to simulate e-commerce product information. This dataset contains over 2,500 products, each with a description. These products are categorized into 76 distinct product categories, with each category containing a varying number of products, as shown below: Treemap visualization - top 22 values of category.keyword (product categories) For the setup you will need: Python 3.6 or later The Elastic Python client Elastic 8.8 deployment or later, with 8GB memory machine learning node The Elastic Learned Sparse EncodeR model that comes pre-loaded into Elastic installed and started on your deployment We will be using Elastic Cloud, a free trial is available . Besides the search queries provided in this blog post, a Python notebook will guide you through the following processes: Establish a connection to our Elastic deployment using the Python client Load a text embedding model into the Elasticsearch cluster Create an index with mappings for indexing feature vectors and dense vectors. Create an ingest pipeline with inference processors for text embedding and text expansion Lexical search - sparse retrieval The classic way documents are ranked for relevance by Elasticsearch based on a text query uses the Lucene implementation of the BM25 model, a sparse model for lexical search . This method follows the traditional approach for text search, looking for exact term matches. To make this search possible, Elasticsearch converts text field data into a searchable format by performing text analysis. Text analysis is performed by an analyzer , a set of rules to govern the process of extracting relevant tokens for searching. 
An analyzer must have exactly one tokenizer . The tokenizer receives a stream of characters and breaks it up into individual tokens (usually individual words), like in the example below: String tokenization for lexical search Output In this example we are using the default analyzer, the standard analyzer, which works well for most use cases as it provides English grammar based tokenization. Tokenization enables matching on individual terms, but each token is still matched literally. If you want to personalize your search experience you can choose a different built-in analyzer . Example, by updating the code to use the stop analyzer it will break the text into tokens at any non-letter character with support for removing stop words. Output When the built-in analyzers do not fulfill your needs, you can create a custom analyzer , which uses the appropriate combination of zero or more character filters , a tokenizer and zero or more token filters . In the above example that combines a tokenizer and token filters, the text will be lowercased by the lowercase filter before being processed by the synonyms token filter . Lexical matching BM25 will measure the relevance of documents to a given search query based on the frequency of terms and its importance. The code below performs a match query, searching for up to two documents considering \"description\" field values from the \"ecommerce-search\" index and the search query \" Comfortable furniture for a large balcony \" . Refining the criteria for a document to be considered a match for this query can improve the precision. However, more specific results come at the cost of a lower tolerance for variations. Output By analyzing the output, the most relevant result is the \" Barbie Dreamhouse \" product, in the \" Toys \" category, and its description is highly relevant as it includes the terms \" furniture \", \" large\" and \"balcony \", this is the only product with 3 terms in the description that match the search query, the product is also the only one with the term \"balcony\" in the description. The second most relevant product is a \" Comfortable Rocking Chair \" categorized as \" Indoor Furniture \" and its description includes the terms \" comfortable \" and \" furniture \". Only 3 products in the dataset match at least 2 terms of this search query, this product is one of them. \"Comfortable\" appears in the description of 105 products and \"furniture\" in the description of 4 products with 4 different categories: Toys , Indoor Furniture, Outdoor Furniture and 'Dog and Cat Supplies & Toys'. As you could see, the most relevant product considering the query is a toy and the second most relevant product is indoor furniture. If you want detailed information about the score computation to know why these documents are a match, you can set the explain __query parameter to true. Despite both results being the most relevant ones, considering both the number of documents and the occurrence of terms in this dataset, the intention behind the query \" Comfortable furniture for a large balcony \" is to search for furniture for an actual large balcony, excluding among others, toys and indoor furniture. Lexical search is relatively simple and fast , but it has limitations since it is not always possible to know all the possible terms and synonyms without necessarily knowing the user's intention and queries. A common phenomenon in the usage of natural language is vocabulary mismatch . 
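A minimal sketch of the BM25 match query described above, assuming the Python client and the "ecommerce-search" index from the article; the client URL and the "name"/"category" source fields are assumptions, so adjust them to your own mapping.

```python
# Minimal sketch of the lexical (BM25) match query described above, returning
# up to two products from the "ecommerce-search" index. The client URL and the
# "name"/"category" source fields are assumptions; adjust them to your mapping.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

response = es.search(
    index="ecommerce-search",
    size=2,
    query={
        "match": {
            "description": {
                "query": "Comfortable furniture for a large balcony",
            }
        }
    },
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("name"), hit["_source"].get("category"))
```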
Research shows that, on average, 80% of the time different people (experts in the same field) will name the same thing differently. These limitations motivate us to look for other scoring models that incorporate semantic knowledge. Transformer-based models, which excel at processing sequential input tokens like natural language, capture the underlying meaning of your search by considering mathematical representations of both documents and queries. This allows for a dense, context-aware vector representation of text, powering Semantic Search , a refined way to find relevant content. Semantic search - dense retrieval In this context, after converting your data into meaningful vector values, the k-nearest neighbor (kNN) search algorithm is used to find vector representations in a dataset that are most similar to a query vector. Elasticsearch supports two methods for kNN search, exact brute-force kNN and approximate kNN , also known as ANN. Brute-force kNN guarantees accurate results but doesn't scale well with large datasets. Approximate kNN efficiently finds approximate nearest neighbors by sacrificing some accuracy for improved performance. With Lucene's support for kNN search and dense vector indexes, Elasticsearch takes advantage of the Hierarchical Navigable Small World (HNSW) algorithm, which demonstrates strong search performance across a variety of ann-benchmark datasets . An approximate kNN search can be performed in Python using the below example code. Semantic search with approximate kNN This code block uses Elasticsearch's kNN to return up to two products with a description similar to the vectorized query (query_vector_build) of \" Comfortable furniture for a large balcony \" considering the embeddings of the “ description ” field in the products dataset. The product embeddings were previously generated in an ingest pipeline with an inference processor containing the \" all-mpnet-base-v2 \" text embedding model to infer against data that was being ingested in the pipeline. This model was chosen based on the evaluation of pretrained models using \" sentence_transformers.evaluation \" , where different classes are used to assess a model during training. The \"all-mpnet-base-v2\" model demonstrated the best average performance according to the Sentence-Transformers ranking and also secured a favorable position on the Massive Text Embedding Benchmark (MTEB) Leaderboard. The model is based on the pre-trained microsoft/mpnet-base model and was fine-tuned on a dataset of 1B sentence pairs; it maps sentences to a 768-dimensional dense vector space. Alternatively, there are many other models available that can be used, especially those fine-tuned for your domain-specific data. Output The output may vary based on the chosen model, filters and approximate kNN tuning . The kNN search results are both in the \" Outdoor Furniture \" category, even though the word \" outdoor \" was not explicitly mentioned as part of the query, which highlights the importance of semantic understanding in this context.
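Here is a rough sketch of what the approximate kNN query described above could look like with the Python client, building the query vector at search time from the deployed text embedding model. The vector field name and the deployed model ID are assumptions, since they depend on how your ingest pipeline wrote the embeddings; it reuses the `es` client from the previous sketch.

```python
# Sketch of an approximate kNN search that builds the query vector from the
# deployed all-mpnet-base-v2 model at search time. The vector field name and
# model ID are assumptions; reuses the `es` client from the sketch above.
response = es.search(
    index="ecommerce-search",
    size=2,
    knn={
        "field": "description_vector.predicted_value",  # assumed embedding field
        "k": 2,
        "num_candidates": 50,
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "sentence-transformers__all-mpnet-base-v2",
                "model_text": "Comfortable furniture for a large balcony",
            }
        },
    },
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("category"))
```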
Dense vector search offers several advantages: Enabling semantic search Scalability to handle very large datasets Flexibility to handle a wide range of data types However, dense vector search also comes with its own challenges : Selecting the right embedding model for your use case Once a model is chosen, fine-tuning the model to optimize performance on a domain-specific dataset might be necessary, a process that demands the involvement of domain experts Additionally, indexing high-dimensional vectors can be computationally expensive Semantic search - learned sparse retrieval Let’s explore an alternative approach: learned sparse retrieval, another way to perform semantic search. As a sparse model, it utilizes Elasticsearch's Lucene-based inverted index, which benefits from decades of optimizations. However, this approach goes beyond simply adding synonyms with lexical scoring functions like BM25. Instead, it incorporates learned associations using a deeper language-scale knowledge to optimize for relevance. By expanding search queries to include relevant terms that are not present in the original query, the Elastic Learned Sparse Encoder improves sparse vector embeddings , as you can see in the example below. Sparse vector search with Elastic Learned Sparse Encoder Output The results in this case include the \" Garden Furniture \" category, which offers products quite similar to \" Outdoor Furniture \". By analyzing \"ml.tokens\", the \"rank_features\" field containing Learned Sparse Retrieval generated tokens, it becomes apparent that among the various tokens generated there are terms that, while not part of the search query, are still relevant in meaning, such as \" relax \" (comfortable), \" sofa \" (furniture) and \" outdoor \" (balcony). The image below highlights some of these terms alongside the query, both with and without term expansion. As observed, this model provides a context-aware search and helps mitigate the vocabulary mismatch problem while providing more interpretable results. It can even outperform dense vector models when no domain-specific retraining is applied. Hybrid search: relevant results by combining lexical and semantic search When it comes to search, there is no universal solution. Each of these retrieval methods has its strengths but also its challenges. Depending on the use case, the best option may change. Often the best results across retrieval methods can be complementary. Hence, to improve relevance, we’ll look at combining the strengths of each method. There are multiple ways to implement a hybrid search , including linear combination, giving a weight to each score and reciprocal rank fusion (RRF), where specifying a weight is not necessary. Elasticsearch: best of both worlds with lexical and semantic search In this code, we performed a hybrid search with two queries having the value \" A dining table and comfortable chairs for a large balcony \". Instead of using \" furniture \" as a search term, we are specifying what we are looking for, and both searches are considering the same field values, \"description\". The ranking is determined by a linear combination with equal weight for the BM25 and ELSER scores. Output In the code below, we will use the same value for the query, but combine the scores from BM25 (query parameter) and kNN (knn parameter) using the reciprocal rank fusion method to combine and rank the documents. RRF functionality is in technical preview. The syntax will likely change before GA. 
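As a sketch of what such an RRF query could look like, the example below combines a BM25 match query with an approximate kNN clause and fuses them with reciprocal rank fusion. The vector field name, the model ID, and the technical-preview `rank` syntax shown here are assumptions and may differ in your Elasticsearch version; it reuses the `es` client from the earlier sketches.

```python
# Sketch of the RRF hybrid query described above: a BM25 match query plus an
# approximate kNN clause, fused with reciprocal rank fusion. The vector field,
# model ID, and the technical-preview `rank` syntax are assumptions.
query_text = "A dining table and comfortable chairs for a large balcony"

response = es.search(
    index="ecommerce-search",
    size=2,
    query={"match": {"description": query_text}},
    knn={
        "field": "description_vector.predicted_value",  # assumed embedding field
        "k": 2,
        "num_candidates": 50,
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "sentence-transformers__all-mpnet-base-v2",
                "model_text": query_text,
            }
        },
    },
    rank={"rrf": {}},  # reciprocal rank fusion (technical preview syntax)
)
```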
Output Here we could also use different fields and values; some of these examples are available in the Python notebook . As you can see, with Elasticsearch you have the best of both worlds: the traditional lexical search and vector search, whether sparse or dense, to reach your goal and find the best possible answer to your question. If you want to continue learning about the approaches mentioned here, these blogs can be useful: Improving information retrieval in the Elastic Stack: Hybrid retrieval Vector search in Elasticsearch: The rationale behind the design How to get the best of lexical and AI-powered search with Elastic’s vector database Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model Elasticsearch provides a vector database, along with all the tools you need to build vector search: Elasticsearch vector database Vector search use cases with Elastic Conclusion In this blog post, we explored various approaches to retrieving information using Elasticsearch, focusing specifically on text, lexical and semantic search. To demonstrate this, we provided Python examples showcasing different search scenarios using a dataset containing e-commerce product information. We reviewed the classic lexical search with BM25 and discussed its benefits and challenges, such as vocabulary mismatch. We emphasized the importance of incorporating semantic knowledge to overcome this issue. Additionally, we discussed dense vector search, which enables semantic search, and covered the challenges associated with this retrieval method, including the computational cost when indexing high-dimensional vectors. On the other hand, we mentioned that sparse vectors compress exceptionally well. Thus, we discussed Elastic's Learned Sparse Encoder, which expands search queries to include relevant terms not present in the original query. There is no one-size-fits-all solution when it comes to search. Each retrieval method has its strengths and challenges. Therefore, we also discussed the concept of hybrid search. As you could see, with Elasticsearch, you can have the best of both worlds: traditional lexical search and vector search! Ready to get started? Check the available Python notebook and begin a free trial of Elastic Cloud . Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. 
US By: Ugo Sangiorgi Jump to Prerequisites Lexical search - sparse retrieval String tokenization for lexical search Lexical matching Semantic search - dense retrieval Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Lexical and semantic search with Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/lexical-and-semantic-search-with-elasticsearch", + "meta_description": "In this blog, we'll explore various approaches to retrieving information using Elasticsearch, focusing on lexical and semantic search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Relativity uses Elasticsearch and Azure OpenAI to build AI search experiences With Elasticsearch Relevance Engine, you can create AI-powered search apps. Learn how Relativity uses Elastic & Azure Open AI for this goal. Generative AI HM AT By: Hemant Malik and Aditya Tripathi On July 5, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch has been used by developers to build search experiences for over a decade. At Microsoft Build this year, we announced the launch of Elasticsearch Relevance Engine — a set of tools to enable developers to build AI-powered search applications. With generative AI, large language models (LLMs), and vector search capabilities gaining mindshare, we are delighted to expand our range of tools and enable our customers in building the next generation of search apps. One example of what a next-generation search experience might look like comes from Relativity — the eDiscovery and legal search technology company. At Build, we shared the stage with the Relativity team as they spoke about how they’re using Elasticsearch and Microsoft Azure. You can read about Relativity’s coverage of Build on their blog . This blog explores how Relativity leverages Elasticsearch and Azure OpenAI to build futuristic search experiences. It also examines the key components of an AI-powered search experience and the important architectural considerations to keep in mind. About Relativity Relativity is the company behind RelativityOne , a leading cloud-based eDiscovery software solution. Relativity partners with Microsoft to innovate and deliver its solutions to hundreds of organizations, helping them manage, search, and act on large amounts of heterogeneous data. 
RelativityOne is respected in the industry for its global reach, and it is powered by Microsoft Azure infrastructure and a host of other Microsoft Azure services, such as Cognitive Services Translator. The RelativityOne product is built with scale in mind. Typical use cases involve ingesting large amounts of data provided for legal eDiscovery. This data is presented to legal teams via a search interface. In order to enable high-quality legal investigations, it is critical for the search experience to return highly accurate and relevant results, every time. Elasticsearch fits these requirements and is a key underlying technology. eDiscovery's future with generative AI Brittany Roush, senior product manager at Relativity, says, “The biggest challenge Relativity customers are facing right now is data explosion from heterogeneous data sources. The challenge is really compounded by the differences in data generated from different modes of communication.” This explosion of data, sources, and complexity renders traditional keyword search approaches ineffective. With Elasticsearch Relevance Engine (ESRE) , Relativity sees the potential of providing a search experience that goes beyond keyword search and basic conceptual search. ESRE provides an opportunity to natively tailor searches to case data. Relativity wants to augment the search experience with AI capabilities such as GPT-4, Signals, Classifications, and its in-house AI solutions. In this talk at Microsoft Build, Roush shared that in Relativity’s future vision for search, there are a few key challenges. As data grows exponentially, and as investigators must search through documents, images, and video records, traditional keyword search approaches can reach their limits. Privacy and confidentiality are key factors in the legal discovery process. Additionally, the ability to search as if you’re having a natural conversation and leveraging LLMs are all important factors when the team imagines the future of search. Elasticsearch for the future of AI search experiences In making the vision for the future of eDiscover search real, Relativity relies on Elasticsearch for its proven track record at scale, the ability to manage structured and unstructured data sources, and its leadership in search and hybrid scoring. ESRE provides an opportunity to natively tailor searches to case data. With Elasticsearch Relevance Engine, developers get a full vector database, the ability to manage multiple machine learning models or Elastic’s Learned Sparse Encoder that comes included with ESRE, and a plethora of data ingestion capabilities. Roush adds, “With ESRE and Azure OpenAI-powered search experiences, we aim to reduce what may take our customers months during an investigation to hours.” Components of an AI-powered search experience Natural language processing (NLP) NLP offers the ability to interact with a search interface using human language, or spoken intent. The search engine must identify intent and match it to data records. An example is matching “side hustle” to mean a secondary work occupation. Vector database This is a database that stores data as numerical representations, or vectors embeddings, across a vector space that may span several dimensions. Each dimension may be a mathematical representation of features or attributes used to describe search documents. Vector search A vector space is a mathematical representation of search documents as numbers. Traditional search relies on placement of keywords, frequency, and lexical similarity. 
A vector search engine uses numerical distances between vectors to represent similarity. Model flexibility and LLM integration In this fast-evolving space, support for public and proprietary machine learning models and the ability to switch models enables customers to keep up with innovation. Architectural considerations for an AI-powered search experience In order to build a search experience that understands natural language search prompts, stores underlying data as vectors, and leverages a large language model to present context-relevant responses, an approach such as the one below may be used: In order to achieve goals, for example building next-generation eDiscovery search leveraging LLM like OpenAI, users should consider the following factors: Cost LLMs can require large-scale resources and are trained on public data sets. Elasticsearch helps customers bring their private data and integrate with LLMs by passing a context window to keep the costs in check. Scale and flexibility Elasticsearch is a proven data store and vector database at petabyte scale . AI search experiences are powered by the data. The ability to ingest from a variety of private data sources is table stakes for the platform powering the solution. Elasticsearch as a datastore has been optimized over the years to host numerous data types , including the ability to store vectors. We cover Elastic’s ingestion capabilities in this recent webinar . Most AI-powered search experiences will benefit from having the flexibility of retrieval: keyword search, vector search, and the ability to deliver hybrid ranking. Elastic 8.8 introduced Elastic Learned Sparse Encoder in technical preview. This is a semantic search model trained and optimized by Elastic for superior relevance out of the box. Our model provides superior performance for vector and hybrid search approaches, and some of the work is documented in this blog . Elastic also supports a wide variety of third-party NLP transformer models to enable you to add NLP models that you may have trained for your use cases. Privacy and security Elasticsearch can help users limit access to domain-specific data by sharing only relevant context with LLM services. Combine this with Microsoft’s enterprise focus on data, privacy, and security of the Azure OpenAPI service, and users like Relativity can roll out search experiences leveraging generative AI built on proprietary data. For the private data hosted in Elasticsearch, applying Role Based Access Control will help protect sensitive data by configuring roles and corresponding access levels. Elasticsearch offers security options such as Document Level and Field Level security, which can restrict access based on domain-specific data sensitivity requirements. Elasticsearch Service is built with a security-first mindset and is independently audited and certified to meet industry compliance standards such as PCI DSS, SOC2, and HIPAA to name a few. Industry-specific considerations: Compliance, air-gapped, private clouds Elastic can go where our users are, whether that is on public cloud, private cloud, on-prem, and in air-gapped environments. For a privacy-first LLM experience, users can deploy proprietary transformer models to air-gapped environments. What will you build with Elastic and generative AI? We’re excited about the experiences that customers such as Relativity are building. 
The past few years in search have been very exciting, but with the rapid adoption of generative AI capabilities, we can’t wait to see what developers create with Elastic’s tools. If you’d like to try some of the capabilities that were mentioned here, we recommend these resources: Demo video: State of the Art Data Retrieval with Machine Learning & Elasticsearch Blog: How to deploy NLP: Text Embeddings and Vector Search Announcing Elasticsearch Relevance Engine : AI search tools for developers Sign up for an Elastic Cloud trial and get started today! The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. In this blog post, we may have used third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to About Relativity eDiscovery's future with generative AI Elasticsearch for the future of AI search experiences Components of an AI-powered search experience Architectural considerations for an AI-powered search experience Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Relativity uses Elasticsearch and Azure OpenAI to build AI search experiences - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/relativity-elasticsearch-azure-openai", + "meta_description": "With Elasticsearch Relevance Engine, you can create AI-powered search apps. Learn how Relativity uses Elastic & Azure Open AI for this goal." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds Amazon Bedrock support Elasticsearch open inference API added Amazon Bedrock support. Here's how to use Amazon Bedrock models via Elasticsearch's open inference API. Integrations Generative AI Vector Database How To MH HM By: Mark Hoy and Hemant Malik On July 10, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The Elasticsearch open inference API enables developers to create inference endpoints and use machine learning models from leading providers. Models hosted on Amazon Bedrock are available via the Elasticsearch Open Inference API . Developers building RAG applications using the Elasticsearch vector database can store and use embeddings generated from models hosted on Amazon Bedrock (such as Amazon Titan, Anthropic Claude, Cohere Command R, and others ). Bedrock integration with the open inference API offers a consistent way of interacting with different AI models, such as text embeddings and chat completion, simplifying the development process with Elasticsearch. In this walkthrough we will: pick a model from Amazon Bedrock, create and use an inference endpoint in Elasticsearch, and use the model as part of an inference pipeline. Using a base model in Amazon Bedrock This walkthrough assumes you already have an AWS Account with access to Amazon Bedrock - a fully managed model-hosting service that makes foundation models available through a unified API. From Amazon Bedrock in the AWS Console, make sure that you have access to the Amazon Titan Embeddings G1 - Text model. You can check that by going to the Amazon Bedrock service in the AWS Console and checking for Model access. If you don’t have access, you can request it through Modify model access in the Amazon Bedrock service in the AWS Console. Amazon provides extensive IAM policies to control permissions and access to the models. From within IAM, you’ll also need to create a pair of access and secret keys that allow your Elasticsearch inference endpoint programmatic access to Amazon Bedrock. Creating an inference API endpoint in Elasticsearch Once your model is deployed, we can create an endpoint for your inference task in Elasticsearch. For the examples below, we are using the Amazon Titan Text base model to perform inference for chat completion.
In Elasticsearch, create your endpoint by providing the service as “amazonbedrock”, and the service settings including your region, the provider, the model (either the base model ID, or if you’ve created a custom model, the ARN for it), and your access and secret keys to access Amazon Bedrock. In our example, as we’re using Amazon Titan Text, we’ll specify “amazontitan” as the provider, and “amazon.titan-text-premier-v1:0” as the model id. When you send Elasticsearch the command, it should return back the created model to confirm that it was successful. Note that the API key will never be returned and is stored in Elasticsearch’s secure settings. Adding a model for using text embeddings is just as easy. For reference, if we use the Amazon Titan Embeddings Text base model, we can create our inference model in Elasticsearch with the “text_embeddings” task type by providing the appropriate API key and target URL from that deployment’s overview page: Let’s perform some inference That’s all there is to setting up your model. Now that that’s out of the way, we can use the model. First, let’s test the model by asking it to provide some text given a simple prompt. To do this, we’ll call the _inference API with our input text: And we should see Elasticsearch provide a response. Behind the scenes, Elasticsearch calls out to Amazon Bedrock with the input text and processes the results from the inference. In this case, we received the response: We’ve tried to make it easy for the end user to not have to deal with all the technical details behind the scenes, but we can also control our inference a bit more by providing additional parameters to control the processing, such as sampling temperature and requesting the maximum number of tokens to be generated: That was easy. What else can we do? This becomes even more powerful when we are able to use our new model in other ways, such as adding additional text to a document when it’s used in an Elasticsearch ingestion pipeline. For example, the following pipeline definition will use our model, and anytime a document using this pipeline is ingested, any text in the field “question_field” will be sent through the inference API, and the response will be written to the “completed_text_answer” field in the document. This allows large batches of documents to be augmented. Limitless possibilities By harnessing the power of Amazon Bedrock models in your Elasticsearch inference pipelines, you can enhance your search experience’s natural language processing capabilities. If you’re an AWS developer using Elasticsearch, there is more to look forward to. We recently added support for Amazon Bedrock to our Playground ( blog ), allowing developers to test and tune RAG workflows. In addition, the new semantic_text mapping lets you easily vectorize and chunk information. In upcoming versions of Elasticsearch, users can take advantage of new field mapping types that simplify the process described in this blog even further, where designing an ingest pipeline would no longer be necessary. Also, as alluded to in our accelerated roadmap for semantic search the future will provide dramatically simplified support for inference tasks with Elasticsearch retrievers at query time. These capabilities are available through the open inference API in our stateless offering on Elastic Cloud. They’ll also soon be available to everyone in an upcoming versioned Elasticsearch release. 
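As a rough sketch of the endpoint creation and inference calls described above, the snippet below drives the _inference API over plain HTTP with the requests library. The endpoint name, cluster URL, credentials, and region are placeholders, and the exact service_settings field names follow the description above but should be checked against the inference API documentation for your Elasticsearch version.

```python
# Sketch: create an Amazon Bedrock completion endpoint via the _inference API
# and then call it. URL, credentials, keys, and the endpoint name are
# placeholders; the service_settings keys follow the description above but may
# differ slightly between Elasticsearch versions.
import requests

ES_URL = "http://localhost:9200"      # placeholder cluster URL
AUTH = ("elastic", "<password>")      # placeholder credentials

endpoint_url = f"{ES_URL}/_inference/completion/amazon_bedrock_completion"

create_body = {
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "<aws-access-key>",
        "secret_key": "<aws-secret-key>",
        "region": "us-east-1",
        "provider": "amazontitan",
        "model": "amazon.titan-text-premier-v1:0",
    },
}
print(requests.put(endpoint_url, json=create_body, auth=AUTH).json())

# Perform a chat-completion style inference against the new endpoint.
infer_body = {"input": "Summarize why Elasticsearch pairs well with Amazon Bedrock."}
print(requests.post(endpoint_url, json=infer_body, auth=AUTH).json())
```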
Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Using a base model in Amazon Bedrock Creating an inference API endpoint in Elasticsearch Let’s perform some inference That was easy. What else can we do? Limitless possibilities Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open inference API adds Amazon Bedrock support - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-amazon-bedrock-support", + "meta_description": "Elasticsearch open inference API added Amazon Bedrock support. Here's how to use Amazon Bedrock models via Elasticsearch's open inference API." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds support for OpenAI chat completions Learn how OpenAI chat completions and Elasticsearch can be used to summarize, translate or perform question & answering on any text. Integrations Generative AI How To TG By: Tim Grein On April 15, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. OpenAI Chat Completions has been integrated into Elastic’s inference APIs . 
This feature marks another milestone in our journey of integrating cutting-edge AI capabilities within Elasticsearch, offering additional easy-to-use features like generating human-like text completions. This blog explains how OpenAI chat completions and Elasticsearch can be used to summarize, translate or perform question & answering on any text. Before we get started, let's take a quick look at the recent Elastic features and integrations. The essence of continuous innovation at Elastic Elastic invests heavily in everything AI. We’ve recently released a lot of new features and exciting integrations: Elasticsearch open inference API adds support for Cohere Embeddings Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) Speeding Up Multi- graph Vector Search ...explore more of Elastic Search Labs to learn about recent developments The new completion task type inside our inference API with OpenAI as the first backing provider is already available in our stateless offering on Elastic Cloud. It’ll be soon available to everyone in our next release. Using OpenAI chat completions with Elasticsearch's open inference API In this short guide we’ll show a simple example on how to use the new completion task type in the inference API during document ingestion. Please refer to the Elastic Search Labs GitHub repository for more in-depth guides and interactive notebooks. For the following guide to work you'll need to have an active OpenAI account and obtain an API key. Refer to OpenAI’s quickstart guide for the steps you need to follow. You can choose from a variety of OpenAI’s models. In the following example we’ve used `gpt-3.5-turbo`. In Kibana, you'll have access to a console for you to input these next steps in Elasticsearch without even needing to set up an IDE. Firstly, you configure a model, which will perform the completions: After running this command you should see a corresponding `200 OK` status indicating that the model is properly set up for performing inference on arbitrary text. You’re now able to call the configured model to perform inference on arbitrary text input: You’ll get a response with status code `200 OK` looking similar to the following: The next command creates an example document we’ll summarize using the model we’ve just configured: To summarize multiple documents, we’ll use an ingest pipeline together with the script- , inference- and remove-processor to set up our summarization pipeline. This pipeline simply prefixes the content with the instruction “Please summarize the following text: “ in a temporary field so the configured model knows what to do with the text. You can change this text to anything you would like of course, which unlocks a variety of other popular use cases: Question and Answering Translation …and many more! The pipeline deletes the temporary field after performing inference. We now send our document(s) through the summarization pipeline by calling the reindex API . Your document is now summarized and ready to be searched: That’s basically it, you just created a powerful summarization pipeline with a few simple API calls, which can be used with any ingestion mechanism! There are a lot of use cases, where summarization comes in handy, for example by summarizing large pieces of text before generating semantic embeddings or transforming large documents into a concise summary. This can reduce your storage cost, improve time-to-value for example, if you’re only interested in a summary of large documents etc. 
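A minimal sketch of the summarization pipeline flow described above, using the Python client. The inference endpoint ID ("openai_completions"), the field and index names, and the exact inference-processor options are assumptions that may need adjusting for your Elasticsearch version; the processor layout mirrors the script, inference, and remove steps in the article.

```python
# Sketch of the script -> inference -> remove summarization pipeline described
# above, plus a reindex through it. The "openai_completions" endpoint ID, the
# "content"/"summary" field names, the index names, and the input_output
# processor options are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

es.ingest.put_pipeline(
    id="summarization_pipeline",
    processors=[
        {   # Prefix the content with an instruction in a temporary field.
            "script": {
                "source": "ctx.prompt = 'Please summarize the following text: ' + ctx.content"
            }
        },
        {   # Run the prompt through the configured completion endpoint.
            "inference": {
                "model_id": "openai_completions",
                "input_output": {"input_field": "prompt", "output_field": "summary"},
            }
        },
        {"remove": {"field": "prompt"}},  # Drop the temporary prompt field again.
    ],
)

# Push existing documents through the pipeline with the reindex API.
es.reindex(
    source={"index": "raw_documents"},
    dest={"index": "summarized_documents", "pipeline": "summarization_pipeline"},
)
```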
By the way if you want to extract text from binary documents you can take a look at our open-code data-extraction service ! Exciting future ahead But we won’t stop here. We’re already working on integrating Cohere’s chat as another provider for our `completion` task. We’re also actively exploring new retrieval and ingestion use cases in combination with the completion API. Bookmark Elastic Search Labs now to stay up to date! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to The essence of continuous innovation at Elastic Using OpenAI chat completions with Elasticsearch's open inference API Exciting future ahead Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open inference API adds support for OpenAI chat completions - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-openai-completion-support", + "meta_description": "Learn how OpenAI chat completions and Elasticsearch can be used to summarize, translate or perform question & answering on any text." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Go Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Go How To October 31, 2023 Perform text queries with the Elasticsearch Go client Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example. CR LS By: Carly Richmond and Laurent Saint-Félix Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Go - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/go-programming", + "meta_description": "Go articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial PHP Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe ES|QL PHP +1 April 8, 2024 From ES|QL to PHP objects Learn how to execute and manage ES|QL queries in PHP. Follow this guide to map ES|QL results to a PHP object or custom class. EZ By: Enrico Zimuel Generative AI PHP June 21, 2023 How to use Elasticsearch to prompt ChatGPT with natural language This blog post presents an experimental project for querying Elasticsearch in natural language using ChatGPT. EZ By: Enrico Zimuel Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "PHP - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/php-programming", + "meta_description": "PHP articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Open Crawler released for tech-preview The Open Crawler lets users crawl web content and index it into Elasticsearch from wherever they like. Learn about it & how to use it here. Ingestion NF By: Navarone Feekery On June 7, 2024 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Don't miss the follow-up to this blog announcing Open Crawler's promotion to beta! Discover how fast deployment has become with our out-of-the-box Docker images, and explore recent enhancements such as Extraction Rules or Binary Content Extraction. Read more in the blog post Open Crawler now in beta . There's a new crawler in town! The Open Crawler enables users to crawl web content and index it into Elasticsearch from wherever they prefer. This blog goes over the Open Crawler, compares it to the Elastic Crawler, and explains how to use it. Background Elastic has seen a few iterations of Crawler over the years. What started out as Swiftype's Site Search became the App Search Crawler , and then most recently the Elastic Crawler . These Crawlers are feature-rich, and allow for a robust and nuanced method for ingesting website data into Elasticsearch. However, if a user wants to run these on their own infrastructure, they are required to run the entirety of Enterprise Search as well. The Enterprise Search codebase is massive and contains a lot of different tools, so users don't have the option to run just Crawler. Because Enterprise Search is private code, it also isn't entirely clear to the user what they are running. That has all changed, as we have released the latest iteration of Crawler; the Open Crawler ! Open Crawler allows users to crawl web content and index it into Elasticsearch from wherever they like. There are no requirements to use Elastic Cloud, nor to have Kibana or Enterprise Search instances running. Only an Elasticsearch instance is required to ingest the crawl results into. This time the repository is open code too. Users can now inspect the codebase, open issues and PR requests, or fork the repository to make changes and run their own crawler variant. What has changed? The Open Crawler is considerably more light-weight than the SaaS crawlers that came before it. This product is essentially the core crawler code from the existing Elastic Crawler, decoupled from the Enterprise Search service. Decoupling the Open Crawler has meant leaving some features behind, temporarily. There is a full feature comparison table at the end of this blog if you'd like to read our roadmap towards feature-parity. We intend to reintroduce those features and reach near-feature-parity when this product becomes GA. This process allowed us to also make improvements to the core product. 
For example: we were able to remove the limitations on index naming; it is now possible to use custom mappings on your index before crawling and ingesting content; and crawl results are now bulk indexed into Elasticsearch instead of being indexed one webpage at a time. This provided a significant performance boost, which we get into below. How does the Open Crawler compare to the Elastic Crawler? As already mentioned, this crawler can be run from wherever you like; your computer, your personal server, or a cloud-hosted one. It can index documents into Elasticsearch on-prem, on Cloud, and even Serverless. You are also no longer tied to using Enterprise Search to ingest your website content into Elasticsearch. But what is most exciting is that Open Crawler is also faster than the Elastic Crawler. We ran a performance test to compare the speed of Open Crawler versus that of the Elastic Crawler, our next-latest crawler. Both crawlers crawled the site elastic.co with no alterations to the default configuration. The Open Crawler was set up to run on two AWS EC2 instances: m1.small and m1.large, while Elastic Crawler was run natively from Elastic Cloud. All were set up in the region of N. Virginia. The content was indexed into an Elasticsearch Cloud instance with identical settings (the default 360 GB storage | 8 GB RAM | Up to 2.5 vCPU ). Here are the results:

| Crawler Type | Server RAM | Server CPU | Crawl Duration (mins) | Docs Ingested (n) |
| --- | --- | --- | --- | --- |
| Elastic Crawler | 2 GB | up to 8 vCPU | 305 | 43957 |
| Open Crawler (m1.small) | 1.7 GB | 1 vCPU | 160 | 56221 |
| Open Crawler (m1.large) | 3.75 GB per vCPU | 2 vCPU | 100 | 56221 |

Open Crawler was almost twice as fast for the m1.small and over 3 times as fast for the m1.large ! This is also despite running on servers with less-provisioned vCPU. Open Crawler ingested around 13,000 more documents, but that is because the Elastic Crawler combines website pages with identical bodies into a single document. This feature is called duplicate content handling and is in the feature comparison matrix at the end of this blog. The takeaway here is that both Crawlers encountered the same number of web pages during their respective crawls, even if the ingested document count is different. Here are some graphs comparing the impact of this on Elasticsearch. These compare the Elastic Crawler with the Open Crawler that ran on the m1.large instance. CPU Naturally the Open Crawler caused significantly less CPU usage on Elastic Cloud, but that's because we've removed the entire Enterprise Search server. It's still worth taking a quick look at where this CPU usage was distributed. Elastic Crawler CPU load (Elastic Cloud) Open Crawler CPU load (Elastic Cloud) The Elastic Crawler reached the CPU threshold immediately and consistently used it for an hour. It then dropped down and had periodic spikes until the crawl completed. For the Open Crawler there was almost no noticeable CPU usage on Elastic Cloud, but the CPU is still being consumed somewhere , and in our case this was on the EC2 instance. EC2 CPU load ( m1.large ) We can see here that the Open Crawler didn't reach the 100% limit threshold. The most CPU it used was 84.3%. This means there's still more room for optimization here. Depending on user setup (and optimizations we can add to the codebase), Open Crawler could be even faster . Requests (n) We can see a real change in Elasticsearch server load here by comparing the number of requests made during the crawl.
Elastic Crawler requests Open Crawler requests The indexing request impact from Open Crawler is so small it's not even noticeable on this graph compared to the background noise. There's a slight uptick in index requests, and no change to the search requests. Elastic Crawler, meanwhile, has an explosion of requests; particularly search requests. This means the Open Crawler is a great solution for users who want to reduce requests made to their Elasticsearch instance. So why is the Open Crawler so much faster? 1. The Open Crawler makes significantly fewer indexing requests. The Elastic Crawler indexes crawl results one at a time. It does this to allow for features such as duplicate content management. This means that Elastic Crawler performed 43,957 document indexing requests during its crawl. It also updates documents when it encounters duplicate content, so it also performed over 13000 individual update requests. The Open Crawler instead pools crawl results and indexes them in bulk. In this test, it indexed the same amount of crawl results in only 604 bulk requests of varying sizes. That's less than 1.5% of the indexing requests made, which is a significant load reduction for Elasticsearch to manage. 2. Elastic Crawler also performs many search requests, further slowing down performance The Elastic Crawler has its configuration and metadata managed in Elasticsearch pseudo-system indices. When Crawling it periodically checks this configuration and updates the metadata on a few of these indices, which is done through further Elasticsearch requests. The Open Crawler's configuration is entirely managed in yaml files. It also doesn't track metadata on Elasticsearch indices. The only requests it makes to Elasticsearch are to index documents from crawl results while crawling a website. 3. Open Crawler is simply doing less with crawl results In the tech-preview stage of Open Crawler, there are many features that are not available yet. In Elastic Crawler, these features are all managed through pseudo-system indices in Elasticsearch. When we add these features to the Open Crawler, we can ensure they are done in a way that doesn't involve multiple requests to Elasticsearch to check configuration. This means Open Crawler should still retain this speed advantage even after reaching near feature parity with Elastic Crawler. How do I use the Open Crawler? You can clone the repository now and follow the documentation here to get started. I recommend using Docker to run the Open Crawler if you aren't making changes to source code, to make the process smoother. If you want to index the crawl results into Elasticsearch, you can also try out a free trial of Elasticsearch on Cloud or download and run Elasticsearch yourself from source . Here's a quick demo of crawling the website parksaustralia.gov.au . The requirements for this are Docker, a clone/fork of the Open Crawler repository, and a running Elasticsearch instance. 1. Build the Docker image and run it This can be done in one line with docker build -t crawler-image . && docker run -i -d --name crawler crawler-image . You can then confirm it is running by using the CLI command to check the version docker exec -it crawler bin/crawler version . 2. Configure the crawler Using examples in the repository you can create a configuration file. For this example I'm crawling the website parksaustralia.org.au and indexing into a Cloud-based Elasticsearch instance. Here's an example of my config, which I creatively named example.yml . 
I can copy this into the docker container using docker cp config/example.yml crawler:/app/config/example.yml 3. Validate the domain Before crawling you can check that the configured domain is valid using docker exec -it crawler bin/crawler validate config/example.yml 4. Crawl! Start the crawl with docker exec -it crawler bin/crawler crawl config/example.yml . It will take a while to complete if the site is large, but you'll know it's done based on the shell output. 5. Check the content We can then do a _search query against the index. This could also be done in Kibana Dev Tools if you have an instance of that running. And the results! You could even hook these results up with Semantic Search and do some cool real-language queries, like What park is in the centre of Australia? . You just need to add the name of the pipeline you create to the Crawler config yaml file, under the field elasticsearch.pipeline . Feature comparison breakdown Here is a full list of Elastic Crawler features as of v8.13 , and when we intend to add them to the Open Crawler. Features available in tech-preview are available already. These aren't tied to any specific stack version, but we have a general time we're aiming for each release. tech-preview : Today (June 2024) beta : Autumn 2024 GA : Summer 2025 Feature Open Crawler Elastic Crawler Index content into Elasticsearch tech-preview ✔ No index name restrictions tech-preview ✖ Run anywhere, without Enterprise Search or Kibana tech-preview ✖ Bulk index results tech-preview ✖ Ingest pipelines tech-preview ✔ Seed URLs tech-preview ✔ robots.txt and sitemap.xml adherence tech-preview ✔ Crawl through proxy tech-preview ✔ Crawl sites with authorization tech-preview ✔ Data attributes for inclusion/exclusion tech-preview ✔ Limit crawl depth tech-preview ✔ Robots meta tags tech-preview ✔ Canonical URL link tags tech-preview ✔ No-follow links tech-preview ✔ CSS selectors beta ✔ XPath selectors beta ✔ Custom data attributes beta ✔ Binary content extraction beta ✔ URL pattern extraction (extraction directly from URLs using regex) beta ✔ URL filters (extraction rules for specific endpoints) beta ✔ Purge crawls beta ✔ Crawler results history and metadata GA ✔ Duplicate content handling TBD ✔ Schedule crawls TBD ✔ Manage crawler through Kibana UI TBD ✔ The TBD features are still undecided as we are assessing the future of the Open Crawler. Some of these, like Schedule crawls , can be done already using cron jobs or similar automation. Depending on user feedback and how the Open Crawler project evolves, we may decide to implement these features properly in a later release. If you have a need for one of these, reach out to us! You can find us in the forums and in the community Slack , or you can create an issue directly in the repository . What's next? We want to get this to GA in time for v9.0 . The Open Crawler is designed with Elastic Cloud Serverless in mind, and we intend for it to be the main web content ingestion method for that version. We also have plans to support the Elastic Data Extraction Service , so even larger binary content files can be ingested using Open Crawler. There are many features we need to introduce in the meantime to get the same feature-rich experience that Elastic Crawler has today. Report an issue Related content Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. 
JR By: Jeffrey Rengifo Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Integrations Ingestion +1 February 3, 2025 Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. JR TM By: Jeffrey Rengifo and Tomás Murúa Jump to Background What has changed? How does the Open Crawler compare to the Elastic Crawler? CPU Requests (n) Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Open Crawler released for tech-preview - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-open-crawler-release", + "meta_description": "The Open Crawler lets users crawl web content and index it into Elasticsearch from wherever they like. Learn about it & how to use it here." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. Elastic Cloud Serverless Agent FS By: Fram Souza On March 4, 2025 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. This little command-line tool lets you manage your Serverless Elasticsearch projects in plain English. It talks to an AI (in this case, OpenAI) to figure out what you mean and call the right functions using LlamaIndex! What does it do? Create a project : Spin up a new Serverless Elasticsearch project. Delete a project : Remove an existing project (yep, it cleans up after you). Get project status : Check on how your project is doing. Get project details : Fetch all the juicy details about your project. Check the code on GitHub. How it works When you type in something like: \"Create a serverless project named my_project\" …here’s what goes on behind the scenes: User input & context: Your natural language command is sent to the AI agent. 
Function descriptions: The AI agent already knows about a few functions—like create_ess_project, delete_ess_project, get_ess_project_status, and get_ess_project_details—because we gave it detailed descriptions. These descriptions tell the AI what each function does and what parameters they need. LLM processing: Your query plus the function info is sent off to the LLM. This means the AI sees: The user query : Your plain-English instruction. Available functions & descriptions : Details on what each tool does so it can choose the right one. Context/historic chat info : Since it’s a conversation, it remembers what’s been said before. Function call & response: The AI figures out which function to call, passes along the right parameters (like your project name), and then the function is executed. The response is sent back to you in a friendly format. In short, we’re sending both your natural language query and a list of detailed tool descriptions to the LLM so it can “understand” and choose the right action for your request. Setup Prerequisites: Before running the AI agent, ensure that you have the following set up: Python (v3.7 or later) installed. Elasticsearch serverless account set up on Elastic Cloud. OpenAI account to interact with the language model. Steps: 1. Clone the repository: 2. Create a virtual environment (optional but recommended): If you're facing environment-related issues, you can set up a virtual environment for isolation: 3. Install the dependencies: Ensure that all required dependencies are installed by running: 4. Configure your environment: Create a .env file in the project root with the following variables. Here’s an example .env.example file to help you out: Ensure that you have the correct values for ES_URL , API_KEY , and OPENAI_API_KEY . You can find your API keys in the respective service dashboards. 5. Projects File: The tool uses a projects.json file to store your project mappings (project names to their details). This file will be created automatically if it doesn't already exist. Running the agent You’ll see a prompt like this: Type in your command, and the AI agent will work its magic! When you're done, type exit or quit to leave. A few more details LLM integration : The LLM is given both your query and detailed descriptions of each available function. This helps it understand the context and decide, for example, whether to call create_ess_project or delete_ess_project . Tool descriptions : Each function tool (created using FunctionTool.from_defaults) has a friendly description. This description is included in the prompt sent to the LLM so that it “knows” what actions are available and what each action expects. Persistence : Your projects and their details are saved in projects.json, so you don’t have to re-enter info every time. Verbose logging : The agent is set to verbose mode, which is great for debugging and seeing how your instructions get translated into function calls. Example utilization Report an issue Related content Agent How To March 28, 2025 Connect Agents to Elasticsearch with Model Context Protocol Let’s use Model Context Protocol server to chat with your data in Elasticsearch. JB JM By: Jedr Blaszyk and Joe McElroy Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. 
AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to What does it do? How it works Setup Prerequisites: Steps: Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "The AI Agent to manage Elasticsearch Serverless projects - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/serverless-elasticsearch-ai-agent", + "meta_description": "A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. Elastic Cloud Serverless AD By: Andrei Dan On December 10, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Traditionally, users change the sharding configuration of data streams in order to deal with various workloads and make the best use of the available resources. In Elastic Cloud Serverless we've introduced autosharding of data streams, enabling them to be managed and scaled automatically based on indexing load. This post explores the mechanics of autosharding, its benefits, and its implications for users dealing with variable workloads. The autosharding philosophy is to increase the number of shards aggressively and reduce them very conservatively, such that an increase in shards is not followed prematurely by a reduction of shards due to a small period of reduced workload. Autosharding of data streams in Serverless Elasticsearch Imagine you have a large pizza that needs to be shared among your friends at a party. 
If you cut the pizza into only two slices for a group of six friends, each slice will need to serve multiple people. This will create a bottleneck, where one person hogs a whole slice while others wait, leading to a slow sharing process. Additionally, not everyone can enjoy the pizza at the same time; you can practically hear the sighs from the friends left waiting. If more friends show up unexpectedly, you’ll struggle to feed them with just two slices and find yourself scrambling to reshape those slices on the spot. On the other hand, if you cut the pizza into 36 tiny slices for those same six friends, managing the sharing becomes tricky. Instead of enjoying the pizza, everyone spends more time figuring out how to grab their tiny portions. If the slices are too small, the pizza might even fall apart. To ensure everyone enjoys the pizza efficiently, you’d aim to cut it into a number of slices that matches the number of friends. If you have six friends, cutting the pizza into 6 or 12 slices allows everyone to grab a slice without long waits. By finding the right balance in slicing your pizza, you’ll keep the party running smoothly and everyone happy. You know it’s a good analogy when you immediately follow-up with the explanation; the pizza represents the data, the slices represent the index shards, and the friends are the Elasticsearch nodes in your cluster. Traditionally, users of Elasticsearch had to anticipate their indexing throughput and manually configure the number of shards for each data stream . This approach relied heavily on predictive heuristics and required ongoing adjustments based on workload characteristics whilst also balancing data storage, search analytics, and application performance . Businesses with seasonal traffic, like retail, often deal with spikes in data demands, while IoT applications can experience rapid load increases at specific times. Development and testing environments typically run only a few hours a week, making fixed shard configurations inefficient. New applications might struggle to estimate workload needs accurately, leading to potential over- or under-provisioning. We've introduced autosharding of data streams in Elastic Cloud Serverless . Data streams in Serverless are managed and scaled automatically based on indexing load - automatically slicing your pizza as friends arrive to your party or finish eating. The promise of autosharding Autosharding addresses these challenges by automatically adjusting the number of shards in response to the current indexing load. This means that instead of users having to manually tweak configurations, Elasticsearch will dynamically manage shard counts for the data streams in your project based on real-time data traffic. Elasticsearch keeps track of the indexing load for every index as part of a metric named write load, and exposes it for on-prem and ESS deployments as part of the index stats API under the indexing section. The write_load represents the average number of write threads used while indexing documents. For an index with one shard the maximum possible value of the write_load metric is the number of write threads available (e.g. all write threads are busy writing in the same shard). For indices with multiple shards the maximum possible value for the write load is the number of write threads available in a node times the number of indexing nodes in the project. (e.g. 
all write threads on all the indexing nodes that host a shard for our index are busy writing in the shards belonging to our index, exclusively) To get a sense of the values allowed for write_load let’s look at index logs with one shard running on one Elasticsearch machine with 2 allocated processors. The write thread pool will be sized to 2 threads. This means that if this Elasticsearch node is exclusively and constantly writing to the same index logs , the write_load we’ll report for index logs will be 2.0 (i.e. 2 write threads fully utilized for writing into index logs ). If logs has 2 primary shards and we’re now running on two Elasticsearch nodes, each with 2 allocated processors we’ll be able to get a maximum reported write_load of 4.0 if all write threads on both Elasticsearch nodes are exclusively writing into the logs index. Serverless autoscaling We just looked at how the write load capacity doubled when we increased the number of shards and Elasticsearch nodes. Elastic Cloud Serverless takes care automatically of both these operations using data stream autosharding and ingest autoscaling . Autoscaling refers to the process of dynamically adjusting resources - like memory, CPU, and disk - based on current demands. In our serverless architecture, we start with a small 2GB memory server and use a step-function scaling approach to increase capacity efficiently. We scale up memory incrementally and then scale out by adding servers. This cycle continues, increasing memory per server incrementally up to 64GB while managing the number of servers. Linking autoscaling and autosharding The connection between auto scaling and auto sharding is essential for optimizing performance. When calculating the optimal number of shards for a data stream, we consider the minimum and maximum number of available write threads per node in our scaling setup. For small projects, the system will move from 1 to 2 shards when the data stream uses more than half the capacity of a node (i.e., more than one indexing thread). For medium-sized projects, as the system scales across multiple nodes, it will not exceed 3 shards to avoid excessive overhead. Once we reach the largest node sizes, further sharding is enabled to accommodate larger workloads. Autosharding also enables autoscaling to increase resources as needed, preventing the system from staying at low capacity during high indexing workloads, by enabling projects to reach higher ingestion load values. Auto sharding formula To determine the number of shards needed, we use the following formula: This equation balances the need for increasing shards based on write_load while capping the number of shards to prevent oversharding. The division by 2 reflects the strategy of increasing shards only after exceeding half the capacity of a node. The min/max write threads represent the minimum and maximum number of write threads available in the autoscaling step function (i.e. the number of write threads available on the smallest 2GB step and the number of write threads available on the largest server) Let’s visualize the output of the formula: On the Y axis we have the number of shards . And on the X axis we have the write load . We start with 1 shard and we get to 3 shards when the write load is just over 3.0. We remain with 3 shards for quite some time until the write load is about 48.0. 
This covers us for the time we scale up through the nodes but haven’t really got to 2 or more or the largest servers, at which point we unlock auto sharding to more than 3 shards, as many as needed to ingest data. While adding shards can improve indexing performance, excessive sharding in an Elasticsearch cluster can have negative repercussions - imagine that pizza with 56 slices being shared by only 7 friends. Each shard carries overhead costs, including maintenance and resource allocation. Our algorithm accounts for and avoids the peril of excessive sharding until we get to the largest workloads where adding more than 3 shards makes a material difference to indexing performance and throughput. Implementing autosharding with rollovers The implementation of autosharding relies on the concept of rollover . A rollover operation creates a new index within the data stream , promoting it to the write index while designating the previous index as a regular backing index, which no longer accepts writes. This transition can occur based on specific conditions, such as exceeding a shard size of 50GB. We take care of configuring the optimal rollover conditions for data streams in Serverless . In Serverless alongside the usual rollover conditions that relate to maintaining healthy indices and shards we introduce a new condition that evaluates whether the current write load necessitates an increase in shard count. If this condition is met, a rollover will be triggered and the new resulting data stream write index will be configured with the optimal number of shards. For downscaling, the system will monitor the workload and will not trigger a rollover solely for reducing shards. Instead, it will wait until a regular rollover condition, like the primary shard size, triggers the rollover. The resulting write index will be configured with a lower number of shards. Cooldown periods for shard adjustments To ensure stability during shard adjustments, we implement cooldown periods: Increase shards cooldown : A minimum wait time of 4.5 minutes is enforced before increasing the number of shards since the last adjustment. The 4.5 minutes cooldown might seem peculiar but the interval has been chosen to make sure we can increase the number of shards every time data stream lifecycle checks if data streams should rollover (currently, every 5 minutes) but not more often than 5 minutes, covering for internal Elasticsearch cluster reconfiguration. Decrease shards cooldown : We maintain a 3-day minimum wait time before reducing shards to ensure that the decision is based on sustained workload patterns rather than temporary fluctuations. Conclusion The data streams autosharding feature in Serverless Elasticsearch represents significant progress in managing data streams effectively. By automatically adjusting shard counts based on real-time indexing loads, this feature simplifies operations and enhances scalability. With the added benefits of autoscaling , users can expect a more efficient and responsive experience, whether they are handling small projects or large-scale applications. As data workloads continue to evolve, the adaptability provided by auto sharding ensures that Elasticsearch remains a robust solution for managing diverse indexing needs. Try out our Serverless Elasticsearch offering to take advantage of data streams auto sharding and observe the indexing throughput scaling seamlessly as your data ingestion load increases. 
Your pizzas will be optimally sliced as more friends arrive at your party, keen to try those sourdough craft pizzas you prepared for them. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Jump to Autosharding of data streams in Serverless Elasticsearch The promise of autosharding Serverless autoscaling Linking autoscaling and autosharding Auto sharding formula Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Autosharding of data streams in Elasticsearch Serverless - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/datastream-autosharding-serverless", + "meta_description": "Learn about autosharding of data streams in Elasticsearch Serverless, the mechanics of autosharding, its benefits, and its implications for users." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds support for Azure OpenAI embeddings Elasticsearch open inference API adds support for Azure OpenAI embeddings to be stored in the world's most downloaded vector database. Integrations Vector Database How To MH By: Mark Hoy On May 22, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. We're happy to announce that Elasticsearch now supports Azure OpenAI embeddings in our open inference API , enabling developers to store generated embeddings into our highly scalable and performant vector database . This new functionality further solidifies our commitment to not only working with Microsoft and the Azure platform, but also toward our commitment to offering our customers more flexibility with their AI solutions. Ongoing Investment in AI at Elastic This is the latest in a series of additional features and integrations on AI enablement for Elasticsearch following on from: Elasticsearch open inference API adds Azure AI Studio support Elasticsearch open inference API adds support for Azure OpenAI chat completions Elasticsearch open inference API adds support for OpenAI chat completions Elasticsearch open inference API adds support for Cohere Embeddings Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) The new inference embeddings service provider for Azure OpenAI is already available in our stateless offering on Elastic Cloud, and will be soon available to everyone in an upcoming Elastic release. Using Azure OpenAI Embeddings with the Elasticsearch Inference API Deploying an Azure OpenAI Embeddings Model To get started, you will need a Microsoft Azure Subscription as well as access to Azure OpenAI service . Once you have registered and have access, you will need to create a resource in your Azure Portal , and then deploy an embedding model to Azure OpenAI Studio . To do this, if you do not already have an Azure OpenAI resource in your Azure Portal, create a new one from the “Azure OpenAI” type which can be found in the Azure Marketplace, and take note of your resource name as you will need this later. When you create your resource, the region you choose may impact what models you have access to. See the Azure OpenAI deployment model availability table for additional details. Once you have your resource, you will also need one of your API keys which can be found in the “Keys and Endpoint” information from the Azure Portal's left side navigation: Now, to deploy your Azure OpenAI Embedding model, go into your Azure OpenAI Studio's console and create your deployment using an OpenAI Embeddings model such as text-embedding-ada-002 . Once your deployment is created, you should see the deployment overview. Also take note of the deployment name, in the example below it is “example-embeddings-model”. Using your deployed Azure OpenAI embeddings model with the Elasticsearch Inference API With an Azure OpenAI embeddings model deployed, we can now configure your Elasticsearch deployment's _inference API and create a pipeline to index embeddings vectors in your documents. Please refer to the Elastic Search Labs GitHub repository for more in-depth guides and interactive notebooks. 
To perform these tasks, you can use the Kibana Dev Console, or any REST console of your choice. First, configure your inference endpoint using the create inference model endpoint - we'll call this “example_model”: For your inference endpoint, you will need your API key, your resource name, and the deployment id that you created above. For the “api_version”, you will want to use an available API version from the Azure OpenAI embeddings documentation - we suggest always using the latest version which is “2024-02-01” as of this writing. You can also optionally add a username in the task setting's “user” field which should be a unique identifier representing your end-user to help Azure OpenAI to monitor and detect abuse. If you do not want to do this, omit the entire “task_settings” object. After running this command you should receive a 200 OK status indicating that the model is properly set up. Using the perform inference endpoint , we can see an example of your inference endpoint at work: The output from the above command should provide the embeddings vector for the input text: Now that we know our inference endpoint works, we can create a pipeline that uses it: This will create an ingestion pipeline named “azureopenai_embeddings” that will read the contents of the “name” field upon ingestion and apply the embeddings inference from our model to the “name_embedding” output field. You can then use this ingestion pipeline when documents are ingested (e.g. via the _bulk ingest endpoint), or when reindexing an index that is already populated. This is currently available through the open inference API in our stateless offering on Elastic Cloud. It'll also be soon available to everyone in an upcoming versioned Elasticsearch release, with additional semantic text capabilites that will make this step even simpler to integrate into your existing workflows. For an additional use case, you can walk through the semantic search with inference tutorial for how to perform ingestion and semantic search on a larger scale with Azure OpenAI and other services such as reranking or chat completions. Plenty more on the horizon This new extensibility is only one of many new features we are bringing to the AI table from Elastic. Bookmark Elastic Search Labs now to stay up to date! Ready to build RAG into your apps? Want to try different LLMs with a vector database? Check out our sample notebooks for LangChain, Cohere and more on Github , and join the Elasticsearch Engineer training starting soon! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Ongoing Investment in AI at Elastic Using Azure OpenAI Embeddings with the Elasticsearch Inference API Deploying an Azure OpenAI Embeddings Model Using your deployed Azure OpenAI embeddings model with the Elasticsearch Inference API Plenty more on the horizon Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open inference API adds support for Azure OpenAI embeddings - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-azure-openai-embeddings-support", + "meta_description": "Elasticsearch open inference API adds support for Azure OpenAI embeddings to be stored in the world's most downloaded vector database." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing the sparse vector query: Searching sparse vectors with inference or precomputed query vectors Learn about the Elasticsearch sparse vector query, how it works, and how to effectively use it. Vector Database KD By: Kathleen DeRusso On July 23, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Sparse vector queries take advantage of Elasticsearch’s powerful inference API , allowing easy built-in setup for Elastic-hosted models such as ELSER and E5 , as well as the flexibility to host other models. Introduction Vector search is evolving, and as our needs for vector search evolve so does the need for a consistent and forward thinking vector search API. When Elastic first launched semantic search, we leveraged existing rank_features fields using the text_expansion query. We then reintroduced the sparse_vector field type for semantic search use cases. As we think about what sparse vector search is going forward, we’ve introduced a new sparse vector query. As of Elasticsearch 8.15.0, both the text_expansion query and weighted_tokens query have been deprecated in favor of the new sparse vector query. The sparse vector query supports two modes of querying: using an inference ID and using precomputed query vectors. Both modes of querying require data to be indexed in a sparse_vector mapped field. These token-weight pairs are then used in a query against a sparse vector. 
At query time, query vectors are calculated using the same inference model that was used to create the tokens. Let’s look at an example: let’s say we’ve indexed a document detailing when Orion is most visible in the night sky: Now, assume we’re looking for constellations that are visible in the northern hemisphere, and we run this query through the same learned sparse encoder model. The output might look similar to this: At query time, these vectors are ORed together, and scoring is effectively a dot product calculation between the stored dimensions and the query dimensions, which would score this example at 10.84: Sparse vector queries with inference Sparse vector queries using inference work in a very similar way to the previous text expansion query, instead of sending in a trained model, we create an inference endpoint associated with the model we want to use. Here’s an example of how to create an inference endpoint for ELSER: You should use an inference endpoint to index your sparse vector data, and use the same endpoint as input to your sparse_vector query. For example: Sparse vector queries with precomputed query vectors You may have precomputed vectors that don’t require inference at query time. These can be sent into the sparse_vector query instead of using inference. Here is an example: Query optimization with token pruning Like text expansion search, the sparse vector query is subject to performance penalties from huge boolean queries. Therefore the same token pruning strategies available for text expansion strategies are available in the sparse vector query. You can see the impact of token pruning in our nightly MS Marco Passage Ranking benchmarks . In order to enable pruning with the default pruning configuration (which has been tuned for ELSER V2), simply add prune: true to your request: Alternately, you can adjust the pruning configuration by sending it directly in with the request: Because token pruning will incur a recall penalty, we recommend adding the pruned tokens back in a rescore: What's next? While the text_expansion query is GA’d and will be supported throughout Elasticsearch 8.x, we recommend updating to the sparse_vector query as soon as possible in order to ensure you’re using the most up to date features as we continually improve the vector search experience in Elasticsearch. If you are using the weighted_tokens query, this was never GA’d and will be replaced by the sparse_vector query very soon. The sparse_vector query will be available starting with 8.15.0 and is already available in Serverless - try it out today! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. 
US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Introduction Sparse vector queries with inference Sparse vector queries with precomputed query vectors Query optimization with token pruning What's next? Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing the sparse vector query: Searching sparse vectors with inference or precomputed query vectors - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-sparse-vector-query", + "meta_description": "Learn about the Elasticsearch sparse vector query, how it works, and how to effectively use it." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Understanding Int4 scalar quantization in Lucene This blog explains how int4 quantization works in Lucene, how it lines up, and the benefits of using int4 quantization. Lucene ML Research BT TV By: Benjamin Trent and Thomas Veasey On April 25, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to Int4 quantization in Lucene In our previous blogs, we walked through the implementation of scalar quantization as a whole in Lucene. We also explored two specific optimizations for quantization . Now we've reached the question: how does int4 quantization work in Lucene and how does it line up? How does Int4 quantization work in Lucene Storing and scoring the quantized vectors Lucene stores all the vectors in a flat file, making it possible for each vector to be retrieved given some ordinal. You can read a brief overview of this in our previous scalar quantization blog . Now int4 gives us additional compression options than what we had before. It reduces the quantization space to only 16 possible values (0 through 15). For more compact storage, Lucene uses some simple bit shift operations to pack these smaller values into a single byte, allowing a possible 2x space savings on top of the already 4x space savings with int8. In all, storing int4 with bit compression is 8x smaller than float32 . 
Figure 1: This shows the reduction in bytes required with int4 which allows an 8x reduction in size from float32 when compressed. int4 also has some benefits when it comes to scoring latency. Since the values are known to be between 0-15 , we can take advantage of knowing exactly when to worry about value overflow and optimize the dot-product calculation. The maximum value for a dot product is 15*15=225 which can fit in a single byte. ARM processors (like my MacBook) have a SIMD instruction length of 128 bits (16 bytes). This means that for a Java short we can allocate 8 values to fill the lanes. For 1024 dimensions, each lane will end up accumulating a total of 1024/8=128 multiplications that have a max value of 225 . The resulting maximum sum of 28800 fits well within the limit of Java's short value and we can iterate over more values at a time. Here is some simplified code of what this looks like for ARM. Calculating the quantization error correction For a more detailed explanation of the error correction calculation and its derivation, please see error correcting the scalar dot-product . Here is a short summary, woefully (or joyfully) devoid of complicated mathematics. For every quantized vector stored, we additionally keep track of a quantization error correction. Back in the Scalar Quantization 101 blog there was a particular constant mentioned: \\alpha \\times int8_i \\times min . This constant is a simple constant derived from basic algebra. However, we now include additional information in the stored float that relates to the rounding loss: \\sum_{i=0}^{dim-1} ((i - min) - i'\\times\\alpha)i'\\times\\alpha , where i is each floating point vector dimension, i' is the scalar quantized floating point value, and \\alpha=\\frac{max - min}{(1 << bits) - 1} . This has two consequences. The first is intuitive, as it means that for a given set of quantization buckets, we are slightly more accurate as we account for some of the lossiness of the quantization. The second consequence is a bit more nuanced. It now means we have an error correction measure that is impacted by the quantization bucketing. This implies that it can be optimized. Finding the optimal bucketing for int4 quantization The naive and simple way to do scalar quantization can get you pretty far. Usually, you pick a confidence interval from which you calculate the allowed extreme boundaries for vector values. The default in Lucene and consequently Elasticsearch is 1 - 1/(dimensions+1) . Figure 2 shows the confidence interval over some sampled CohereV3 embeddings . Figure 3 shows the same vectors, but scalar quantized with that statically set confidence interval. Figure 2: A sampling of CohereV3 dimension values. Figure 3: CohereV3 dimension values quantized into int7 values. What are those spikes at the end? Well, that is the result of truncating extreme values during the quantization process. But, we are leaving some nice optimizations on the floor. What if we could tweak the confidence interval to shift the buckets, allowing for more important dimensional values to have higher fidelity? To optimize, Lucene does the following: Sample around 1,000 vectors from the data set and calculate their true nearest 10 neighbors. 
Calculate a set of candidate upper and lower quantiles. The set is calculated by using two different confidence intervals: 1 - 1/(dimensions+1) and 1 - (dimensions/10)/(dimensions + 1) . These intervals are on the opposite extremes. For example, vectors with 1024 dimensions would search quantile candidates between confidence intervals 0.99902 and 0.90009 . Do a grid search over a subset of the quantiles that exist between these two confidence intervals. The grid search finds the quantiles that maximize the coefficient of determination of the quantization score errors vs. the true 10 nearest neighbors calculated earlier. Figure 3: Lucene searching the confidence interval space and testing various buckets for int4 quantization. Figure 4: The best int4 quantization buckets found for this CohereV3 sample set. For a more complete explanation of the optimization process and the mathematics behind this optimization, see optimizing the truncation interval . Speed vs. size for quantization As I mentioned before, int4 gives you an interesting tradeoff between performance and space. To drive this point home, here are some memory requirements for CohereV3 500k vectors. Figure 5: Memory requirements for CohereV3 500k vectors. Of course, we see the typical 4x reduction with regular scalar quantization, but then an additional 2x reduction with int4 , moving the required memory from 2GB to less than 300MB . Keep in mind, this is with compression enabled. Decompressing and compressing bytes does have an overhead at search time. For every byte vector, we must decompress it before doing the int4 comparisons. Consequently, when this is introduced in Elasticsearch, we want to give users the ability to choose to compress or not. For some users, the cheaper memory requirements are just too good to pass up; for others, their focus is speed. Int4 gives the opportunity to tune your settings to fit your use-case. Figure 6: HNSW graph search speed comparison for CohereV3 500k vectors. Speed part 2: more SIMD in int4 Figure 6 is a bit disappointing in terms of the speed of compressed scalar quantization. We expect performance benefits from loading fewer bytes to the JVM heap. However, this is being outweighed by the cost of decompressing them. This caused us to dig deeper. The reason for the performance impact was naively decompressing the bytes separately from the dot-product comparison. This is a mistake. We can do better. Consequently, we can use SIMD to decompress the bytes and compare them in the same function. This is a bit more complicated than the previous SIMD example, but it is possible. Here is a simplified version of what this looks like for ARM. As expected, this brings a significant improvement on ARM, effectively removing all performance discrepancies on ARM between compressed and uncompressed scalar quantization. Figure 7: HNSW graph search comparison with int4 quantized vectors over 500k CohereV3 vectors. This is on ARM architecture. The end? Over the last two large and technical blog posts, we've gone over the math and intuition around the optimizations and what they bring to Lucene. It's been a long ride, and we are nowhere near done. Keep an eye out for these capabilities in a future Elasticsearch version! 
Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Jump to Introduction to Int4 quantization in Lucene How does Storing and scoring the quantized vectors Calculating the quantization error correction Finding the optimal bucketing for int4 quantization Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Understanding Int4 scalar quantization in Lucene - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/int4-scalar-quantization-in-lucene", + "meta_description": "This blog explains how int4 quantization works in Lucene, how it lines up, and the benefits of using int4 quantization." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog NEST lifetime extended & Elastic.Clients.Elasticsearch (v8) Roadmap Announcing the extension of the NEST (v7) lifetime and providing a high level overview of the Elastic.Clients.Elasticsearch (v8) roadmap. Developer Experience .NET FB By: Florian Bernd On October 15, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Extension of the NEST lifetime At Elastic, we typically offer support for our Elasticsearch products for two entire major series. So when version 9 is released, everything related to version 7 hits end-of-life. 
But sometimes there's a reason to deviate from this policy, and we've chosen to do just that for our version 7 client for .NET: NEST. We've been watching the downloads, and listening to user feedback, and have seen and heard that it's going to take a little longer for many of you to migrate onto the version 8 .NET client. For that reason, we'll be extending the support period of NEST right up to the end of 2025. This means that you're absolutely fine to keep using NEST for a while longer, and you don't need to rush your migration to the version 8 client. We'll also be working on our migration docs and code examples to help you as much as we can with the transition. Although we want to keep the lights on for longer, we are still working towards the end of life for NEST. So we won't be adding any new features to the old client, nor will we add explicit support for any new Elasticsearch APIs, or upgrade NEST to work with new versions of the .NET framework or Elasticsearch ( sticking with the NEST client prevents migration to Elasticsearch 9.x ). But we will monitor the library for bug fixes and security patches, and we'll make sure that NEST keeps working the way it always has. Let's dive into the roadmap in a bit more depth... Elastic.Clients.Elasticsearch (v8) roadmap In the meantime, we are working hard to not only improve the actual client, but also the documentation. A few of the most important points on our roadmap in this regard are as follows: Improvement of the getting started guide Providing as many code examples as possible Publishing up-to-date auto-generated reference documentation for each version In addition to the planned improvements to the documentation, several usability optimizations are also on the agenda. These include, for example: Re-implementation of the && , || and ! operators for the logical combination of queries (also for descriptor syntax) Simpler sorting Extended support for the default index Conclusion In conclusion, Elastic's extension of NEST support until the end of 2025 provides users with ample time to transition smoothly to the new client, ensuring continued stability and assistance throughout the process. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. 
MC By: Mike Cote Jump to Extension of the NEST lifetime Elastic.Clients.Elasticsearch (v8) roadmap Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "NEST lifetime extended & Elastic.Clients.Elasticsearch (v8) Roadmap - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/nest-lifetime-extension-v8-roadmap", + "meta_description": "Covering the extension of the Elastic NEST (v7) lifetime and providing a high level overview of the Elastic.Clients.Elasticsearch (v8) roadmap." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Scalar quantization optimized for vector databases Optimizing scalar quantization for the vector database use case allows us to achieve significantly better performance for the same retrieval quality at high compression ratios. ML Research TV BT By: Thomas Veasey and Benjamin Trent On April 25, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction We talked before about the scalar quantization work we've been doing in Lucene. In this two part blog we will dig a little deeper into how we're optimizing scalar quantization for the vector database use case. This has allowed us to unlock very nice performance gains for int4 quantization in Lucene, as we'll discuss in the second part. In the first part we're going to dive into the details of how we did this. Feel free to jump ahead to learn about what this will actually mean for you. Otherwise, buckle your seatbelts, because Kansas is going bye-bye! Scalar quantization recap First of all, a quick refresher on scalar quantization. Scalar quantization was introduced as a mechanism for accelerating inference. To date it has been mainly studied in that setting. For that reason, the main considerations were the accuracy of the model output and the performance gains that come from reduced memory pressure and accelerated matrix multiplication (GEMM) operations. Vector retrieval has some slightly different characteristics which we can exploit to improve quantization accuracy for a given compression ratio. The basic idea of scalar quantization is to truncate and scale each floating point component of a vector (or tensor) so it can be represented by an integer. 
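A rough numpy sketch of that idea follows (our own illustration, not code from the blog). The interval endpoints here are arbitrary, `bits` is used for the bit count to avoid clashing with the interval's upper bound, and the shift by the lower bound is written out explicitly so components land in [0, 2^bits − 1].

```python
import numpy as np

def scalar_quantize(v: np.ndarray, a: float, b: float, bits: int = 4) -> np.ndarray:
    """Clamp each component into [a, b], then scale and round it to an
    integer in [0, 2**bits - 1] (only 16 distinct values for int4)."""
    scaled = (np.clip(v, a, b) - a) * (2 ** bits - 1) / (b - a)
    return np.round(scaled).astype(np.int8)

v = np.array([-0.7, -0.1, 0.0, 0.3, 1.2])
print(scalar_quantize(v, a=-0.5, b=1.0))   # [ 0  4  5  8 15]
```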
Formally, if you use b b b bits to represent a vector component x x x as an integer in the interval [ 0 , 2 b − 1 ] [0,2^b-1] [ 0 , 2 b − 1 ] you transform it as follows x ↦ ⌊ ( 2 b − 1 ) clamp ( x , a , b ) b − a ⌉ x \\mapsto \\left\\lfloor \\frac{(2^b-1)\\text{clamp}(x,a,b)}{b-a} \\right\\rceil x ↦ ⌊ b − a ( 2 b − 1 ) clamp ( x , a , b ) ​ ⌉ where clamp ( ⋅ , a , b ) \\text{clamp}(\\cdot,a,b) clamp ( ⋅ , a , b ) denotes min ⁡ ( max ⁡ ( ⋅ , a ) , b ) \\min(\\max(\\cdot, a),b) min ( max ( ⋅ , a ) , b ) and ⌊ ⋅ ⌉ \\lfloor\\cdot\\rceil ⌊ ⋅ ⌉ denotes round to the nearest integer. People typically choose a a a and b b b based on percentiles of the distribution. We will discuss a better approach later. If you use int4 or 4 bit quantization then each component is some integer in the interval [0,15], that is each component takes one of only 16 distinct values! Novelties introduced to scalar quantization In this part, we are going to describe in detail two specific novelties we have introduced: A first order correction to the dot product when it is computed using integer values. An optimization procedure for the free parameters used to compute the quantized vector. Just to pause on point 1 for a second. What we will show is that we can continue to compute the dot product directly using integer arithmetic . At the same time, we can compute an additive correction that allows us to improve its accuracy. So we can improve retrieval quality without losing the opportunity of using extremely optimized implementations of the integer dot product. This translates to a clear cut win in terms of retrieval quality as a function of performance. 1. Error correcting the scalar dot product Most embedding models use either cosine or dot product similarity. The good thing is if you normalize your vectors then cosine (and even Euclidean) is equivalent to dot product (up to order). Therefore, reducing the quantization error in the dot product covers the great majority of use cases. This will be our focus. The vector database use case is as follows. There is a large collection of floating point vectors from some black box embedding model. We want a quantization scheme which achieves the best possible recall when retrieving with query vectors which come from a similar distribution. Recall is the proportion of true nearest neighbors, those with maximum dot product computed with float vectors, which we retrieve computing similarity with quantized vectors. We assume we've been given the truncation interval [ a , b ] [a,b] [ a , b ] for now. All vector components are snapped into this interval in order to compute their quantized values exactly as we discussed before. In the next part we will discuss how to optimize this interval for the vector database use case. To movitate the following analysis, consider that for any given document vector, if we knew the query vector ahead of time , we could compute the quantization error in the dot product exactly and simply subtract it. Clearly, this is not realistic since, apart from anything else, the query vector is not fixed. However, maybe we can achieve a real improvement by assuming that the query vector is drawn from a distribution that is centered around the document vector. This is plausible since queries that match a document are likely to be in the vicinity of its embedding. In the following, we formalize this intuition and actually derive a correction term. We first study the error that scalar quantization introduces into the dot product. 
We then devise a correction based on the expected first order error in the vicinity of each indexed vector. To do this requires us to store one extra float per vector. Since realistic vector dimensions are large this results in minimal overhead. We will call an arbitrary vector in our database x \\mathbf{x} x and an arbitrary query vector y \\mathbf{y} y . Then x t y = ( a + x − a ) t ( a + x − a ) = a t a + a t ( y − a ) + a t ( x − a ) + ( y − a ) t ( x − a ) \\begin{align*} \\mathbf{x}^t\\mathbf{y} &= (\\mathbf{a}+\\mathbf{x}-\\mathbf{a})^t(\\mathbf{a}+\\mathbf{x} - \\mathbf{a}) \\\\ &= \\mathbf{a}^t\\mathbf{a}+\\mathbf{a}^t(\\mathbf{y}-\\mathbf{a})+\\mathbf{a}^t(\\mathbf{x}-\\mathbf{a})+(\\mathbf{y}-\\mathbf{a})^t(\\mathbf{x}-\\mathbf{a}) \\end{align*} x t y ​ = ( a + x − a ) t ( a + x − a ) = a t a + a t ( y − a ) + a t ( x − a ) + ( y − a ) t ( x − a ) ​ On the right hand side, the first term is a constant and the second two terms are a function of a single vector and can be precomputed. For the one involving the document, this is an extra float that can be stored with its vector representation. So far all our calculations can use floating point arithmetic. Everything interesting however is happening in the last term, which depends on the interaction between the query and the document. We just need one more bit more notation: define α = b − a 2 b − 1 \\alpha = \\frac{b-a}{2^b-1} α = 2 b − 1 b − a ​ and ⋆ q = clamp ( ⋆ , a , b ) α \\star_q=\\frac{\\text{clamp}(\\star, \\mathbf{a}, \\mathbf{b})}{\\alpha} ⋆ q ​ = α clamp ( ⋆ , a , b ) ​ where we understand that the clamp function broadcasts over vector components. Let's rewrite the last term, still keeping everything in floating point, using a similar trick: ( y − a ) t ( x − a ) = ( α y q + y − a − α y q ) t ( α x q + x − a − α x q ) = α 2 y q t x q − α y q t ϵ x + α x q t ϵ y + O ( ∥ ϵ ∥ 2 ) \\begin{align*} (\\mathbf{y}-\\mathbf{a})^t(\\mathbf{x}-\\mathbf{a}) &= (\\alpha \\mathbf{y}_q + \\mathbf{y} -\\mathbf{a} - \\alpha\\mathbf{y}_q)^t(\\alpha \\mathbf{x}_q + \\mathbf{x} -\\mathbf{a} - \\alpha\\mathbf{x}_q) \\\\ &= \\alpha^2\\mathbf{y}_q^t\\mathbf{x}_q - \\alpha\\mathbf{y}_q^t\\mathbf{\\epsilon}_x + \\alpha\\mathbf{x}_q^t\\mathbf{\\epsilon}_y + \\text{O}(\\|\\mathbf{\\epsilon}\\|^2) \\end{align*} ( y − a ) t ( x − a ) ​ = ( α y q ​ + y − a − α y q ​ ) t ( α x q ​ + x − a − α x q ​ ) = α 2 y q t ​ x q ​ − α y q t ​ ϵ x ​ + α x q t ​ ϵ y ​ + O ( ∥ ϵ ∥ 2 ) ​ Here, ϵ ⋆ = ⋆ − a − ⋆ q \\mathbf{\\epsilon}_{\\star} = \\star - \\mathbf{a}- \\star_q ϵ ⋆ ​ = ⋆ − a − ⋆ q ​ represents the quantization error. The first term is just the scaled quantized vector dot product and can be computed exactly. The last term is proportional in magnitude to the square of the quantization error and we hope this will be somewhat small compared to the overall dot product. That leaves us with the terms that are linear in the quantization error. We can compute the quantization error vectors in the query and document, ϵ y \\mathbf{\\epsilon}_y ϵ y ​ and ϵ x \\mathbf{\\epsilon}_x ϵ x ​ respectively, ahead of time. However, we don't actually know the value of x \\mathbf{x} x we will be comparing to a given y \\mathbf{y} y and vice versa. So we don't know how to calculate the error in the dot product quantities exactly. In such cases it is natural to try and minimize the error in expectation (in some sense we discuss below). 
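Before turning to that expectation argument, here is a quick numerical check of the decomposition above (numpy, synthetic data; the notation is adapted slightly and this is not code from the blog). It verifies the exact identity and shows that the first-order estimate only misses the quadratic error term.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, bits = 128, 4
x = rng.normal(size=dim)                     # a "document" vector
y = x + 0.1 * rng.normal(size=dim)           # a "query" assumed to lie near it

a, b = np.quantile(np.concatenate([x, y]), [0.05, 0.95])
alpha = (b - a) / (2 ** bits - 1)
a_vec = np.full(dim, a)

quantize = lambda v: np.round((np.clip(v, a, b) - a) / alpha)
xq, yq = quantize(x), quantize(y)
eps_x = x - a_vec - alpha * xq               # quantization error vectors
eps_y = y - a_vec - alpha * yq

true_dot = x @ y
# Exact identity: x.y = a.a + a.(y-a) + a.(x-a) + (y-a).(x-a)
decomposed = (a_vec @ a_vec + a_vec @ (y - a_vec) + a_vec @ (x - a_vec)
              + (y - a_vec) @ (x - a_vec))
# First-order estimate: the last term replaced by the scaled integer dot product
# plus the two linear error terms; only the quadratic eps_y.eps_x term is dropped.
first_order = (a_vec @ a_vec + a_vec @ (y - a_vec) + a_vec @ (x - a_vec)
               + alpha ** 2 * (yq @ xq) + alpha * (yq @ eps_x) + alpha * (xq @ eps_y))

print(f"float dot product   : {true_dot:.6f}")
print(f"exact decomposition : {decomposed:.6f}")    # identical up to float rounding
print(f"first-order estimate: {first_order:.6f}")   # differs only by eps_y . eps_x
# In practice only the document-side term alpha * xq.eps_x is kept (precomputed and
# stored with the vector); the query-side analogue is dropped as a per-query constant.
```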
If x \\mathbf{x} x and y \\mathbf{y} y are drawn at random from our corpus they are random variables and so too are x q \\mathbf{x}_q x q ​ and y q \\mathbf{y}_q y q ​ . For any distribution we average over for x q \\mathbf{x}_q x q ​ then E x [ α x q t ϵ y ] = α E x [ x q ] t ϵ y \\mathbb{E}_x[\\alpha\\mathbf{x}_q^t \\mathbf{\\epsilon}_y]=\\alpha\\mathbb{E}_x[\\mathbf{x}_q]^t \\mathbf{\\epsilon}_y E x ​ [ α x q t ​ ϵ y ​ ] = α E x ​ [ x q ​ ] t ϵ y ​ , since α \\alpha α and ϵ y \\mathbf{\\epsilon}_y ϵ y ​ are fixed for a query. This is a constant additive term to the score of each document, which means it does not change their order. This is important as it will not change the quality of retrieval and so we can drop it altogether. What about the α y q t ϵ x \\alpha\\mathbf{y}_q^t \\mathbf{\\epsilon}_x α y q t ​ ϵ x ​ term? The naive thing is to assume that the y q t \\mathbf{y}_q^t y q t ​ is a random sample from our corpus. However, this is not the best assumption. In practice, we know that the queries which actually match a document will come from some region in the vicinity of its embedding as we illustrate in the figure below. Schematic query distribution that is expected to match a given document embedding (orange) vs all queries (blue plus orange). We can efficiently find nearest neighbors of each document x \\mathbf{x} x in the database once we have a proximity graph. However, we can do something even simpler and assume that for relevant queries E y [ y q ] ≈ x q \\mathbb{E}_y[\\mathbf{y}_q] \\approx \\mathbf{x}_q E y ​ [ y q ​ ] ≈ x q ​ . This yields a scalar correction α x q t ϵ x \\alpha\\mathbf{x}_q^t\\mathbf{\\epsilon}_x α x q t ​ ϵ x ​ which only depends on the document embedding and can be precomputed and added on to the a t ( x − a ) \\mathbf{a}^t(\\mathbf{x}-\\mathbf{a}) a t ( x − a ) term and stored with the vector. We show later how this affects the quality of retrieval. The anisotropic correction is inspired by this approach for reducing product quantization errors. Finally, we note that the main obstacle to improving this correction is we don't have useful estimates of the joint distribution of the query and document embedding quantization errors. One approach that might enable this, at the cost of some extra memory and compute, is to use low rank approximations of these errors. We plan to study schemes like this since we believe they could unlock accurate general purpose 2 or even 1 bit scalar quantization. 2. Optimizing the truncation interval So far we worked with some specified interval [ a , b ] [a,b] [ a , b ] but didn't discuss how to compute it. In the context of quantization for model inference people tend to use quantile points of the component distribution or their minimum and maximum. Here, we discuss a new method for computing this based on the idea that preserving the order of the dot product values is better suited to the vector database use case. First off, why is this not equivalent to minimizing the magnitude of the quantization errors? Suppose for a query y \\mathbf{y} y the top-k matches are x i \\mathbf{x}_i x i ​ for i ∈ [ k ] i \\in [k] i ∈ [ k ] . Consider two possibilities, the quantization error is some constant c c c , or it is is normally distributed with mean 0 0 0 and standard deviation c 10 \\frac{c}{10} 10 c ​ . In the second case the expected error is roughly 10 times smaller than the first. However, the first effect is a constant shift, which preserves order and has no impact on recall. 
Meanwhile, if 1 k ∑ i = 1 k ∣ y t ( x i − x i + 1 ) ∣ < c 10 \\frac{1}{k}\\sum_{i=1}^k \\left|\\mathbf{y}^t(\\mathbf{x}_i-\\mathbf{x}_{i+1})\\right| < \\frac{c}{10} k 1 ​ ∑ i = 1 k ​ ∣ y t ( x i ​ − x i + 1 ​ ) ∣ < 10 c ​ it is very likely the random error will reorder matches and so affect the quality of retrieval. Let's use the previous example to better motivate our approach. The figure below shows the various quantities at play for a sample query y \\mathbf{y} y and two documents x i \\mathbf{x}_i x i ​ and x i + 1 \\mathbf{x}_{i+1} x i + 1 ​ . The area of each blue shaded rectangle is equal to one of the floating point dot products and the area of each red shaded rectangle is equal to one of the quantization errors . Specifically, the dot products are ∥ y ∥ ∥ P y x i ∥ \\|\\mathbf{y}\\|\\|P_y\\mathbf{x}_i\\| ∥ y ∥∥ P y ​ x i ​ ∥ and ∥ y ∥ ∥ P y x i + 1 ∥ \\|\\mathbf{y}\\|\\|P_y\\mathbf{x}_{i+1}\\| ∥ y ∥∥ P y ​ x i + 1 ​ ∥ , and the quantization errors are ∥ y ∥ ∥ P y ( x i − a − x i , q ) ∥ \\|\\mathbf{y}\\|\\|P_y(\\mathbf{x}_i-\\mathbf{a}-\\mathbf{x}_{i,q})\\| ∥ y ∥∥ P y ​ ( x i ​ − a − x i , q ​ ) ∥ and ∥ y ∥ ∥ P y ( x i + 1 − a − x i + 1 , q ) ∥ \\|\\mathbf{y}\\|\\|P_y(\\mathbf{x}_{i+1}-\\mathbf{a}-\\mathbf{x}_{i+1,q})\\| ∥ y ∥∥ P y ​ ( x i + 1 ​ − a − x i + 1 , q ​ ) ∥ where P y = y y t ∥ y ∥ 2 P_y=\\frac{\\mathbf{y}\\mathbf{y}^t}{\\|\\mathbf{y}\\|^2} P y ​ = ∥ y ∥ 2 y y t ​ is the projection onto the query vector. In this example the errors preserve the document order. This follows because the right blue rectangle (representing the exact dot product) and union of right blue and red rectangles (representing the quantized dot product) are both larger than the left ones. It is visually clear the more similar the left and right red rectangles the less likely it is the documents will be reordered. Conversely, the more similar the left and right blue rectangles the more likely it is that quantization will reorder them. Schematic of the dot product values and quantization error values for a query and two near neighbor documents. In this case, the document order is preserved by quantization. One way to think of the quantized dot product is that it models the floating point dot product. From the previous discussion we want to minimize the variance of this model's residual error, which should be as similar as possible for each document. However, there is a second consideration: the density of the floating point dot product values. If these values are close together it is much more likely that quantization will reorder them. It is quite possible for this density to change from one part of the embedding space to another and higher density regions are more sensitive to quantization errors. A natural measure which captures both the quantization error variance and the density of the dot product values is the coefficient of determination of the quantized dot product with respect to the floating point dot product. A good interval [ a , b ] [a,b] [ a , b ] will maximize this in expectation over a representative query distribution. We need a reasonable estimator for this quantity for the database as a whole that we can compute efficiently. We found the following recipe is both fast and yields an excellent choice for parameters a a a and b b b : Sample 1000 random document vectors from the index. For each sample vector find its 10 nearest neighbors. 
Maximize the average coefficient of determination of the quantized dot product between the sampled vectors and their nearest neighbors with respect to the interval [ a , b ] [a,b] [ a , b ] . This optimization problem can be solved by any black box solver. For example, we used a variant of the adaptive LIPO algorithm in the following. Furthermore, we found that our optimization objective was well behaved (low Lipschitz constant ) for all data sets we tested. Proof of principle for int4 quantization Before deciding to implement this scheme for real we studied how it behaves with int4 quantization. Below we show results for two data sets that are fairly typical of passage embedding model distributions on real data. To generate these we use e5-small-v2 and Cohere's multilingual-22-12 models. These are both fairly state-of-the-art text embedding models. However, they have rather different characteristics. The e5-small-v2 model uses cosine similarity, its vectors have 384 dimensions and very low angular variation. The multilingual-22-12 model uses dot product similarity, its vectors have 768 dimensions and it encodes information in their length. They pose rather different challenges for our quantization scheme and improving both gives much more confidence it works generally. For e5-small-v2 we embedded around 500K passages and 10K queries sampled from the MS MARCO passage data set. For multilingual-22-12 we used around 1M passages and 1K distinct passages for queries sampled from the English Wikipedia data set. First of all, it is interesting to understand the accuracy of the int4 dot product values. The figure below shows the int4 dot product values compared to their float values for a random sample of 100 documents and their 10 nearest neighbors taken from the set we use to compute the optimal truncation interval for e5-small-v2. The orange “best fit” line is y = x − 0.017 y=x-0.017 y = x − 0.017 . Note that this underlines the fact that this procedure can pick a biased estimator if it reduces the residual variance: in this case the quantized dot product is systematically underestimating the true dot product. However, as we discussed before, any constant shift in the dot product is irrelevant for ranking. For the full 1k samples we achieve an R 2 R^2 R 2 of a little less than 0.995, i.e. the int4 quantized dot product is a very good model of the float dot product! Comparison of int4 dot product values to the corresponding float values for a random sample of 100 documents and their 10 nearest neighbors. While this is reassuring, what we really care about is the impact on retrieval quality. Since one can implement brute force nearest neighbor search in a few lines, it allows us to quickly test the impact of our design choices on retrieval. In particular, we are interested in understanding the expected proportion of true nearest neighbors we retrieve when computing similarities using int4 quantized vectors. Below we show the results for an ablation study of the dot product correction and interval optimization. In general, one can boost the accuracy of any quantization scheme by gathering more than the requested vector count and reranking them using their floating point values. However, this comes with a cost: it is significantly more expensive to search graphs for more matches and the floating point vectors must be loaded from disk or it defeats the purpose of compressing them. 
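Stepping back for a moment, the three-step recipe above can be sketched in a few lines of numpy. This is our own simplification, not the actual implementation: a tiny synthetic corpus, 100 rather than 1000 sampled vectors, and a plain grid search over symmetric intervals instead of the adaptive LIPO solver.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(2000, 64)).astype(np.float32)
bits, n_samples, k = 4, 100, 10
samples = docs[rng.choice(len(docs), n_samples, replace=False)]

def mean_r2(a: float, b: float) -> float:
    """Average R^2 of the quantized dot-product estimate vs. the float dot product,
    over the sampled vectors and their k nearest neighbours."""
    alpha = (b - a) / (2 ** bits - 1)
    a_vec = np.full(docs.shape[1], a)
    quantize = lambda v: np.round((np.clip(v, a, b) - a) / alpha)
    scores = []
    for s in samples:
        float_dots = docs @ s
        nn = np.argsort(-float_dots)[1 : k + 1]            # 10 nearest neighbours (skip itself)
        sq, nq = quantize(s), quantize(docs[nn])
        est = (a_vec @ a_vec + a_vec @ (s - a_vec)         # query-side terms
               + (docs[nn] - a_vec) @ a_vec                # per-document term
               + alpha ** 2 * (nq @ sq))                   # scaled integer dot product
        scores.append(np.corrcoef(float_dots[nn], est)[0, 1] ** 2)
    return float(np.mean(scores))

# Grid search over symmetric candidate intervals instead of the LIPO solver.
candidates = np.linspace(1.0, float(np.abs(docs).max()), 16)
best_r2, best_c = max((mean_r2(-c, c), c) for c in candidates)
print(f"best interval [-{best_c:.2f}, {best_c:.2f}] with mean R^2 = {best_r2:.4f}")
```

Returning to the cost of reranking discussed just above: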
One way of comparing alternatives is therefore to understand how many vectors must be reranked to achieve the same recall. The lower this number is, the better. In the figures below we show average recall curves as a function of the number of candidates we rerank for different combinations of the two improvements we have discussed: “Baseline” sets [ a , b ] [a,b] [ a , b ] to the 1 − 1 d + 1 1-\\frac{1}{d+1} 1 − d + 1 1 ​ central confidence and applies no correction to the dot product, “No correction” optimizes the interval [ a , b ] [a,b] [ a , b ] by maximizing R 2 R^2 R 2 but applies no correction to the dot product, “No optimization” sets [ a , b ] [a,b] [ a , b ] to the 1 − 1 d + 1 1-\\frac{1}{d+1} 1 − d + 1 1 ​ central confidence but applies the linear correction to the dot product, and “Our scheme” optimizes the interval [ a , b ] [a,b] [ a , b ] by maximizing R 2 R^2 R 2 and applies the linear correction to the dot product. Note that we used d d d to denote the vector dimension. Average recall@10 curves for e5-small-v2 embeddings as a function of the number of candidates reranked for different combinations of the two improvements we discussed. Average recall@10 curves for multilingual-22-12 embeddings as a function of the number of candidates reranked for different combinations of the two improvements we discussed. For e5-small-v2 embeddings we roughly halve the number of vectors we need to rerank to achieve 95% recall compared to the baseline. For multilingual-22-12 embeddings we reduce it by closer to a factor of three. Interestingly, the impact of the two improvements is different for the different data sets. For e5-small-v2 embeddings applying the linear correction has a significantly larger effect than optimizing the interval [ a , b ] [a,b] [ a , b ] whilst the converse is true for multilingual-22-12 embeddings. Another important observation is the gains are more significant if one wants to achieve very high recall: to achieve close to 99% recall one has to rerank at least 5 times as many vectors for both data sets in the baseline versus our improved quantization scheme. Conclusion We have discussed the theoretical and empirical motivation behind two novelties we introduced to achieve high quality int4 quantization, as well as some preliminary results that indicate it'll be an effective general purpose scheme for in memory vector storage for retrieval. This is all well and good, but how well does it work in a real vector database implementation? In the companion blog we discuss our implementation in Lucene and compare it to other storage options such as floating point and int7, which Lucene also provides. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. 
TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Introduction Scalar quantization recap Novelties introduced to scalar quantization 1. Error correcting the scalar dot product 2. Optimizing the truncation interval Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Scalar quantization optimized for vector databases - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vector-db-optimized-scalar-quantization", + "meta_description": "Optimizing scalar quantization for the vector database use case allows us to achieve significantly better performance for the same retrieval quality at high compression ratios." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Bringing maximum-inner-product into Lucene Explore how we brought maximum-inner-product into Lucene and the investigations undertaken to ensure its support. Lucene BT By: Benjamin Trent On September 1, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Currently Lucene restricts dot_product to be only used over normalized vectors. Normalization forces all vector magnitudes to equal one. While for many cases this is acceptable, it can cause relevancy issues for certain data sets. A prime example are embeddings built by Cohere . Their vectors use magnitudes to provide more relevant information. So, why not allow non-normalized vectors in dot-product and thus enable maximum-inner-product? What's the big deal? Negative values and Lucene optimizations Lucene requires non-negative scores, so that matching one more clause in a disjunctive query can only make the score greater, not lower. This is actually important for dynamic pruning optimizations such as block-max WAND , whose efficiency is largely defeated if some clauses may produce negative scores. How does this requirement affect non-normalized vectors? In the normalized case, all vectors are on a unit sphere. This allows handling negative scores to be simple scaling. 
Figure 1: Two opposite, two dimensional vectors in a 2d unit sphere (e.g. a unit circle). When calculating the dot-product here, the worst it can be is -1 = [1, 0] * [-1, 0]. Lucene accounts for this by adding 1 to the result. With vectors retaining their magnitude, the range of possible values is unknown. Figure 2: When calculating the dot-product for these vectors [2, 2] \\* [-5, -5] = -20 To allow Lucene to utilize blockMax WAND with non-normalized vectors, we must scale the scores. This is a fairly simple solution. Lucene will scale non-normalize vectors with a simple piecewise function: Now all negative scores are between 0-1, and all positives are scaled above 1. This still ensures that higher values mean better matches and removes negative scores. Simple enough, but this is not the final hurdle. The triangle problem Maximum-inner-product doesn't follow the same rules as of simple euclidean spaces . The simple assumed knowledge of the triangle inequality is abandoned. Unintuitively, a vector is no longer nearest to itself. This can be troubling. Lucene’s underlying index structure for vectors is Hierarchical Navigable Small World (HNSW). This being a graph based algorithm, it might rely on euclidean space assumptions. Or would exploring the graph be too slow in non-euclidean space? Some research has indicated that a transformation into euclidean space is required for fast search . Others have gone through the trouble of updating their vector storage enforcing transformations into euclidean space. This caused us to pause and dig deep into some data. The key question is this: does HNSW provide good recall and latency with maximum-inner-product search? While the original HNSW paper and other published research indicate that it does, we needed to do our due diligence. Experiments and results: Maximum-inner-product in Lucene The experiments we ran were simple. All of the experiments are over real data sets or slightly modified real data sets. This is vital for benchmarking as modern neural networks create vectors that adhere to specific characteristics ( see discussion in section 7.8 of this paper ). We measured latency (in milliseconds) vs. recall over non-normalized vectors. Comparing the numbers with the same measurements but with a euclidean space transformation. In each case, the vectors were indexed into Lucene’s HNSW implementation and we measured for 1000 iterations of queries. Three individual cases were considered for each dataset: data inserted ordered by magnitude (lesser to greater), data inserted in a random order, and data inserted in reverse order (greater to lesser). Here are some results from real datasets from Cohere: Figure 3: Here are results for the Cohere’s Multilingual model embedding wikipedia articles. Available on HuggingFace . The first 100k documents were indexed and tested. Figure 4: This is a mixture of Cohere’s English and Japanese embeddings over wikipedia. Both datasets are available on HuggingFace. We also tested against some synthetic datasets to ensure our rigor. We created a data set with e5-small-v2 and scaled the vector's magnitudes by different statistical distributions. For brevity, I will only show two distributions. Figure 5: Pareto distribution of magnitudes. A pareto distribution has a “fat tail” meaning there is a portion of the distribution with a much larger magnitude than others. Figure 6: Gamma distribution of magnitudes. This distribution can have high variance and makes it unique in our experiments. 
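As an aside before the results: the piecewise scaling referred to above is not spelled out in this extract, but it behaves roughly like the sketch below (shown in Python as an illustration of Lucene's maximum-inner-product score scaling, not a quote of its source). Negative dot products land in (0, 1), non-negative ones land at 1 or above, and the ordering of scores is preserved.

```python
def scale_max_inner_product(dot: float) -> float:
    """Map a raw (possibly negative) dot product to a non-negative score."""
    if dot < 0:
        return 1.0 / (1.0 + (-dot))   # negatives squeezed into (0, 1)
    return dot + 1.0                  # non-negatives shifted to [1, inf)

# Ordering is preserved: a larger dot product always yields a larger score.
print(scale_max_inner_product(-20.0))   # 1/21 ~= 0.048, the [2, 2] . [-5, -5] example
print(scale_max_inner_product(-1.0))    # 0.5
print(scale_max_inner_product(3.0))     # 4.0
```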
In all our experiments, the only time where the transformation seemed warranted was the synthetic dataset created with the gamma distribution. Even then, the vectors must be inserted in reverse order, largest magnitudes first, to justify the transformation. These are exceptional cases. If you want to read about all the experiments, and about all the mistakes and improvements along the way, here is the Lucene Github issue with all the details (and mistakes along the way). Here’s one for open research and development! Conclusion This has been quite a journey requiring many investigations to make sure maximum-inner-product can be supported in Lucene. We believe the data speaks for itself. No significant transformations required or significant changes to Lucene. All this work will soon unlock maximum-inner-product support with Elasticsearch and allow models like the ones provided by Cohere to be first class citizens in the Elastic Stack. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Negative values and Lucene optimizations The triangle problem Experiments and results: Maximum-inner-product in Lucene Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Bringing maximum-inner-product into Lucene - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/lucene-bringing-maximum-inner-product-to-lucene", + "meta_description": "Explore how we brought maximum-inner-product into Lucene and the investigations undertaken to ensure its support." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch geospatial search with ES|QL Geospatial search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. Python How To CT By: Craig Taverner On August 12, 2024 Part of Series Elasticsearch geospatial search Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch has had powerful geospatial search and analytics capabilities for many years, but the API was quite different from what typical GIS users were used to. In the past year we've added the ES|QL query language , a piped query language as easy, or even easier, than SQL. It's particularly suited to the search, security, and observability use cases Elastic excels at. We're also adding support for geospatial search and analytics within ES|QL, making it far easier to use, especially for users coming from SQL or GIS communities. Elasticsearch 8.12 and 8.13 brought basic support for geospatial types to ES|QL. This was dramatically enhanced with the addition of geospatial search capabilities in 8.14. More importantly, this support was designed to conform closely to the Simple Feature Access standard from the Open Geospatial Consortium (OGC) used by other spatial databases like PostGIS, making it much easier to use for GIS experts familiar with these standards. In this blog, we'll show you how to use ES|QL to perform geospatial searches, and how it compares to the SQL and Query DSL equivalents. We'll also show you how to use ES|QL to perform spatial joins, and how to visualize the results in Kibana Maps. Note that all the features described here are in \"technical preview\", and we'd love to hear your feedback on how we can improve them. Searching for geospatial data Let's start with an example query: This performs a search for any city boundary polygons that intersect with a rectangular search polygon around the Sanya Phoenix International Airport (SYX). In a sample dataset of airports, cities and city boundaries, this search finds the intersecting polygon and returns the desired fields from the matching document: abbrev airport region city city_location SYX Sanya Phoenix Int'l 天涯区 Sanya POINT(109.5036 18.2533) That was easy! Now compare this to the classic Elasticsearch Query DSL for the same query: Both queries are reasonably clear in their intent, but the ES|QL query closely resembles SQL. The same query in PostGIS looks like this: Look back at the ES|QL example. So similar, right? We've found that existing users of the Elasticsearch API find ES|QL much easier to use. We now expect that existing SQL users, particularly Spatial SQL users, will find that ES|QL feels very familiar to what they are used to seeing. Why not SQL? What about Elasticsearch SQL? It has been around for a while and has some geospatial features. However, Elasticsearch SQL was written as a wrapper on top of the original Query API, which meant only queries that could be transpiled down to the original API were supported. ES|QL does not have this limitation. Being a completely new stack allows for many optimizations that were not possible in SQL. 
Our benchmarks show ES|QL is very often faster than the Query API , particularly with aggregations! Differences to SQL Clearly, from the previous example, ES|QL is somewhat similar to SQL, but there are some important differences. For example, ES|QL is a piped query language, starting with a source command like FROM and then chaining all subsequent commands together with the pipe | character. This makes it very easy to understand how each command receives a table of data and performs some action on that table, such as filtering with WHERE , adding columns with EVAL , or performing aggregations with STATS . Rather than starting with SELECT to define the final output columns, there can be one or more KEEP commands, with the last one specifying the final output results. This structure simplifies reasoning about the query. Focusing in on the WHERE command in the above example, we can see it looks quite similar to the PostGIS example: ES|QL PostGIS Aside from the difference in string quotation characters, the biggest difference is in how we type-cast the string to a spatial type. In PostGIS, we use the ::geometry suffix, while in ES|QL, we use the ::geo_shape suffix. This is because ES|QL runs within Elasticsearch, and the type-casting operator :: can be used to convert a string to any of the supported ES|QL types , in this case, a geo_shape . Additionally, the geo_shape and geo_point types in Elasticsearch imply the spatial coordinate system known as WGS84, more commonly referred to using the SRID number 4326. In PostGIS, this needs to be explicit, hence the use of the SRID=4326; prefix to the WKT string. If that prefix is removed, the SRID will be set to 0, which is more like the Elasticsearch types cartesian_point and cartesian_shape , which are not tied to any specific coordinate system. Both ES|QL and PostGIS provide type conversion function syntax as well: ES|QL PostGIS OGC functions Elasticsearch 8.14 introduces the following four OGC spatial search functions: ES|QL PostGIS Description ST_INTERSECTS ST_Intersects Returns true if two geometries intersect, and false otherwise. ST_DISJOINT ST_Disjoint Returns true if two geometries do not intersect, and false otherwise. The inverse of ST_INTERSECTS. ST_CONTAINS ST_Contains Returns true if one geometry contains another, and false otherwise. ST_WITHIN ST_Within Returns true if one geometry is within another, and false otherwise. The inverse of ST_CONTAINS. These function behave similarly to their PostGIS counterparts, and are used in the same way. For example, ST_INTERSECTS returns true if two geometries intersect and false otherwise. If you follow the documentation links in the above table, you might notice that all the ES|QL examples are within a WHERE clause after a FROM clause, while all the PostGIS examples are using literal geometries. In fact, both platforms support using the functions in any part of the query where they make sense. The first example in the PostGIS documentation for ST_INTERSECTS is: The ES|QL equivalent of this would be: Note how we did not specify the SRID in the PostGIS example. This is because in PostGIS when using the geometry type, all calculations are done on a planar coordinate system, and so if both geometries have the same SRID, it does not matter what the SRID is. In Elasticsearch, this is also true for most functions, however, there are exceptions where geo_shape and geo_point use spherical calculations, as we'll see in the next blog about spatial distance search. 
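The queries themselves are not reproduced in this extract, so here is a hedged sketch of the two patterns discussed above issued from Python. The index and field names follow the article; the bounding polygon around SYX is invented for illustration, and `client.esql.query` assumes an 8.x elasticsearch-py client.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

# Literal geometries only, in the spirit of the PostGIS ST_Intersects doc example.
row_query = """
ROW intersects = ST_INTERSECTS("POINT(0 0)"::geo_shape, "LINESTRING(2 0, 0 2)"::geo_shape)
"""

# A WHERE-clause search like the one that opens the article: city boundaries
# intersecting a rectangle around the SYX airport.
search_query = """
FROM airport_city_boundaries
| WHERE ST_INTERSECTS(city_boundary,
    "POLYGON((109.4 18.2, 109.6 18.2, 109.6 18.3, 109.4 18.3, 109.4 18.2))"::geo_shape)
| KEEP abbrev, airport, region, city, city_location
"""

for q in (row_query, search_query):
    print(client.esql.query(query=q)["values"])
```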
ES|QL versatility So, we've seen examples above for using spatial functions in WHERE clauses, and in ROW commands. Where else would they make sense? One very useful place is in the EVAL command. This command allows you to evaluate an expression and return the result. For example, let's determine if the centroids of all airports grouped by their country names are within a boundary outlining the country: The results are expected, the centroid of UK airports are within the UK boundary, and not within the Iceland boundary, and vice versa: centroid count in_uk in_iceland within_uk within_iceland POINT (-21.946634463965893 64.13187285885215) 1 false true false true POINT (-2.597342072712148 54.33551226578214) 17 true false true false POINT (0.04453958108176276 23.74658354606057) 873 false false false false In fact, these functions can be used in any part of the query where their signature makes sense. They all take two arguments, which are either a literal spatial object or a field of a spatial type, and they all return a boolean value. One important consideration is that the coordinate reference system (CRS) of the geometries must match, or an error will be returned. This means you cannot mix geo_shape and cartesian_shape types in the same function call. You can, however, mix geo_point and geo_shape types, as the geo_point type is a special case of the geo_shape type, and both share the same coordinate reference system. The documentation for each of the functions defined above lists the supported type combinations. Additionally, either argument can be a spatial literal or a field, in either order. You can even specify two fields, two literals, a field and a literal, or a literal and a field. The only requirement is that the types are compatible. For example, this query compares two fields in the same index: The query basically asks if the city location is within the city boundary, which should generally be true, but there are always exceptions: cardinality count in_city few 29 false many 740 true A far more interesting question would be whether the airport location is within the boundary of the city that the airport serves. However, the airport location resides in a different index than the one containing the city boundaries. This requires a method to effectively query and correlate data from these two separate indexes. Spatial joins ES|QL does not support JOIN commands, but you can achieve a special case of a join using the ENRICH command , which behaves similarly to a 'left join' in SQL. This command operates akin to a 'left join' in SQL, allowing you to enrich results from one index with data from another index based on a spatial relationship between the two datasets. 
For example, let's enrich the results from a table of airports with additional information about the city they serve by finding the city boundary that contains the airport location, and then perform some statistics on the results: This returns the top 5 regions with the most airports, along with the centroid of all the airports that have matching regions, and the range in length of the WKT representation of the city boundaries within those regions: centroid count min_wkt max_wkt region POINT (-32.56093470960719 32.598117914802714) 90 207 207 null POINT (-73.94515332765877 40.70366442203522) 9 438 438 City of New York POINT (-83.10398317873478 42.300230911932886) 9 473 473 Detroit POINT (-156.3020245861262 20.176383580081165) 5 307 803 Hawaii POINT (-73.88902732171118 45.57078813901171) 4 837 837 Montréal So, what really happened here? Where did the supposed JOIN occur? The crux of the query lies in the ENRICH command: This command instructs Elasticsearch to enrich the results retrieved from the airports index, and perform an intersects join between the city_location field of the original index, and the city_boundary field of the airport_city_boundaries index, which we used in a few examples earlier. But some of this information is not clearly visible in this query. What we do see is the name of an enrich policy city_boundaries , and the missing information is encapsulated within that policy definition. Here we can see that it will perform a geo_match query ( intersects is the default), the field to match against is city_boundary , and the enrich_fields are the fields we want to add to the original document. One of those fields, the region was actually used as the grouping key for the STATS command, something we could not have done without this 'left join' capability. For more information on enrich policies, see the enrich documentation . While reading those documents, you will notice that they describe using the enrich indexes for enriching data at index time, by configuring ingest pipelines. This is not required for ES|QL, as the ENRICH command works at query time. It is sufficient to prepare the enrich index with the necessary data and enrich policy, and then use the ENRICH command in your ES|QL queries. You may also notice that the most commonly found region was null . What could this imply? Recall that I likened this command to a 'left join' in SQL, meaning if no matching city boundary is found for an airport, the airport is still returned but with null values for the fields from the airport_city_boundaries index. It turns out there were 89 airports that found no matching city_boundary , and one airport with a match where the region field was null . This lead to a count of 90 airports with no region in the results. Another interesting detail is the need for the MV_EXPAND command. This is necessary because the ENRICH command may return multiple results for each input row, and MV_EXPAND helps to separate these results into multiple rows, one for each outcome. This also clarifies why \"Hawaii\" shows different min_wkt and max_wkt results: there were multiple regions with the same name but different boundaries. Kibana Maps Kibana has added support for Spatial ES|QL in the Maps application. This means that you can now use ES|QL to search for geospatial data in Elasticsearch, and visualize the results on a map. There is a new layer option in the add layers menu, called \"ES|QL\". Like all of the geospatial features described so far, this is in \"technical preview\". 
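A brief aside before the Kibana walkthrough: the city_boundaries enrich policy described above could be created from Python roughly as follows. The policy type, match field and source index come from the article; the exact enrich_fields list and the 8.x client API calls are assumptions on our part.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

client.enrich.put_policy(
    name="city_boundaries",
    geo_match={
        "indices": "airport_city_boundaries",
        "match_field": "city_boundary",          # geo_match uses intersects by default
        "enrich_fields": ["city", "region", "city_boundary"],
    },
)
client.enrich.execute_policy(name="city_boundaries")   # builds the internal enrich index
```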
Selecting this option allows you to add a layer to the map based on the results of an ES|QL query. For example, you could add a layer to the map that shows all the airports in the world. Or you could add a layer that shows the polygons from the airport_city_boundaries index, or even better, how about that complex ENRICH query above that generates statistics for how many airports are in each region? What's next You might have noticed in two of the examples above we squeezed in yet another spatial function ST_CENTROID_AGG . This is an aggregating function used in the STATS command, and the first of many spatial analytics features we plan to add to ES|QL. We'll blog about it when we've got more to show! Before that, we want to tell you more about a particularly exciting feature we've worked on: the ability to perform spatial distance searches, one of the most used spatial search features of Elasticsearch. Can you imagine what the syntax for distance searches might look like? Perhaps similar to an OGC function? Stay tuned for the next blog in this series to find out! Spoiler alert: Elasticsearch 8.15 has just been released, and spatial distance search with ES|QL is included! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Searching for geospatial data Why not SQL? Differences to SQL OGC functions ES|QL versatility Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Elasticsearch geospatial search with ES|QL - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-geospatial-search-part-one", + "meta_description": "Learn how to perform geospatial searches with Elasticsearch's ES|QL. Explore geospatial queries, spatial joins & visualize results in Kibana." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch heap size usage and JVM garbage collection Exploring Elasticsearch heap size usage and JVM garbage collection, including best practices and how to resolve issues when heap memory usage is too high or when JVM performance is not optimal. How To KB By: Kofi Bartlett On April 22, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The heap size is the amount of RAM allocated to the Java Virtual Machine of an Elasticsearch node. As of version 7.11, Elasticsearch by default automatically sets the JVM heap size based on a node’s roles and total memory. Using the default sizing is recommended for most production environments. However, if you want to manually set your JVM heap size, as a general rule you should set -Xms and -Xmx to the SAME value, which should be 50% of your total available RAM subject to a maximum of (approximately) 31GB. A higher heap size will give your node more memory for indexing and search operations. However, your node also requires memory for caching, so using 50% maintains a healthy balance between the two. For this same reason in production you should avoid using other memory intensive processes on the same node as Elasticsearch. Typically, the heap usage will follow a saw tooth pattern, oscillating between around 30 and 70% of the maximum heap being used. This is because the JVM steadily increases heap usage percentage until the garbage collection process frees up memory again. High heap usage occurs when the garbage collection process cannot keep up. An indicator of high heap usage is when the garbage collection is incapable of reducing the heap usage to around 30%. In the image above, you can see a normal sawtooth of JVM heap. You will also see that there are two types of garbage collections, young and old GC. In a healthy JVM, garbage collection should ideally meet the following conditions: Young GC is processed quickly (within 50 ms). Young GC is not frequently executed (about 10 seconds). Old GC is processed quickly (within 1 second). Old GC is not frequently executed (once per 10 minutes or more). How to resolve when heap memory usage is too high or when JVM performance is not optimal There can be a variety of reasons why heap memory usage can increase: Oversharding Please see the document on oversharding here . Large aggregation sizes In order to avoid large aggregation sizes, keep the number of aggregation buckets (size) in your queries to a minimum. You can use slow query logging (slow logs) and implement it on a specific index using the following. Queries that take a long time to return results are likely to be the resource-intensive ones. Excessive bulk index size If you are sending large requests, then this can be a cause of high heap consumption. Try reducing the size of the bulk index requests. 
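For reference, the per-index slow-log settings alluded to under "Large aggregation sizes" can be applied along these lines. The threshold values are arbitrary examples, "my-index" is a placeholder, and the call assumes the 8.x Python client.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

client.indices.put_settings(
    index="my-index",
    settings={
        "index.search.slowlog.threshold.query.warn": "10s",
        "index.search.slowlog.threshold.query.info": "5s",
        "index.search.slowlog.threshold.fetch.warn": "1s",
    },
)
```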
Mapping issues In particular, if you use “fielddata: true” then this can be a major user of your JVM heap. Heap size incorrectly set The heap size can be manually defined by: Setting the environment variable: Editing the jvm.options file in your Elasticsearch configuration directory: The environmental variable setting takes priority over the file setting. It is necessary to restart the node for the setting to be taken into account. JVM new ratio incorrectly set It is generally NOT necessary to set this, since Elasticsearch sets this value by default. This parameter defines the ratio of space available for “new generation” and “old generation” objects in the JVM. If you see that old GC is becoming very frequent, you can try specifically setting this value in jvm.options file in your Elasticsearch config directory. What are the best practices for managing heap size usage and JVM garbage collection in a large Elasticsearch cluster? The best practices for managing heap size usage and JVM garbage collection in a large Elasticsearch cluster are to ensure that the heap size is set to a maximum of 50% of the available RAM, and that the JVM garbage collection settings are optimized for the specific use case. It is important to monitor the heap size and garbage collection metrics to ensure that the cluster is running optimally. Specifically, it is important to monitor the JVM heap size, garbage collection time, and garbage collection pauses. Additionally, it is important to monitor the number of garbage collection cycles and the amount of time spent in garbage collection. By monitoring these metrics, it is possible to identify any potential issues with the heap size or garbage collection settings and take corrective action if necessary. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to How to resolve when heap memory usage is too high or when JVM performance is not optimal Oversharding Large aggregation sizes Excessive bulk index size Mapping issues Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch heap size usage and JVM garbage collection - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-heap-size-jvm-garbage-collection", + "meta_description": "Exploring Elasticsearch heap size usage and JVM garbage collection, including best practices and how to resolve issues when heap memory usage is too high or when JVM performance is not optimal." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Building multilingual RAG with Elastic and Mistral Building a multilingual RAG application using Elastic and Mixtral 8x22B model Integrations Generative AI Vector Database Python How To GL By: Gustavo Llermaly On August 2, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Mixtral 8x22B is the most performant open model, and one of its most powerful features is fluency in many languages; including English, Spanish, French, Italian, and German. Imagine a multinational company with support tickets and solutions in different languages and wants to take advantage of that knowledge across divisions. Currently, knowledge is limited to the language the agent speaks. Let's fix that! In this article, I’m going to show you how to test Mixtral’s language capabilities, by creating a multilingual RAG system. You can follow the notebook to reproduce this article's example here Steps Creating embeddings endpoint Creating mappings Indexing data Asking questions Creating embeddings endpoint Our support tickets for this example will come in English, Spanish, and German. The Mistral embeddings model is not multilingual, but we can generate multilingual embeddings using the e5 model, so we can index text on different languages and manage it as a single source, giving us a much richer context. To create e5 multilingual embeddings you can use Kibana: Or the _inference API: Creating Mappings For the mappings we will use semantic_text mapping type, which is one of my favorite features. It handles the process of chunking the data, generating embeddings, and querying embeddings for you! We call the text field super_body because with a single mapping type it will handle chunks and embeddings. Indexing data We will index a couple of support tickets with problems and solutions in two languages, and then ask a question about problems within many documents in a third. The following documents will be added to the index: 1. English Support Ticket: Calendar Sync Issue Support Ticket #EN1234 Subject : Calendar sync not working with Google Calendar Description : I'm having trouble syncing my project deadlines with Google Calendar. 
Whenever I try to sync, I get an error message saying \"Unable to connect to external calendar service.\" Resolution : The issue was resolved by following these steps: Go to Settings > Integrations Disconnect the Google Calendar integration Clear browser cache and cookies Reconnect the Google Calendar integration Authorize the app again in Google's security settings The sync should now work correctly. If problems persist, ensure that third-party cookies are enabled in your browser settings. 2. German Support Ticket: File Upload Problem Support-Ticket #DE5678 Betreff : Datei-Upload funktioniert nicht Beschreibung : Ich kann keine Dateien mehr in meine Projekte hochladen. Jedes Mal, wenn ich es versuche, bleibt der Ladebalken bei 99% stehen und dann erscheint eine Fehlermeldung. Lösung : Das Problem wurde durch folgende Schritte gelöst: Überprüfen Sie die Dateigröße. Die maximale Uploadgröße beträgt 100 MB. Deaktivieren Sie vorübergehend den Virenschutz oder die Firewall. Versuchen Sie, die Datei im Inkognito-Modus hochzuladen. Wenn das nicht funktioniert, leeren Sie den Browser-Cache und die Cookies. Als letzten Ausweg, versuchen Sie einen anderen Browser zu verwenden. In den meisten Fällen lag das Problem an zu großen Dateien oder an Interferenzen durch Sicherheitssoftware. Nach Anwendung dieser Schritte sollte der Upload funktionieren. 3. Marketing Campaign Ideas (noise) Q3 Marketing Campaign Ideas Social media contest: \"Share Your Productivity Hack\" Users share tips using our software, best entry wins a premium subscription. Webinar series: \"Mastering Project Management\" Invite industry experts to share insights using our tool. Email campaign: \"Unlock Hidden Features\" Series of emails highlighting lesser-known but powerful features. Partner with a productivity podcast for sponsored content. Create a \"Project Management Memes\" social media account for lighter, shareable content. 4. Mitarbeiter des Monats (noise) Mitarbeiter des Monats: Juli 2023 Wir freuen uns, bekannt zu geben, dass Sarah Schmidt zur Mitarbeiterin des Monats Juli gewählt wurde! Sarah hat außergewöhnliche Leistungen in folgenden Bereichen gezeigt: Kundenbetreuung: Sarah hat durchschnittlich 95% positive Bewertungen erhalten. Teamarbeit: Sie hat maßgeblich zur Verbesserung unseres internen Wissensmanagementsystems beigetragen. Innovation: Sarah hat eine neue Methode zur Priorisierung von Support-Tickets vorgeschlagen, die unsere Reaktionszeiten um 20% verbessert hat. Bitte gratulieren Sie Sarah zu dieser wohlverdienten Anerkennung! This is how a document will look like inside Elasticsearch: Asking questions Now, we are going to ask a question in Spanish: Hola, estoy teniendo problemas para ocupar su aplicación, estoy teniendo problemas para sincronizar mi calendario, y encima al intentar subir un archivo me da error. The expectation is retrieving documents #1 and #2, then sending them to the LLM as additional context, and finally, getting an answer in Spanish. Retrieving documents To retrieve the relevant documents, we can use this nice and short query that will run a search on the embeddings, and return the support tickets most relevant to the question. Notes about the parameters set: size: 2 Because we know we want the top 2 documents. excludes For clarity in the response. Documents are short so each one will be one chunk long. Answering the question Now we can call the Mistral completion API using the Python library to answer the question. The answer is in perfect Spanish and on point! 
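As a rough, non-authoritative sketch of that step (the exact method names vary between versions of the mistralai Python package, and the model identifier, context and question values are assumptions here), the call could look something like this:

```python
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ['MISTRAL_API_KEY'])

# 'context' would hold the text of the two support tickets retrieved from Elasticsearch,
# and 'question' the user question in Spanish; both are placeholders in this sketch.
context = '...retrieved ticket text...'
question = 'Hola, estoy teniendo problemas para sincronizar mi calendario y subir un archivo.'

response = client.chat.complete(
    model='open-mixtral-8x22b',  # assumed model identifier
    messages=[
        {'role': 'system', 'content': 'Answer in the language of the question, using only the provided context.'},
        {'role': 'user', 'content': 'Context: ' + context + ' Question: ' + question},
    ],
)
print(response.choices[0].message.content)
```

The answer it produced is shown below.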
Show answer Hola, me alegra que te hayas comunicado con nosotros. Parece que hay dos problemas distintos. En cuanto a la sincronización del calendario, puedes seguir estos pasos para resolver el problema: Ve a Configuración > Integraciones Desconecta la integración del Calendario de Google Borra la caché y las cookies del navegador Vuelve a conectar la integración del Calendario de Google Autoriza de nuevo la aplicación en la configuración de seguridad de Google Si sigues teniendo problemas, asegúrate de que las cookies de terceros están habilitadas en la configuración de tu navegador. En cuanto al problema de subir un archivo, hay varias cosas que puedes probar: Comprueba el tamaño del archivo. El tamaño máximo de carga es de 100 MB. Desactiva temporalmente el antivirus o el cortafuegos. Intenta cargar el archivo en modo incógnito. Si eso no funciona, borra la caché y las cookies del navegador. Como último recurso, prueba a usar un navegador diferente. En la mayoría de los casos, el problema se debe a archivos demasiado grandes o a interferencias causadas por software de seguridad. Al seguir estos pasos, deberías poder cargar el archivo correctamente. ¡Espero que esto te ayude a resolver tus problemas! Si tienes alguna otra pregunta, no dudes en preguntar. Conclusion Mixtral 8x22B is a powerful model that enables us to leverage data sources in different languages, being able to answer, understand, and translate in many languages. This ability– together with multilingual embeddings– allows you to have multilingual support both in the data retrieval and the answer generation stages, removing language barriers entirely. If you are interested on reproducing the examples of this article, you can find the Python Notebook with the requests here Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Creating embeddings endpoint Creating Mappings Indexing data Asking questions Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Building multilingual RAG with Elastic and Mistral - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/building-multilingual-rag-with-elastic-and-mistral", + "meta_description": "Learn how to build a multilingual RAG application using Mixtral 8x22B model and Elastic." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Dataset translation with LangChain, Python & Vector Database for multilingual insights Learn how to translate a dataset from one language to another and use Elastic's vector database capabilities to gain more insights. Generative AI Vector Database Python How To JG By: Jessica Garson On September 10, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Translating a dataset from one language to another can be a powerful tool. You can gain insights into a dataset you previously might not have been able to, such as detecting new patterns or trends. Using LangChain , you can take a dataset and translate it into the language of your choice. After your dataset has been translated, you can use Elastic’s vector database to gain insight. This blog post will walk you through how to load data into a DataFrame using Pandas , translate the data from one language to another using LangChain , load the translated data into Elasticsearch, and use Elastic’s vector database capabilities to learn more about your dataset. The full code for this example can be found on the Search Labs GitHub repository . Setting up your environment for dataset translation Configure an environment variable for your OpenAI API Key First, you will want to configure an environment variable for your OpenAI API Key, which you can find on the API keys page in OpenAI's developer portal . You will need this API Key to work with LangChain. You can find more information on getting started in the LangChain quick start guide . Mac/Unix: Windows: Set up Elasticsearch This demo uses Elasticsearch version 8.15, but you can use any version of Elasticsearch that is higher than 8.0. If you are new, check out our Quick Start on Elasticsearch and the documentation on the integration between LangChain and Elasticsearch. Python version The version of Python that is used is Python 3.12.1 but you can use any version of Python higher than 3.9. Install the required packages The packages you will be working with are as follows: Jupyter Notebooks to work with the dataset interactively. nest_asyncio for asynchronous execution for processing your dataset. pandas for data manipulation and cleaning of the dataset used. To integrate natural language processing capabilities, you will use the LangChain library. To work with Elasticsearch, you will use Elasticsearch Python Client to connect to Elasticsearch. 
The langchain-elasticsearch package allows for an extra level of interaction between LangChain and Elasticsearch. You will need to install the TikToken package, which is used under the hood to break text into manageable pieces for efficient further processing. The datasets package will allow you to easily work with a dataset from Hugging Face. You can run the following command in your terminal to install the required packages for this blog post. Dataset The dataset used is a collection of news articles in Spanish , known as the DACSA corpus. Below is sample of what the dataset looks like: You will need to authenticate with Hugging Face to use this dataset. You first will need to create a token . The huggingface-cli is installed when you install the datasets package in the first step. After doing so, you can log in from the command line as follows: If you don't have a token already you will be prompted to create one from the command line interface. Loading a Jupyter notebook You will want to load a Jupyter Notebook to work with your data interactively. To do so, you can run the following command in your terminal. In the right-hand corner, you can select where it says “New” to create a new Jupyter Notebook. Translate dataset column from Spanish to English The code in this section will first load data from a dataset into a Pandas DataFrame and create a subset of the dataset that contains only 25 records. Once your dataset is ready, you can set up a role to allow your model to act as a translator and create an event loop that will translate a column of your dataset from Spanish to English. The subset that is being used is only 25 records to avoid hitting OpenAI’s rate limits. You may need to use batch loading if you are using a larger dataset. Import packages In your Jupyter Notebook, you will first want to import the following packages, including asyncio, which allows you to use async functions, and openai to work with models from OpenAI. Additionally, you will want to import the following packages, which also include getpass to keep your secrets secure and functools, which will help create an event loop to translate your dataset. In this code sample, you will create an event loop, which will allow you to translate many rows of a dataset at once using nest_asyncio . Event loops are a core construct of asyncio , they run within a thread and will execute all tasks inside of a thread. Before you can create an event loop you first need to run the following line of code. Loading in your dataset You can create a variable called ds, which loads in the ELiRF/dacsa dataset in Spanish. Later in this blog post, you will translate this dataset. This dataset contains articles in Catalan and Spanish, but for this blog post, you will only use the records in Spanish. The output will show the different datasets available, what columns each has, and how many rows. Now that the data is loaded, you can translate it into a Pandas DataFrame to make it easier to work with. Since this dataset contains almost 2000 rows, you can create a sample of the dataset to make it smaller for the purposes of this blog post. You can now view the first 5 rows of the dataset to ensure everything has been properly loaded. The output should look something like this: Translating dataset from one language to another You will want to create an async function to create an event loop, allowing you to translate the data seamlessly. 
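A condensed sketch of that pattern is shown below before walking through the details; the DataFrame and column names, the prompt wording, and the model choice are assumptions for illustration, and rate-limit handling is omitted:

```python
import asyncio
import nest_asyncio
from openai import AsyncOpenAI

nest_asyncio.apply()      # allow running an event loop inside the notebook's existing loop
client = AsyncOpenAI()    # reads OPENAI_API_KEY from the environment

async def translate(text):
    # The system role tells the model to behave as a Spanish-to-English translator.
    response = await client.chat.completions.create(
        model='gpt-4o',
        messages=[
            {'role': 'system', 'content': 'You are a translator. Translate the user text from Spanish to English.'},
            {'role': 'user', 'content': text},
        ],
    )
    return response.choices[0].message.content

async def translate_column(values):
    # Fire the requests concurrently and keep the results in the original row order.
    return await asyncio.gather(*(translate(v) for v in values))

# 'df' is the sampled DataFrame from earlier; the source column names are assumed.
loop = asyncio.get_event_loop()
df['translated_summary'] = loop.run_until_complete(translate_column(df['summary'].tolist()))
df['translated_article'] = loop.run_until_complete(translate_column(df['article'].tolist()))
```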
Since you will be using GPT-4o, you will want to set a role to tell your model to act like a translator and another to give directions to translate your data from Spanish to English. You will translate the data from two specified columns of the dataset and add new columns with the translated data back to the original dataset. Finally, you can run the event loop and translate the specified columns of your dataset. This dataset will now have columns entitled “translated_summary” and “translated_article” that contain the translations of the summaries and articles loaded. To confirm your data has been translated you can run the .head() method again. You will now see a new column called translated_summary containing the translation of the summary and another column entitled translated_article containing the translations of the articles from Spanish to English. Loading the translated articles into a vector database and searching A vector database allows you to find similar data quickly. It stores vector embeddings, a type of vector data representation that converts words, sentences, and other data into numbers that capture their meaning and relationships. In this section, you will learn how to load data into an Elasticsearch vector database, and perform searches on your newly translated dataset. Authenticate to Elasticsearch Now, you can use the Elasticsearch Python client to establish a secure connection to Elasticsearch. You will want to pass in your Elasticsearch host and port, and API key. Create an index Before you can load your data into a vector database, you must create an index. You will first create a variable to name your index. From there, check to see if the index already exists. If it does, delete it so that you can create a new index without error. Adding embeddings Embeddings leverage a machine learning model to translate text into numbers, allowing you to perform vector searches. You must also set up your index to be used as a vector database. At this point, you will want to set the embedding variable to OpenAI Embeddings . You will also want to specify the model used as text-embedding-3-large . Loading data At this point, you will want to load your translated data into a Python list, allowing you to load the data into the vector database. You can use the LangChain library to split the text into smaller chunks and, from there, load the data into the vector database. I chose this method because it handles long documents by splitting text into smaller chunks, which helps manage memory and processing power efficiently, and because you control how the text is split. Performing searches You can now ask questions about the data, such as, \"What happened in Spain?\" You will now be able to get results from your dataset similar to your question. The output you get back should look something like this: You can find the complete output for this query here . Since you are using kNN by default, if you change the value of k, which is the number of global nearest neighbors to retrieve, you will return more values. The output should look similar to the following: You can check out the complete output for this query if needed. You can also adjust the num_candidates field, which sets the number of approximate nearest neighbor candidates considered on each shard. You can check out our blog post on the subject for more information. If you are looking to tune this a bit more, you may want to check out our documentation on tuning an approximate kNN search.
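To make the search step concrete, here is a small sketch using the langchain-elasticsearch integration referenced above; the index name, connection details and variable names are assumptions for illustration:

```python
from elasticsearch import Elasticsearch
from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings

es = Elasticsearch('https://localhost:9200', api_key='...')  # placeholder connection details

vector_store = ElasticsearchStore(
    es_connection=es,
    index_name='translated-articles',  # assumed index name from the earlier steps
    embedding=OpenAIEmbeddings(model='text-embedding-3-large'),
)

# k controls how many nearest neighbours come back; raising it returns more matches.
results = vector_store.similarity_search('What happened in Spain?', k=3)
for doc in results:
    print(doc.page_content[:200])
```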
Conclusion This is just the start of how you can utilize Elastic’s vector database capabilities. To learn more about what’s available be sure to check out our resource on the subject . By leveraging LangChain, and Elastic’s vector database capabilities, you can draw insights from a dataset that may contain a language you are not familiar with. In this article, we were able to ask questions regarding specific locations mentioned in the text and receive responses in the translated English text. To dig in deeper on vector database capabilities you may want to check out our tutorial . Additionally you can find another tutorial on working with multilingual datasets on Search Labs as well as this post which walks you through how to build multilingual RAG with Elastic and Mistral. The full code for this example can be found on the Search Labs GitHub repository. Let us know if you built anything based on this blog or if you have questions on our forums and the community Slack channel. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to Setting up your environment for dataset translation Configure an environment variable for your OpenAI API Key Set up Elasticsearch Python version Install the required packages Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Dataset translation with LangChain, Python & Vector Database for multilingual insights - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/dataset-translation-langchain-python-elastic", + "meta_description": "Learn how to translate datasets from one language to another using LangChain & Python. Then, use Elastic’s vector database to uncover multilingual insights." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to migrate data between different versions of Elasticsearch & between clusters Exploring methods for transferring data between Elasticsearch versions and clusters. How To KB By: Kofi Bartlett On April 14, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. When you want to upgrade an Elasticsearch cluster, it is sometimes easier to create a new, separate cluster and transfer data from the old cluster to the new one. This affords users the advantage of being able to test all of their data and configurations on the new cluster with all of their applications without any risk of downtime or data loss. The disadvantages of that approach are that it requires some duplication of hardware and could create difficulties when trying to smoothly transfer and synchronize all of the data. It may also be necessary to carry out a similar procedure if you need to migrate applications from one data center to another. In this article, we will discuss and detail three ways to transfer data between Elasticsearch clusters. How to migrate data between Elasticsearch clusters? There are three ways to transfer data between Elasticsearch clusters: Reindexing from a remote cluster Transferring data using snapshots Transferring data using Logstash Using snapshots is usually the quickest and most reliable way to transfer data. However, bear in mind that you can only restore a snapshot onto a cluster of an equal or higher version and never with a difference of over one major version. That means you can restore a 6.x snapshot onto a 7.x cluster but not an 8.x cluster. If you need to increase by more than one major version, you will need to reindex or use Logstash. Now, let’s look in detail at each of the three options for transferring data between Elasticsearch clusters. 1. Reindexing data from a remote cluster Before starting to reindex, remember that you will need to set up appropriate mappings for all of the indices on the new cluster. To do that, you must either create the indices directly with the appropriate mappings or use index templates. Reindexing from remote — configuration required In order to reindex from remote, you should add the configuration below to the elasticsearch.yml file for the cluster that is receiving the data, which, in Linux systems, is usually located here: /etc/elasticsearch/elasticsearch.yml. The configuration to add is as follows: If you are using SSL, you should add the CA certificate to each node and include the following setting in elasticsearch.yml on each node: Alternatively, you can add the line below to all Elasticsearch nodes in order to disable SSL verification.
However, that approach is less recommended since it is not as secure as the previous option: You will need to make these modifications on every node and carry out a rolling restart. For more information on how to do that, please see our guide . Reindexing command After you have defined the remote host in the elasticsearch.yml file and added the SSL certificates if necessary, you can start reindexing data with the command below: While doing that, you may face timeout errors, so it may be useful to establish generous values for timeouts rather than relying on defaults. Now, let’s take a look at some other common errors that you may encounter when reindexing from remote. Common errors when reindexing from remote 1. Reindexing not whitelisted If you encounter this error, it shows that you did not define the remote host IP address or node name DNS in Elasticsearch as described above or forgot to restart Elasticsearch services. To fix that for the Elasticsearch cluster, you need to add the remote host to all Elasticsearch nodes and restart Elasticsearch services. 2. SSL handshake exception This error means that you forgot to add the reindex.ssl.certificate_authorities to elasticsearch.yml as described above. To add it: 2. Transferring data using snapshots Remember, as mentioned above, you can only restore a snapshot onto a cluster of an equal or higher version and never with a difference of over one major version If you need to increase by more than one major version, you will need to reindex or use Logstash. The following steps are required to transfer data via snapshots: Step 1. Adding the repository plugin to the first Elasticsearch cluster – In order to transfer data between clusters via snapshots, you need to ensure that the repository is accessible from both the new and the old clusters. Cloud storage repositories such as AWS, Google, and Azure are generally ideal for this. To take snapshots, please see our guide and follow the steps it describes. Step 2. Restart Elasticsearch service (rolling restart). Step 3. Create a repository for the first Elasticsearch cluster. Step 4- Add the repository plugin to the second Elasticsearch cluster. Step 5- Add repository as read only to second Elasticsearch cluster – You will need to add a repository by repeating the same steps that you took to create the first Elasticsearch cluster. Important note: When connecting the second Elasticsearch cluster to the same AWS S3 repository, you should define the repository as a read-only repository: That is important because you want to prevent the risk of mixing Elasticsearch versions inside the same snapshot repository. Step 6- Restoring data to the second Elasticsearch cluster – After taking the above steps, you can restore data and transfer it to the new cluster. Please follow the steps described in this article to restore data to the new cluster. 3. Transferring data using Logstash Before starting to transfer the data with logstash, remember that you will need to set up appropriate mappings for all of the indices on the new cluster. To do that, you will need to either create the indices directly or use index templates. To transfer data between two Elasticsearch clusters, you can set up a temporary Logstash server and use it to transfer your data between two clusters. For small clusters, a 2GB ram instance should be sufficient. For larger clusters, you can use four-core CPUs with 8GB RAM. For guidance on installing Logstash, please see here . 
Logstash configuration for transferring data from one cluster to another A basic configuration to copy a single index from cluster A to cluster B is: For secured elasticsearch, you can use the configuration below: Index metadata The above commands will write to a single named index. If you want to transfer multiple indices and preserve the index names, then you will need to add the following line to the Logstash output: Also if you want to preserve the original ID of the document, then you will need to add: Bear in mind that setting the document ID will make the data transfer significantly slower, so only preserve the original ID if you need to. Synchronization of updates All of the methods described above will take a relatively long period of time, and you might find that data in the original cluster has been updated while waiting for the process to complete. There are various strategies to enable the synchronization of any updates that may have occurred during the data transfer process, and you should give some thought to these issues before starting that process. In particular, you need to think about: What method do you have to identify any data that has been updated/added since the start of the data transfer process (e.g., a “last_update_time” field in the data)? What method can you use to transfer the last piece of data? Is there a risk of records being duplicated? Usually, there is, unless the method you are using sets the document ID during reindexing to a known value). The different methods to enable the synchronization of updates are described below. 1. Use of queueing systems Some ingestion/updating systems use queues that enable you to “replay” data modifications received in the last x days. That may provide a means to synchronize any changes carried out. 2. Reindex from remote Repeat the reindexing process for all items where “last_update_time” > x days ago. You can do this by adding a “query” parameter to the reindex request. 3. Logstash In the Logstash input, you can add a query to filter all items where “last_update_time” > x days ago. However, this process will cause duplicates in non-time-series data unless you have set the document_id. 4. Snapshots It is not possible to restore only part of an index, so you would have to use one of the other data transfer methods described above (or a script) to update any changes that have taken place since the data transfer process was carried out. However, snapshot restore is a much quicker process than reindexing/Logstash, so it may be possible to suspend updates for a brief period of time while snapshots are transferred to avoid the problem altogether. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to 1. Reindexing data from a remote cluster Reindexing from remote — configuration required Reindexing command Common errors when reindexing from remote 1. Reindexing not whitelisted Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to migrate data between different versions of Elasticsearch & between clusters - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-migrate-data-versions-clusters", + "meta_description": "Exploring methods for transferring data between Elasticsearch versions and clusters." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blogs Developer insights and practical how-to articles from our experts to inspire and empower your search experience Articles Series Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Blogs - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog", + "meta_description": "Blog articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. Integrations Generative AI JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta On May 20, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Spring AI is now generally available, with its first stable release 1.0 ready for you to download on Maven Central . Let’s use it right away to build a complete AI application, using your favorite LLM and our favorite vector database . What’s Spring AI? Spring AI 1.0 , a comprehensive solution for AI engineering in Java, is now available after a significant development period influenced by rapid advancements in the AI field. The release includes numerous essential new features for AI engineers. Java and Spring are in a prime spot to jump on this whole AI wave. Tons of companies are running their stuff on Spring Boot, which makes it super easy to plug AI into what they're already doing. You can basically link up your business logic and data right to those AI models without too much hassle. Spring AI provides support for various AI models and technologies , such as: Image models : generate images given text prompts. Transcription models : take audio sources and convert them to text. 
Embedding models: convert arbitrary data into vectors , which are data types optimized for semantic similarity search. Chat models: these should be familiar! You’ve no doubt even had a brief conversation with one somewhere. Chat models are where most of the fanfare seems to be in the AI space, and rightfully so, they're awesome! You can get them to help you correct a document or write a poem. (Just don’t ask them to tell a joke… yet.) They’re awesome, but they do have some issues. Spring AI solutions to AI challenges (The picture shown is used with permission from the Spring AI team lead Dr. Mark Pollack) Let's go through some of these problems and their solutions in Spring AI. Problem Solution Consistency Chat Models are open-minded and prone to distraction You can give them a system prompt to govern their overall shape and structure Memory AI models don’t have memory, so they can’t correlate one message from a given user to another You can give them a memory system to store the relevant parts of the conversation Isolation AI models live in isolated little sandboxes, but they can do really amazing things if you give them access to tools - functions that they can invoke when they deem it necessary Spring AI supports tool calling which lets you tell the AI model about tools in its environment, which it can then ask you to invoke. This multi-turn interaction is all handled transparently for you Private data AI models are smart, but they’re not omniscient! They don't know what's in your proprietary databases - nor we think would you want them to! You need to inform their responses by stuffing the prompts - basically using the all mighty string concatenation operator to put text in the request before the model looks at the question being asked. Background information, if you like. How do you decide what should be sent and what shouldn’t? Use a vector store to select only the relevant data and send it in onward. This is called retrieval augmented generation, or RAG Hallucination AI chat models like to, well, chat! And sometimes they do so so confidently that they can make stuff up You need to use evaluation - using one model to validate the output of another - to confirm reasonable results And, of course, no AI application is an island. Today modern AI systems and services work best when integrated with other systems and services. Model Context Protocol (MCP) makes it possible to connect your AI applications with other MCP-based services, regardless of what language they’re written in. You can assemble all of this in agentic workflows that drive towards a larger goal. The best part? You can do all this while building on the familiar idioms and abstractions any Spring Boot developer will have come to expect: convenient starter dependencies for basically everything are available on the Spring Initializr . Spring AI provides convenient Spring Boot autoconfigurations that give you the convention-over-configuration setup you’ve come to know and expect. And Spring AI supports observability with Spring Boot’s Actuator and the Micrometer project. It plays well with GraalVM and virtual threads, too, allowing you to build super fast and efficient AI applications that scale. Why Elasticsearch Elasticsearch is a full text search engine, you probably know that. So why are we using it for this project? Well, it’s also a vector store! And quite a good one at that, where data lives next to the full text. 
Other notable advantages: Super easy to set up Opensource Horizontally scalable Most of your organization’s free form data probably already lives in an Elasticsearch cluster Feature complete search engine capability Fully integrated in Spring AI ! Taking everything into consideration, Elasticsearch checks all the boxes for an excellent vector store, so let's set it up and start building our application! Getting started with Elasticsearch We’re going to need both Elasticsearch and Kibana, the UI console you’ll use to interact with the data hosted in the database. You can try everything on your local machine thanks to the goodness of Docker images and the Elastic.co home page . Go there, scroll down to find the curl command, run it and pipe it right into your shell: This will simply pull and configure Docker images for Elasticsearch and Kibana, and after a few minutes you’ll have them up and running on your local machine, complete with connection credentials. You’ve also got two different urls you can use to interact with your Elasticsearch instance. Do as the prompt says and point your browser to http://localhost:5601 . Note the username elastic and password printed on the console, too: you’ll need those to log in (in the example output above they’re respectively elastic and w1GB15uQ ). Pulling the app together Go to the Spring Initializr page and generate a new Spring AI project with the following dependencies: Elasticsearch Vector Store Spring Boot Actuator GraalVM OpenAI Web Make sure to choose the latest-and-greatest version of Java (ideally Java 24 - as of this writing - or later) and the build tool of your choice. We’re using Apache Maven in this example. Click Generate and then unzip the project and import it into your IDE of choice. (We’re using IntelliJ IDEA.) First things first: let’s specify your connection details for your Spring Boot application. In application.properties, write the following: We’ll also Spring AI’s vector store capability to initialize whatever’s needed on the Elasticsearch side in terms of data structures, so specify: We’re going to use OpenAI in this demo, specifically the Embedding Model and Chat Model (feel free to use the service you prefer, as long as Spring AI supports it ). The Embedding Model is needed to create embeddings of the data before we stash it into Elasticsearch. For OpenAI to work, we need to specify the API key : You can define it as an environment variable like SPRING_AI_OPENAI_API_KEY to avoid stashing the credential in your source code. We’re going to upload files, so be sure to customize how much data can be uploaded to the servlet container: We’re almost there! Before we dive into writing the code, let’s get a preview of how this is going to work. On our machine, we downloaded the following file (a list of rules for a board game), renamed it to test.pdf and put it in ~/Downloads/test.pdf . The file will be sent to the /rag/ingest endpoint (replace the path accordingly to your local setup): This might take a few seconds… Behind the scenes, the data’s being sent to OpenAI, which is creating embeddings of the data; that data is then being written to Elasticsearch, both the vectors and the original text. That data, along with all the embeddings therein, is where the magic happens. We can then query Elasticsearch using the VectorStore interface. The full flow looks like this: The HTTP client uploads your PDF of choice to the Spring application. 
Spring AI takes care of the text extraction from our PDF and chunks each page into 800 character chunks. OpenAI generates the vector representation for each chunk. Both chunked text and the embedding are then stored in Elasticsearch. Last, we’ll issue a query: And we’ll get a relevant answer: Nice! How does this all work? The HTTP client submits the question to the Spring application. Spring AI gets the vector representation of the question from OpenAI. With that embedding it searches for similar documents in the stored Elasticsearch chunks and retrieves the most similar documents. Spring AI then sends the question and retrieved context to OpenAI for generating an LLM answer. Finally, it returns the generated answer and a reference to the retrieved context. Let’s dive into the Java code to see how it really works. First of all, the Main class: it’s a stock standard main class for any ol’ Spring Boot application. Nothing to see there. Moving on… Up next, a basic HTTP controller: The controller is simply calling a service we’ve built to handle ingesting files and writing them to the Elasticsearch vector store, and then facilitating queries against that same vector store. Let’s look at the service: This code handles all the ingest: given a Spring Framework Resource , which is a container around bytes, we read the PDF data (presumed to be a .PDF file - make sure that you validate as much before accepting arbitrary inputs!) using Spring AI’s PagePdfDocumentReader and then tokenize it using Spring AI’s TokenTextSplitter , finally adding the resulting List s to the VectorStore implementation, ElasticsearchVectorStore . You can confirm as much using Kibana: after sending a file to the /rag/ingest endpoint, open up your browser to localhost:5601 and in the side menu on the left navigate to Dev Tools . There you can issue queries to interact with the data in the Elasticsearch instance. Issue a query like this: Now for the fun stuff: how do we get that data back out again in response to user queries? Here’s a first cut at an implementation of the query, in a method called directRag . The code’s fairly straightforward, but let’s break it down into multiple steps: Use the VectorStore to perform a similarity search. Given all the results, get the underlying Spring AI Document s and extract their text, concatenating them all into one result. Send the results from the VectorStore to the model, along with a prompt instructing the model what to do with them and the question from the user. Wait for the response and return it. This is RAG - retrieval augmented generation. It’s the idea that we’re using data from a vector store to inform the processing and analysis done by the model. Now that you know how to do it, let’s hope you never have to! Not like this anyway: Spring AI’s Advisors are here to simplify this process even more. Advisors allows you to pre- and post-process a request to a given model, other than providing an abstraction layer between your application and the vector store. Add the following dependency to your build: Add another method called advisedRag(String question) to the class: All the RAG-pattern logic is encapsulated in the QuestionAnswerAdvisor . Everything else is just as any request to a ChatModel would be! Nice! Conclusion In this demo, we used Docker images and did everything on our local machine, but the goal here is to build production-worthy AI systems and services. There are several things you could do to make that a reality. 
First of all, you can add Spring Boot Actuator to monitor the consumption of tokens. Tokens are a proxy for the complexity (and sometimes the dollars-and-cents) cost of a given request to the model. You’ve already got the Spring Boot Actuator on the classpath, so just specify the following properties to show all the metrics (captured by the magnificent Micrometer.io project): Restart your application. Make a query, and then go to: http://localhost:8080/actuator/metrics . Search for “ token ” and you’ll see information about the tokens being used by the application. Make sure you keep an eye on this. You can of course use Micrometer’s integration for Elasticsearch to push those metrics and have Elasticsearch act as your time series database of choice, too! You should then consider that every time we make a request to a datastore like Elasticsearch, or to OpenAI, or to other network services, we’re doing IO and - often - that IO blocks the threads on which it executes. Java 21 and later ship with non-blocking virtual threads that dramatically improve scalability. Enable it with: And, finally, you’ll want to host your application and your data in a place where it can thrive and scale. We're sure you’ve probably already thought about where to run your application, but where will you host your data? May we recommend the Elastic Cloud ? It’s secure, private, scalable, and full of features. Our favorite part? If you want, you can get the serverless edition where Elastic wears the pager, not you! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Jump to What’s Spring AI? Spring AI solutions to AI challenges Why Elasticsearch Getting started with Elasticsearch Pulling the app together Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Spring AI and Elasticsearch as your vector database - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/spring-ai-elasticsearch-application", + "meta_description": "Building a complete AI application using Spring AI and Elasticsearch.\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. Developer Experience Inside Elastic DT By: Drew Tate On May 22, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It’s easy for us developers to take good autocomplete for granted. It just works—until you try building it yourself. This post is about a recent rearchitecture we performed to support continued evolution in ES|QL. A little about ES|QL In case you haven’t heard, ES|QL is Elastic’s new query language. It is super powerful and we see it as the future of how AI agents, applications, and humans will talk to Elastic. So, we provide an ES|QL editing experience in several places in Kibana including the Discover and Dashboard applications. ES|QL in Discover To understand the rearchitecture, it’s key to understand a few language components. An ES|QL query consists of a series of commands chained together to perform a pipeline of operations. Here, we are joining the data from one index to another index: In the example above, FROM , LOOKUP JOIN , and SORT are the commands. Commands can have major subcomponents (call them subcommands), generally identified by a second keyword before the next pipe character (for example, METADATA in the example above). Like commands, subcommands have their own semantic rules governing what comes after the keyword. ES|QL also has functions which look like you’d expect. See AVG in the example below: Autocomplete is an important feature for enabling users to learn ES|QL. Autocomplete 1.0 Our autocomplete engine was originally built with a few defining characteristics. Declarative — Used static declarations to describe commands Generic — Relied heavily on generic logic meant to apply to most/all language contexts Reified subcommands — Treated subcommands as first-class abstractions with their own logic Within the top-level suggestion routine, our code analyzed the query, detecting the general area of the user’s cursor. It then branched into one of several subroutines, corresponding to language subcomponents. The semantics of both commands and subcommands were described declaratively using a “command signature.” This defined a pattern of things that could be used after the command name. 
It might say “accept any number of boolean expressions,” or “accept a string field and then a numeric literal.” If the first analysis identified the cursor as being within a command or subcommand, the corresponding branch would then try to match the (sub)command signature with the query and figure out what to suggest in a generic way. The cracks start to show At first, this architecture worked. Early on, commands in ES|QL were relatively uniform. They looked basically like: But, as time went on, they started to get more bespoke. A couple of issues showed up and grew with every new command. Code complexity —the autocomplete code became large, complicated, and difficult to follow. It wasn’t clear which parts of the logic applied to which commands. Lack of orthogonality —a change in the behavior in one area of the language often had side-effects in other parts of the language. For example, adding a comma suggestion to the field list in KEEP , accidentally created a comma suggestion after the field in DISSECT — which is invalid. The problem was that new syntax and behaviors led our “generic” code to need more and more command-specific branches, and our command definitions to need more and more “generic” settings (that really only applied to a single command). Gradually, the idea that we could describe the nuances of each command’s structure and behavior with a declarative interface started to look a bit idealistic. Timing the investment When is it time to invest in a refactor? The answer is very contextual. You have to weigh the upsides against the cost. Truth be told, you can generally keep paying the price of inefficiencies for quite awhile— and it can make sense. One way to stave off a refactor is by treating the symptoms. We did this for months. We treated our code complexity with verbose comments. We treated our lack of orthogonality with better test coverage and careful manual testing. But there comes a point where the cost of patching outweighs the cost of change. Ours came with the introduction of a fabulous new ES|QL feature, filtering by aggregation. The WHERE command has existed since the early days, but this new feature added the ability to use WHERE as a sub command in STATS . This may look like a small change, but it broke the architecture’s careful delineation between commands and subcommands. Now, we had a command that could also be a subcommand. With this fundamental abstraction break added to all the existing inefficiencies, we decided it was time to invest. Autocomplete 2.0 ES|QL isn’t a generic language, it is a query language. So we decided it was time to accept that commands are bespoke by design (in accordance with grand query language tradition). The new architecture needed to be flexible and adaptive and it needed to be clear what code belonged to which command. This meant a system that was: Imperative — Instead of declaring what was acceptable after the command name and separately interpreting the declaration, we write the logic to check the correctness of the command directly. Command-specific — Each command gets its own logic. There is no generic routine that is supposed to work for all the commands. In Autocomplete 1.0, the up-front triage did a lot of work. Now, it just decides whether or not the cursor is already within a command. If within a command, it delegates straight to the command-specific suggest method. The bulk of the work now happens within the command’s logic, which is given complete control over suggestions within that command. 
This doesn’t mean that commands don’t share logic. They often delegate suggestion creation and even some triage steps to reusable subroutines (for example, if the cursor is within an ES|QL function). But, they retain the flexibility to customize the behavior in any way. Giving each command its own suggestion method improves isolation and reduces side effects, while making it obvious what code applies to which command. It’s still about the user There is no question that this refactor has resulted in a better developer experience. Everyone who interacted with both systems can attest that this is a breath of fresh air. But, at the end of the day, we made this investment in service of our users. First of all, some ES|QL features couldn’t be reasonably supported without it. Our users expect quality suggestions when they are writing ES|QL. Now, we can deliver in more contexts. The old system made it easy to introduce regressions. Now, we expect fewer of these. One of our team’s biggest roles is adding support for upcoming commands. Now, we can do this much faster. The work isn’t over, but we’ve created a system that supports change instead of resisting it. With this investment, we’ve laid a solid foundation to keep the language and the editor evolving into the future, side by side. Report an issue Related content Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Jump to A little about ES|QL Autocomplete 1.0 The cracks start to show Timing the investment Autocomplete 2.0 Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "How we rebuilt autocomplete for ES|QL - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-autocomplete-rebuilt", + "meta_description": "How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial AutoOps Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe AutoOps January 2, 2025 Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. VC By: Valentin Crettaz AutoOps December 18, 2024 Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. MD By: Musab Dogan AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "AutoOps - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/autoops", + "meta_description": "AutoOps articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. Integrations How To TP By: Tom Potoma On May 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Red Hat validated pattern frameworks use GitOps for seamless provisioning of all operators and applications on Red Hat OpenShift. 
The Elasticsearch vector database is now officially supported by The ‘AI Generation with LLM and RAG’ Validated Pattern . This allows developers to jumpstart their app development using Elastic's vector database for retrieval-augmented generation (RAG) applications on OpenShift, combining the benefits of Red Hat's container platform with Elastic's vector search capabilities. Getting started with Elastic in the Validated Pattern Let's walk through setting up the pattern with Elasticsearch as your vector database: Prerequisites Podman installed on your local system An OpenShift cluster running in AWS Your OpenShift pull secret OpenShift CLI ( oc ) installed An installation configuration file Step 1: Fork the repository Create a fork of the rag-llm-gitops repository. Step 2: Clone the forked repository Clone your forked repository and go to the root directory of the repository. Step 3: Configure and deploy Create a local copy of the secret values file: Configure the pattern to use Elasticsearch by editing the values-global.yaml file: IF NECESSARY: Configure AWS settings (if your cluster is in an unsupported region ): Add GPU nodes to your cluster: Install the pattern: The installation process automatically deploys: Pattern operator components HashiCorp Vault for secrets management Elasticsearch operator and cluster RAG application UI and backend Step 4: Verify deployment After installation completes, check that all components are running. In the OpenShift web console, go to the Workloads > Pods menu. Select the rag-llm project from the drop-down. The following pods should be up and running: Alternatively, you can check via the CLI: You should see pods including: elastic-operator - The Elasticsearch operator es-vectordb-es-default-0 - The Elasticsearch cluster ui-multiprovider-rag-redis - The RAG application UI (despite the name, it uses the configured database type, which in our case is Elastic) Step 5: Try out the application Navigate to the UI in your browser to start generating content with your RAG application backed by Elasticsearch. From any page of your OpenShift console, click on the Application Menu and select the application: Then: Select your configured LLM provider, or configure your own When configuring with OpenAI, the application appends the appropriate endpoint. So, in the ‘URL’ field, provide ‘https://api.openai.com/v1’ rather than ‘https://api.openai.com/v1/chat/completions’ Enter the ‘Product’ as ‘RedHat OpenShift AI’ Click “Generate” Watch as the Proposal is created for you in real-time So what just happened? When you deploy the pattern with Elasticsearch, here's what happens behind the scenes: The Elasticsearch operator is deployed to manage Elasticsearch resources An Elasticsearch cluster is provisioned with vector search capabilities Sample data is processed and stored as vector embeddings in Elasticsearch The RAG application is configured to connect to Elasticsearch for retrieval When you generate content, the application queries Elasticsearch to find relevant context for the LLM What's next? This initial integration showcases just the beginning of what's possible when you combine Elasticsearch vector search with OpenShift AI. 
Elastic brings rich information retrieval capabilities that make it ideal for production RAG applications, and we are considering the following for future enhancement: Advanced semantic understanding - Utilize Elastic's ELSER model for more accurate retrieval without fine-tuning Intelligent data processing using Elastic's native text chunking and preprocessing capabilities Hybrid search superiority - Combine vector embeddings with traditional keyword search and BM25 ranking for the most relevant results Production-ready monitoring - Leverage Elastic's comprehensive observability stack to monitor RAG application performance and gain insights into LLM usage patterns We welcome feedback and contributions as we continue to bring powerful vector search capabilities to OpenShift AI applications! If you are at Red Hat Summit 2025, stop by Booth #1552 to learn more about Elastic! Resources: https://validatedpatterns.io/patterns/rag-llm-gitops/ https://validatedpatterns.io/patterns/rag-llm-gitops/deploying-different-db/ Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Getting started with Elastic in the Validated Pattern Prerequisites Step 1: Fork the repository Step 2: Clone the forked repository Step 3: Configure and deploy Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/red-hat-openshift-validated-pattern-elasticsearch", + "meta_description": "The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
This blog walks you through how to get started." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. Developer Experience Javascript How To JR By: Jeffrey Rengifo On May 19, 2025 Part of Series Elasticsearch in JavaScript the proper way Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This is the second part of our Elasticsearch in JavaScript series. In the first part, we learned how to set up our environment correctly, configure the Node.js client, index data and search. In this second part, we will learn how to implement production best practices and run the Elasticsearch Node.js client in Serverless environments. We will review: Production best practices Error handling Testing Serverless environments Running the client on Elastic Serverless Running the client on function-as-a-service environment You can check the source code with the examples here . Production best practices Error handling A useful feature of the Elasticsearch client in Node.js is that it exposes objects for the possible errors in Elasticsearch so you can validate and handle them in different ways. To see them all , run this: Let’s go back to the search example and handle some of the possible errors: ResponseError in particular, will occur when the answer is 4xx or 5xx , meaning the request is incorrect or the server is not available. We can test this type of error by generating wrong queries, like trying to do a term query on a text-type field: Default error: Customized error: We can also capture and handle each type of error in a certain way. For example, we can add retry logic in a TimeoutError . Testing Tests are key in guaranteeing the app's stability. To test the code in a way that is isolated from Elasticsearch, we can use the library elasticsearch-js-mock when creating our cluster. This library allows us to instantiate a client that is very similar to the real one but that will answer to our configuration by only replacing the client’s HTTP layer with a mock one while keeping the rest the same as the original. We’ll install the mocks library and AVA for automated tests. npm install @elastic/elasticsearch-mock npm install --save-dev ava We’ll configure the package.json file to run the tests. Make sure it looks this way: Let’s now create a test.js file and install our mock client: Now, add a mock for semantic search: We can now create a test for our code, making sure that the Elasticsearch part will always return the same results: Let’s run the tests. npm run test Done! From now on, we can test our app focusing 100 % on the code and not on external factors. Serverless environments Running the client on Elastic Serverless We covered running Elasticsearch on Cloud or on-prem; however, the Node.js client also supports connections to Elastic Cloud Serverless . Elastic Cloud Serverless allows you to create a project where you don’t need to worry about infrastructure since Elastic handles that internally, and you only need to worry about the data you want to index and how long you want to have access to it. 
From a usage perspective, Serverless decouples compute from storage, providing autoscaling features for both search and indexing . This allows you to only grow the resources you actually need. The client makes the following adaptations to connect to Serverless: Turns off sniffing and ignores any sniffing-related options Ignores all nodes passed in config except the first one, and ignores any node filtering and selecting options Enables compression and `TLSv1_2_method` (same as when configured for Elastic Cloud) Adds an `elastic-api-version` HTTP header to all requests Uses `CloudConnectionPool` by default instead of `WeightedConnectionPool` Turns off vendored `content-type` and `accept` headers in favor of standard MIME types To connect your serverless project, you need to use the parameter serverMode: serverless. Running the client on function-as-a-service environment In the example, we used a Node.js server, but you can also connect using a function-as-a-service environment with functions like AWS lambda, GCP Run, etc. Another example is to connect to services like Vercel, which is also serverless. You can check this complete example of how to do this, but the most relevant part of the search endpoint looks like this: This endpoint lives in the folder /api and is run from the server’s side so that the client only has control over the “text” parameter that corresponds to the search term. The implication of using function-as-a-service is that, unlike a server running 24/7, functions only bring up the machine that runs the function, and once it is finished, the machine goes into rest mode to consume fewer resources. This configuration can be convenient if the application does not get too many requests; otherwise, the costs can be high. You also need to consider the lifecycle of functions and the run times (which could only be seconds in some cases). Conclusion In this article, we learned how to handle errors, which is crucial in production environments. We also covered testing our application while mocking the Elasticsearch service, which provides reliable tests regardless of the cluster’s state and lets us focus on our code. Finally, we demonstrated how to spin up a fully serverless stack by provisioning both Elastic Cloud Serverless and a Vercel application. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. 
KB By: Kofi Bartlett Jump to Production best practices Error handling Testing Serverless environments Running the client on Elastic Serverless Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch in JavaScript the proper way, part II - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/how-to-use-elasticsearch-in-javascript-part-ii", + "meta_description": "Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. Search Relevance DW By: Daniel Wrigley On May 20, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In today’s digital age, search engines are the backbone of how we access information. Whether it’s a web search engine, an e-commerce site, an internal enterprise search tool, or a Retrieval Augmented Generation (RAG) system, the quality of search results directly impacts user satisfaction and engagement. But what ensures that search results meet user expectations? Enter the judgment list , a tool to evaluate and refine search result quality. At OpenSource Connections , our experts regularly help clients create and use judgment lists to improve their user search experience. In this post, we’ll explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. Why do you need a judgment list? Judgment lists play a crucial role in the continuous cycle of search result quality improvement. They provide a reliable benchmark for evaluating search relevance by offering a curated set of assessments on whether search results truly meet user needs. Without high-quality judgment lists, search teams would struggle to interpret feedback from users and automated signals, making it difficult to validate hypotheses about improving search. For example, if a team hypothesizes that hybrid search will increase relevance and expects a 2% increase in click-through rate (CTR), they need judgment lists to compare before-and-after performance meaningfully. These lists help ground experimentation results in objective measures, ensuring that changes positively impact business outcomes before they are rolled out broadly. 
By maintaining robust judgment lists, search teams can iterate with confidence, refining the search experience in a structured, data-driven way. A judgment list is a curated set of search queries paired with relevance ratings for their corresponding results, also known as a test collection. Metrics computed using this list act as a benchmark for measuring how well a search engine performs. Here’s why it’s indispensable: Evaluating search algorithms: It helps determine whether a search algorithm is returning the most relevant results for a given query. Measuring improvements or regressions: When you make changes to your search engine, a judgment list can quantify the impact of those changes on result quality. Providing insights into user satisfaction: By simulating expected outcomes, a judgment list aligns system performance with user needs. Helping product development: By making product requirements explicit, a judgment list supports search engineers implementing them. For example, if a user searches for “best smartphones under $500,” your judgment list can indicate whether the results not only list relevant products but also cater to the query intent of affordability and quality. Judgment lists are used in offline testing. Offline testing enables rapid, cost-effective iterations before committing to time-consuming live experiments like A/B testing. Ideally, combining both online and offline testing maximizes experimentation efficiency and ensures robust search improvements. What is a judgment? At its core, a judgment is a rating of how relevant a search result is for a specific query. Judgments can be categorized into two main types: binary judgments and graded judgments. Binary judgments Results are labeled as either relevant (1) or not relevant (0). Example: A product page returned for the query “wireless headphones” either matches the query intent or it doesn’t. Use case: Binary judgments are simple and useful for queries with clear-cut answers. Graded judgments Results are assigned a relevance score on a scale (e.g., 0 to 3), with each value representing a different level of relevance: 0: Definitely irrelevant 1: Probably irrelevant 2: Probably relevant 3: Definitely relevant Example: A search result for “best laptops for gaming” might score: 3 for a page listing laptops specifically designed for gaming, 2 for a page featuring laptops that could be suitable for gaming, 1 for gaming-related accessories, and 0 for items unrelated to gaming laptops. Scales can also be categorical rather than numeric, for example : Exact Substitute Complement Irrelevant Use case: Graded judgments are ideal for queries requiring a nuanced evaluation of relevance, beyond a simple binary determination of “relevant” or “not relevant.” This approach accommodates scenarios where multiple factors influence relevance. Some evaluation metrics explicitly require more than a binary judgment. We use graded judgments when we want to model specific information-seeking behaviors and expectations in our evaluation metric. For example, the gain based metrics, Discounted Cumulative Gain (DCG), normalized Discounted Cumulative Gain (nDCG), and Expected Reciprocal Rank (ERR) , model a user whose degree of satisfaction with a result can be greater or lesser, while still being relevant. This is useful for those who are researching and gathering information to make a decision. 
Example of a judgment list Let’s consider an example of a judgment list for an e-commerce search engine: Query Result URL Relevance wireless headphones /products/wireless-headphones-123 3 wireless headphones /products/noise-cancelling-456 3 best laptops for gaming /products/gaming-laptops-789 3 best laptops for gaming /products/ultrabook-321 2 In this list: The query “wireless headphones” evaluates the relevance of two product pages, with scores indicating how well the result satisfies the user’s intent. A score of 3 represents high relevance, a very good match, while lower scores suggest the result is less ideal. This structured approach allows search teams to objectively assess and refine their search algorithms. Different kinds of judgments To create a judgment list, you need to evaluate the relevance of search results, and this evaluation can come from different sources. Each type has its strengths and limitations: 1. Explicit judgments These are made by human evaluators who assess search results based on predefined guidelines. Typically, Subject Matter Experts (SMEs) are preferred as human evaluators for their knowledge. Explicit judgments offer high accuracy and nuanced insights but also pose unique challenges. Explicit judgments are very good in capturing the actual relevance of a document for a given query. Strengths: High accuracy, nuanced understanding of intent, and the ability to interpret complex queries. Limitations: Time-consuming, costly for large datasets, and prone to certain challenges. Challenges: Variation: Different judges might assess the same result differently, introducing inconsistency. Position bias: Results higher up in the ranking are often judged as more relevant regardless of actual quality. Expertise: Not all judges possess the same level of subject matter or technical expertise, leading to potential inaccuracies. Interpretation: User intent or the information need behind a query can be ambiguous or difficult to interpret. Multitasking: Judges often handle multiple tasks simultaneously, which may reduce focus. Fatigue: Judging can be mentally taxing, impacting judgment quality over time. Actual vs. perceived relevance: Some results may appear relevant at first glance (e.g., by a misleading product image) but fail closer scrutiny. Scaling: As the dataset grows, efficiently gathering enough judgments becomes a logistical challenge. Best practices: To overcome these challenges, follow these guidelines: Define information needs and tasks clearly to reduce variation in how judges assign grades. Train judges thoroughly and provide detailed guidance. Avoid judging results in a list view to minimize position bias. Correlate judgments from different groups (e.g., subject matter experts versus general judges) to identify discrepancies. Use crowdsourcing or specialized agencies to scale the evaluation process efficiently. 2. Implicit judgments Implicit judgments are inferred from user behavior data such as click-through rates, dwell time, and bounce rates. While they offer significant advantages, they also present unique challenges. In addition to relevance, implicit judgments capture search result quality aspects that match user taste or preference (for example, cost, and delivery time) as well as factors that fulfill the user in a certain way or the attractiveness of a product to a user (for example, sustainability features of the product). 
Strengths: Scalable and based on real-world usage, making it possible to gather massive amounts of data without manual intervention. Limitations: Susceptible to biases and other challenges that affect the reliability of the judgments. Challenges: Clicks are noisy: Users might click on results due to missing or unclear information on the search results page, not because the result is truly relevant. Biases: Position bias: Users are more likely to click on higher-ranked results, regardless of their actual relevance. Presentation bias: Users cannot click on what is not shown, resulting in missing interactions for potentially relevant results. Conceptual biases: For example, in a grid view result presentation users tend to interact more often with results at the grid margins. Sparsity issues: Metrics like CTR can be skewed in scenarios with limited data (e.g., CTR = 1.0 if there’s only 1 click out of 1 view). No natural extension points: Basic models like CTR lack built-in mechanisms for handling nuanced user behavior or feedback. Best practices: To mitigate these challenges and maximize the value of implicit judgments: Avoid over-reliance on position bias-prone metrics: Combine implicit signals with other data points to create a more holistic evaluation. Correlate implicit judgments with explicit feedback: Compare user behavior data with manually graded relevance scores to identify alignment and discrepancies. Train your models thoughtfully: Ensure they account for biases and limitations inherent in user behavior data by using a model that incorporates countermeasures for biases and provides options to combine different signals (for example clicks and purchases) . 3. AI-generated judgments AI-generated judgments leverage large language models (LLMs) like OpenAI’s GPT-4o to judge query-document pairs. These judgments are gaining traction due to their scalability and cost-effectiveness. LLMs as judges capture the actual relevance of a document for a given query well. Strengths: Cost-efficient, scalable, and consistent across large datasets, enabling quick evaluations of vast numbers of results. Limitations: AI-generated judgments may lack context-specific understanding, introduce biases from training data, and fail to handle edge cases effectively. Challenges: Training data bias: The AI model’s outputs are only as good as the data it’s trained on, potentially inheriting or amplifying biases. Context-specific nuances: AI may struggle with subjective or ambiguous queries that require human-like understanding. Interpretability: Understanding why a model assigns a specific judgment can be difficult, reducing trust in the system. Scalability trade-offs: While AI can scale easily, ensuring quality across all evaluations requires significant computational resources and potentially fine-tuning. Cost: While LLM judgments scale well, they are not free. Monitor your expenses closely. Best practices: To address these challenges and make the most of AI-generated judgments: Incorporate human oversight: Periodically compare AI-generated judgments with explicit human evaluations to catch errors and edge cases and use this information to improve your prompt. Enhance interpretability: Use explainable AI techniques to improve understanding and trust in the LLM’s decisions. Make the LLM explain its decision as part of your prompt. Optimize computational resources: Invest in infrastructure that balances scalability with cost-effectiveness. 
Combine AI with other judgment types: Use AI-generated judgments alongside explicit and/or implicit judgments to create a holistic evaluation framework. Prompt engineering: Invest time in your prompt. Even small changes can make a huge difference in judgment quality. Different factors of search quality Different kinds of judgments incorporate different aspects or factors of search quality. We can divide search result quality factors into three groups: Search relevance: This measures how well a document matches the information need expressed in the query. For instance: Binary judgments: Does the document fulfill the query (relevant or not)? Graded judgments: How well does the document fulfill the query on a nuanced scale? Explicit judgments and AI-generated judgments work well to capture search relevance. Relevance factors: These address whether the document aligns with specific user preferences. Examples include: Price: Is the result affordable or within a specified range? Brand: Does it belong to a brand the user prefers? Availability: Is the item in stock or ready for immediate use Implicit judgments capture relevance factors well. Fulfillment aspects: These go beyond relevance and preferences to consider how the document resonates with broader user values or goals. Examples include: Sustainability: Does the product or service promote environmental responsibility? Ethical practices: Is the company or provider known for fair trade or ethical standards? Fulfillment aspects are the most difficult to measure and quantify. Understanding your users is key and implicit feedback is the best way to move in that direction. Be aware of biases in implicit feedback and apply techniques to counter these as well as possible, for example when modeling the judgments based on implicit feedback . By addressing these factors systematically, search systems can ensure a holistic approach to evaluating and enhancing result quality. Where do judgment lists fit in the search quality improvement cycle? Search quality improvement is an iterative process that involves evaluating and refining search algorithms to better meet user needs. Judgment lists play a central role in offline experimentation (the smaller, left cycle in the image below), where search results are tested against predefined relevance scores without involving live users. This allows teams to benchmark performance, identify weaknesses, and make adjustments before deploying changes. This makes offline experimentation a fast and low-risk way of exploring potential improvements before trialing them in an online experiment. Online experimentation (the larger, right cycle) uses live user interactions, such as A/B testing, to gather real-world feedback on system updates. While offline experimentation with judgment lists ensures foundational quality, online experimentation captures dynamic, real-world nuances and user preferences. Both approaches complement each other, forming a comprehensive framework for search quality improvement. Source: Peter Fries. Search Quality - A business-friendly perspective . Tools to create judgment lists At its core, creating judgment lists is a labeling task where ultimately we are seeking to add a relevance label to a query-document pair. Some of the services that exist are: Quepid : An open source solution that supports the whole offline experimentation lifecycle from creating query sets to measuring search result quality with judgment lists created in Quepid. 
Label Studio : A data labeling platform that is predominantly used for generating training data or validating AI models. Amazon SageMaker Ground Truth : A cloud service offering data labeling to apply human feedback across the machine learning lifecycle. Prodigy : A complete data development experience with an annotation capability to label data. Looking ahead: Creating judgment lists with Quepid This post is the first in a series on search quality evaluation. In our next post, we will dive into the step-by-step process of creating explicit judgments using a specific tool called Quepid . Quepid simplifies the process of building, managing, and refining judgment lists, enabling teams to collaboratively improve search quality. Stay tuned for practical insights and tips on leveraging this tool to enhance the quality of your search results. Conclusion A judgment list is a cornerstone of search quality evaluation, providing a reliable benchmark for measuring performance and guiding improvements. By leveraging explicit, implicit, and AI-generated judgments, organizations can address the multifaceted nature of search quality—from relevance and accuracy to personalization and diversity. Combining these approaches ensures a comprehensive and robust evaluation strategy. Investing in a well-rounded strategy for search quality not only enhances user satisfaction but also positions your search system as a trusted and reliable tool. Whether you’re managing a search engine or fine-tuning an internal search feature, a thoughtful approach to judgments and search quality factors is essential for success. Partner with Open Source Connections to transform your search capabilities and empower your team to continuously evolve them. Our proven track record spans the globe, with clients consistently achieving dramatic improvements in search quality, team capability, and business performance. Contact us today to learn more. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Why do you need a judgment list? What is a judgment? Binary judgments Graded judgments Example of a judgment list Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Cracking the code on search quality: The role of judgment lists - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/judgment-lists", + "meta_description": "Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Vector Database Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Search Relevance Vector Database +1 March 18, 2025 Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. PS BT By: Peter Straßer and Benjamin Trent Vector Database March 13, 2025 Semantic Text: Simpler, better, leaner, stronger Our latest semantic_text iteration brings a host of improvements. 
In addition to streamlining representation in _source, benefits include reduced verbosity, more efficient disk utilization, and better integration with other Elasticsearch features. You can now use highlighting to retrieve the chunks most relevant to your query. And perhaps best of all, it is now a generally available (GA) feature! MP By: Mike Pellegrini Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Vector Database - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/vector-database", + "meta_description": "Vector Database articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Integrations Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. 
JR By: Jeffrey Rengifo Integrations April 9, 2025 Elasticsearch vector database for native grounding in Google Cloud’s Vertex AI Platform Elasticsearch is now publicly available as the first third-party native grounding engine for Google Cloud’s Vertex AI platform and Google’s Gemini models. It enables joint users to build fully customizable GenAI experiences grounded in enterprise data, powered by the best-of-breed Search AI capabilities from Elasticsearch. VA By: Valerio Arvizzigno Integrations How To April 8, 2025 Using CrewAI with Elasticsearch Learn how to create an Elasticsearch agent with CrewAI for your agent team and perform market research. JR By: Jeffrey Rengifo Integrations How To April 4, 2025 Getting Started with the Elastic Chatbot RAG app using Vertex AI running on Google Kubernetes Engine Learn how to configure the Elastic Chatbot RAG app using Vertex AI and run it on Google Kubernetes Engine (GKE). JS By: Jonathan Simon 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Integrations - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/integrations", + "meta_description": "Integrations articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Ingestion Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Integrations Ingestion +1 February 3, 2025 Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. 
JR TM By: Jeffrey Rengifo and Tomás Murúa Integrations Ingestion +1 January 24, 2025 Indexing OneLake data into Elasticsearch - Part II Second part of a two-part article to index and search OneLake data into Elastic using a Custom connector. GL JR By: Gustavo Llermaly and Jeffrey Rengifo Integrations Ingestion +1 January 23, 2025 Indexing OneLake data into Elasticsearch - Part 1 Learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. GL By: Gustavo Llermaly Integrations Ingestion +1 January 16, 2025 Elastic Jira connector tutorial part II: Optimization tips After connecting Jira to Elasticsearch, we'll now review best practices to escalate this deployment. GL By: Gustavo Llermaly Integrations Ingestion +1 January 15, 2025 Elastic Jira connector tutorial part I Reviewing a use case for the Elastic Jira connector. We'll be indexing our Jira content into Elasticsearch to create a unified data source and do search with Document Level Security. GL By: Gustavo Llermaly 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Ingestion - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/ingestion", + "meta_description": "Ingestion articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Javascript Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Python Javascript +2 October 30, 2024 Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. MG By: Miguel Grinberg Javascript Python +1 June 4, 2024 Automatically updating your Elasticsearch index using Node.js and an Azure Function App Learn how to update your Elasticsearch index automatically using Node.js and an Azure Function App. Follow these steps to ensure your index stays current. 
JG By: Jessica Garson ES|QL Javascript +1 June 3, 2024 ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. JM By: Josh Mock Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Javascript - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/javascript-programming", + "meta_description": "Javascript articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial ML Research Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil ML Research December 5, 2024 Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros ML Research November 25, 2024 Introducing Elastic Rerank: Elastic's new semantic re-ranker model Learn about how Elastic's new re-ranker model was trained and how it performs. 
TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Vector Database Lucene +1 November 18, 2024 Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. BT By: Benjamin Trent ML Research Search Relevance October 29, 2024 What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "ML Research - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/ml-research", + "meta_description": "ML Research articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Developer Experience Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. 
TP By: Tyler Perkins Developer Experience March 3, 2025 Fast Kibana Dashboards From 8.13 to 8.17, the wait time for data to appear on a dashboard has improved by up to 40%. These improvements are validated both in our synthetic benchmarking environment and from metrics collected in real user’s cloud environments. TN By: Thomas Neirynck Developer Experience January 22, 2025 Engineering a new Kibana dashboard layout to support collapsible sections & more Building collapsible dashboard sections in Kibana required overhauling an embeddable system and creating a custom layout engine. These updates improve state management, hierarchy, and performance while setting the stage for new advanced dashboard features. TS HM NR By: Teresa Alvarez Soler , Hannah Mudge and Nathaniel Reese Python Javascript +2 October 30, 2024 Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. MG By: Miguel Grinberg 1 2 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Developer Experience - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/developer-experience", + "meta_description": "Developer Experience articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial How To Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo How To May 14, 2025 Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. KB By: Kofi Bartlett How To May 12, 2025 Excluding Elasticsearch fields from indexing Explaining how to configure Elasticsearch to exclude fields, why you might want to do this, and best practices to follow. KB By: Kofi Bartlett How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 7, 2025 Joining two indices in Elasticsearch Explaining how to use the terms query and the enrich processor for joining two indices in Elasticsearch. KB By: Kofi Bartlett How To May 5, 2025 Understanding Elasticsearch scoring and the Explain API Diving into the scoring mechanism of Elasticsearch and exploring the Explain API. KB By: Kofi Bartlett 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How To - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/how-to", + "meta_description": "How To articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Agent Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Agent How To March 28, 2025 Connect Agents to Elasticsearch with Model Context Protocol Let’s use Model Context Protocol server to chat with your data in Elasticsearch. JB JM By: Jedr Blaszyk and Joe McElroy Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Generative AI Agent +1 September 20, 2024 LangChain and Elasticsearch: Building LangGraph retrieval agent template Elasticsearch and LangChain collaborate on a new retrieval agent template for LangGraph for agentic apps JM AT SC By: Joe McElroy , Aditya Tripathi and Serena Chou Generative AI Vector Database +2 September 2, 2024 A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch This article will provide a detailed tutorial on implementing a local, reliable agent using LangGraph, combining concepts from Adaptive RAG, Corrective RAG, and Self-RAG papers, and integrating Langchain, Elasticsearch Vector Store, Tavily AI for web search, and LLaMA3 via Ollama. PR By: Pratik Rana Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Agent - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/rag-agent", + "meta_description": "Agent articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Search tier autoscaling in Elasticsearch Serverless Explore search tier autoscaling in Elasticsearch Serverless. Learn how autoscaling works, how the search load is calculated and more. Elastic Cloud Serverless MP JV By: Matteo Piergiovanni and John Verwolf On August 8, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. One of the key aspects of our new serverless offerings is allowing users to deploy and use Elastic without the need to manage the underlying project nodes. To achieve this, we developed search tier autoscaling, a strategy to dynamically choose node size and count based on a multitude of parameters that we will delve into in this blog. This innovation ensures that you no longer need to worry about under-provisioning or over-provisioning your resources. Whether you are dealing with fluctuating traffic patterns, unexpected data spikes, or gradual growth, search tier autoscaling seamlessly adapts the allocated hardware to the search tier dynamically based on search activity. Autoscaling is performed on a per project basis and is completely transparent to the end user. Introduction Elastic serverless is a fully-managed product from Elastic that enables you to deploy and use Elastic products without the need to manage the underlying Elastic infrastructure, but instead focussing on extracting the most out of your data. One of the challenges of self-managed infrastructure is dealing with the ever-evolving needs a customer faces. In the dynamic world of data management, flexibility and adaptability are crucial and traditional scaling methods often fall short and require manual adjustments that can be both time-consuming and imprecise. With search tier autoscaling, our serverless offering automatically adjusts resources to match the demand of your workload in real-time. The autoscaling described in this post is specific to the Elasticsearch project type within Elastic's serverless offering. Observability and security may have different autoscaling mechanisms tailored to their unique requirements. Another important piece of information needed before diving into the details of autoscaling is how we manage our data to achieve a robust and scalable infrastructure. We use S3 as the primary source of truth, providing reliable and scalable storage. To enhance performance and reduce latency, search nodes use a local cache to quickly access frequently requested data without repeatedly retrieving it from S3. 
This combination of S3 storage and caching by search nodes forms an efficient system, ensuring that both durable storage and fast data access fit our user’s demands effectively. Search tier autoscaling inputs To demonstrate how autoscaling works, we’ll dive into the various metrics that are used in order to make scaling decisions. When starting a new serverless Elasticsearch project, a user can choose two parameters that will influence how autoscaling behaves: Boost Window : defines a specific time range within which search data is considered boosted. Boosted Data : data that falls within the boost window is classified as boosted data. All time-based documents with a @timestamp within the boost window range and all non-time-based documents will fall in the boosted data category. This time-based classification allows the system to prioritize this data when allocating resources. Non-Boosted Data : data outside the boost window is considered non-boosted. This older data is still accessible but is allocated fewer resources compared to boosted data. Search Power : a range that controls the number of Virtual Compute Units (VCUs) allocated to boosted data in the project. Search power can be set to: Cost Efficient : limits the available cache size for boosted data prioritizing cost efficiency over performance. Well suited for customers wanting to store very large amounts of data at a low cost. Balanced : ensures enough cache for all boosted data for faster searches. Performance : provides more resources to respond quicker to a higher volume and more complex queries. The boost window will determine the amount of boosted and non-boosted data for a project. We define boosted data for a project as the amount of data within the boost window. The total size of boosted data, together with the lower end of the selected search power range will determine the base hardware configuration for a project. This method is favored over scaling to zero (or near zero) because it helps maintain acceptable latency for subsequent requests. This is achieved by retaining our cache and ensuring CPUs are immediately available to process incoming requests. This approach avoids the delays associated with provisioning hardware from CSP and ensures the system readiness to handle incoming requests promptly. Note that the base configuration can increase over time by ingesting more data or decrease if time series data falls out of the boost window. This is the first piece of autoscaling, where we provide a base hardware configuration that can adapt to a user’s boosted data over time. Load based autoscaling Autoscaling based on interactive data is only one piece of the puzzle. It does not account for the load placed on the Search Nodes by incoming search traffic. To this effect, we have introduced a new metric called search load . search load is a measure of the amount of physical resources required to handle the current search traffic. Search load accounts for the resource usage that the search traffic places on the nodes at a given time, and thus allows for dynamic autoscaling in response. What is search load? Search load is a measure of the amount of physical resources required to handle the current search traffic. We report this as a measure of the number of processors required per node. However, there is some nuance here. When scaling, we move up and down between hardware configurations that have set values of CPU, memory, and disk. These values are scaled together according to given ratios. 
For example, to obtain more CPU, we would scale to a node with a hardware configuration that also includes more memory and more disk. Search load indirectly accounts for these resources. It does so by using the time that search threads take within a given measurement interval. If the threads block while waiting for resources (IO), this also contributes to the threads’ execution time. If all the threads are 100% utilized in addition to queuing, this indicates the need to scale up. Conversely, if there is no queuing and the search thread pool is less than 100% utilized, this indicates that it is possible to scale down. How is search load calculated? Search load is composed of two factors: Thread Pool Load : number of processor cores needed to handle the search traffic that is being processed. Queue Load : number of processor cores needed to handle the queued search requests within an acceptable timeframe. To describe how the search load is calculated, we will walk through each aspect step-by-step to explain the underlying principles. We will start by describing the Thread Pool Load . First, we monitor the total execution time of the threads responsible for handling search requests within a sampling interval, called totalThreadExecutionTime . The length of this sampling interval is multiplied by the processor cores to determine the maximum availableTime . To obtain the threadUtilization percent, we divide the total thread execution time by this availableTime . For example, a 4 core machine with a 1s sampling interval would have 4 seconds of available time (4 cores * 1s). If the total task execution time is 2s, then this results in 50% thread pool utilization (2s / 4s = 0.5). We then multiply the threadUtilization percent by the numProcessors to determine the processorsUsed , which measures the number of processor cores used. We record this value via an exponential weighted moving average (a moving average that favors recent additions) to smooth out small bursts of activity. This results in the value used for threadPoolLoad . Next, we will describe how the Queue Load is determined. Central to the calculation, there is a configuration maxTimeToClearQueue that sets the maximum acceptable timeframe that a search request may be queued. We need to know how many tasks a given thread can execute within this timeframe, so we divide the maxTimeToClearQueue by the exponential weighted moving average of the search execution time. Next, we divide the searchQueueSize by this value to determine how many threads are needed to clear the queue within the configured time frame. To convert this to the number of processors required, we multiply this by the ratio of processorsPerThread . This results in the value used for the queueLoad . The search load for a given node is then the sum of both the threadPoolLoad and the queueLoad . Search load reporting Each Search Node regularly publishes load readings to the Master Node. This will occur either after a set interval, or if a large delta in the load is detected. The Master Node keeps track of this state separately for each Search Node, and performs bookkeeping in response to various lifecycle events. When Search Nodes are added/removed, the Master Node adds or removes their respective load entries. The Master Node also reports a quality rating for each entry: Exact , Minimum , or Missing . Exact means the metric was reported recently, while Missing is assigned when a search load has not yet been reported by a new node. 
Search load quality is considered Minimum when the Master Node has not received an update from the search load within a configured time period, e.g. if a node becomes temporarily unavailable. The quality is also reported as Minimum when a Search Node’s load value accounts for work that is not considered indicative of future work, such as downloading files that will be subsequently cached. Quality is used to inform scaling decisions. We disallow scaling down when the quality of any node is inexact. However, we allow scaling up regardless of the quality rating. The autoscaler The autoscaler is a component of Elastic serverless designed to optimize performance and cost by adjusting the size and number of nodes in a project based on real-time metrics. It monitors metrics from Elasticsearch, determines an ideal hardware configuration, and applies the configuration to the managed Kubernetes infrastructure. With an understanding of the inputs and calculations involved in search tier metrics, we can now explore how the autoscaler leverages this data to dynamically adjust the project node size and count for optimal performance and cost efficiency. The autoscaler monitors the search tier metrics every 5 seconds. When new metrics arrive for total interactive and non-interactive data size, together with the search power range, the autoscaler will then determine the range of possible hardware configurations. These configurations range from a minimum to a maximum, defined by the search power range. The autoscaler then uses the search load reported by Elasticsearch to select a “desired” hardware configuration within the available range that has at least the number of processor cores to account for the measured search load. This desired configuration serves as an input to a stabilization phase where the autoscaler decides if the chosen scale direction can be applied immediately; if not, it is discarded. There is a 15-minute stabilization window for scaling down, meaning 15 minutes of continuous scaling down events are required for a scale down to occur. There is no stabilization period for scaling up. Scaling events are non-blocking; therefore, we can continue to make scaling decisions while subsequent operations are still ongoing. The only limit to this is defined by the stabilization window described above. The configuration is then checked against the maximum number of replicas for an index in Elasticsearch to ensure there are enough search nodes to accommodate all the configured replicas. Finally, the configuration is applied to the managed Kubernetes infrastructure, which provisions the project size accordingly. Conclusion Search tier autoscaling revolutionizes the management of Elasticsearch serverless projects. By leveraging detailed metrics, the autoscaler ensures that projects are always optimally sized. With serverless, users can focus on their business needs without the worry of managing infrastructure or being caught unprepared when their workload changes. This approach not only enhances performance during high-demand periods, but also reduces costs during times of low activity, all while being completely transparent to the end user. As a result, users can focus more on their core activities without the constant worry of manually tuning their projects to meet evolving demands. This innovation marks a significant step forward in making Elasticsearch both powerful and user-friendly in the realm of serverless computing. Try it out! 
Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Introduction Search tier autoscaling inputs Load based autoscaling What is search load? How is search load calculated? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Search tier autoscaling in Elasticsearch Serverless - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-serverless-tier-autoscaling", + "meta_description": "Explore search tier autoscaling in Elasticsearch Serverless. Learn how autoscaling works, how the search load is calculated and more." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Hybrid retrieval In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores. Generative AI QH TV By: Quentin Herreros and Thomas Veasey On July 20, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch has native integrations to industry leading Gen AI tools and providers. 
Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In our last blog post , we introduced Elastic Learned Sparse Encoder, a model trained for effective zero-shot text retrieval. Elasticsearch ® also has great lexical retrieval capabilities and rich tools for combining the results of different queries. In this blog, we introduce the concept of hybrid retrieval and explore two concrete implementations available in Elasticsearch. In particular, we explore how to improve the performance of Elastic Learned Sparse Encoder by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores. We also discuss experiments we undertook to explore some general research questions. These include how best to parameterize Reciprocal Rank Fusion and how to calibrate Weighted Sum of Scores. Hybrid retrieval Despite modern training pipelines producing retriever models with good performance in zero-shot scenarios, it is known that lexical retrievers (such as BM25) and semantic retrievers (like Elastic Learned Sparse Encoder) are somewhat complementary. Specifically, combining the results of retrieval methods will improve relevance if one assumes that more matches occur between the relevant documents they retrieve than between the irrelevant documents they retrieve. This hypothesis is plausible for methods using very different mechanisms for retrieval because there are many more irrelevant than relevant documents for most queries and corpuses. If methods retrieve relevant and irrelevant documents independently and uniformly at random, this imbalance means it is much more probable for relevant documents to match than irrelevant ones. We performed some overlap measurements to check this hypothesis between Elastic Learned Sparse Encoder, BM25, and various dense retrievers as shown in Table 1. This provides some rationale for using so-called hybrid search. In the following, we investigate two explicit implementations of hybrid search. Reciprocal Rank Fusion Reciprocal Rank Fusion was proposed in this paper. It is easy to use, being fully unsupervised and not even requiring score calibration. It works by ranking a document d with both BM25 and a model, and calculating its score based on the ranking positions for both methods. Documents are sorted by descending score. The score is defined as follows: score(d) = sum over each retrieval method m of 1 / (k + rank_m(d)). The method uses a constant k to adjust the importance of lowly ranked documents. It is applied to the top N document set retrieved by each method. If a document is missing from this set for either method, that term is set to zero. The paper that introduces Reciprocal Rank Fusion suggests a value of 60 for k and doesn’t discuss how many documents N to retrieve. Clearly, ranking quality can be affected by increasing N while recall@N is increasing for either method. Qualitatively, the larger k, the more important lowly ranked documents are to the final order. However, it is not a priori clear what would be optimal values of k and N for modern lexical semantic hybrid retrieval. Furthermore, we wanted to understand how sensitive the results are to the choice of these parameters and if the optimum generalizes between data sets and models. This is important to have confidence in the method in a zero-shot setting. 
To explore these questions, we performed a grid search to maximize the weighted average NDCG@10 for a subset of the BEIR benchmark for a variety of models. We used Elasticsearch for retrieval in this experiment representing each document by a single text field and vector. The BM25 search was performed using a match query and dense retrieval using exact vector search with a script_score query. Referring to Table 2, we see that for roberta-base-ance-firstp optimal values for k and N are 20 and 1000, respectively. We emphasize that for the majority of individual data sets, the same combination of parameters was optimal . We did the same grid search for distilbert-base-v3 and minilm-l12-v3 with the same conclusion for each model. It is also worth noting that the difference between the best and worst parameter combinations is only about 5%; so the penalty for mis-setting these parameters is relatively small. We also wanted to see if we could improve the performance of Elastic Learned Sparse Encoder in a zero-shot setting using Reciprocal Rank Fusion. The results on the BEIR benchmark are given in Table 3. Reciprocal Rank Fusion increases average NDCG@10 by 1.4% over Elastic Learned Sparse Encoder alone and 18% over BM25 alone. Also, importantly the result is either better or similar to BM25 alone for all test data sets. The improved ranking is achieved without the need for model tuning, training data sets, or specific calibration. The only drawback is that currently the query latency is increased as the two queries are performed sequentially in Elasticsearch. This is mitigated by the fact that BM25 retrieval is typically faster than semantic retrieval. Our findings suggest that Reciprocal Rank Fusion can be safely used as an effective “plug and play” strategy. Furthermore, it is worth reviewing the quality of results one obtains with BM25, Elastic Learned Sparse Encoder and their rank fusion on your own data. If one were to select the best performing approach on each individual data set in the BEIR suite, the increase in average NDCG@10 is, respectively, 3% and 20% over Elastic Learned Sparse Encoder and BM25 alone. As part of this work, we also performed some simple query classification to distinguish keyword and natural question searches. This was to try to understand the mechanisms that lead to a given method performing best. So far, we don’t have a clear explanation for this and plan to explore this further. However, we did find that hybrid search performs strongly when both methods have similar overall accuracy. Finally, Reciprocal Rank Fusion can be used with more than two methods or could be used to combine rankings from different fields. So far, we haven’t explored this direction. Weighted Sum of Scores Another way to do hybrid retrieval supported by Elasticsearch is to combine BM25 scores and model scores using a linear function. This approach was studied in this paper , which showed it to be more effective than Reciprocal Rank Fusion when well calibrated. We explored hybrid search via a convex linear combination of scores defined as follows: where α is the model score weight and is between 0 and 1. Ideal calibration of linear combination is not straightforward, as it requires annotations similar to those used for fine-tuning a model. Given a set of queries and associated relevant documents, we can use any optimization method to find the optimal combination for retrieving those documents. 
In our experiments, we used BEIR data sets and Bayesian optimization to find the optimal combination, optimizing for NDCG@10. In theory, the ratio of score scales can be incorporated into the value learned for α. However, in the following experiments, we normalized BM25 scores and Elastic Learned Sparse Encoder scores per data set using min-max normalization , calculating the minimum and maximum from the top 1,000 scores for some representative queries on each data set. The hope was that with normalized scores the optimal value of α transfers. We didn’t find evidence for this, but the optimal value is much more consistent with normalized scores, so normalization does likely improve the robustness of the calibration. Obtaining annotations is expensive, so it is useful to know how much data to gather to be confident of beating Reciprocal Rank Fusion (RRF). Figure 1 shows the NDCG@10 for a linear combination of BM25 and Elastic Learned Sparse Encoder scores as a function of the number of annotated queries for the ArguAna data set. For reference, the BM25, Elastic Learned Sparse Encoder and RRF NDCG@10 are also shown. This sort of curve is typical across data sets. In our experiments, we found that it was possible to outperform RRF with approximately 40 annotated queries, although the exact threshold varied slightly from one data set to another. We also observed that the optimal weight varies significantly both across different data sets (see Figure 2) and also for different retrieval models. This is the case even after normalizing scores. One might expect this because the optimal combination will depend on how well the individual methods perform on a given data set. To explore the possibility of a zero-shot parameterisation, we experimented with choosing a single weight α for all data sets in our benchmark set. Although we used the same supervised approach to do this, this time choosing the weight to optimize average NDCG@10 for the full suite of data sets, we feel that there is enough variation between data sets that our findings may be representative of zero-shot performance. In summary, this approach yields better average NDCG@10 than RRF. However, we also found the results were less consistent than RRF and we stress that the optimal weight is model specific . For this reason, we feel less confident the approach transfers to new settings even when calibrated for a specific model. In our view, linear combination is not a “plug and play” approach. Instead, we believe it is important to carefully evaluate the performance of the combination on your own data set to determine the optimal settings. However, as we will see below, if it is well calibrated it yields very good results. Normalization is essential for comparing scores between different data sets and models, as scores can vary a lot without it. It is not always easy to do, especially for Okapi BM25, where the range of scores is unknown until queries are made. Dense model scores are easier to normalize, as their vectors can be normalized. However, it is worth noting that some dense models are trained without normalization and may perform better with dot products. Elastic Learned Sparse Encoder is trained to replicate cross-encoder score margins. We typically see it produce scores in the range 0 to 20, although this is not guaranteed. In general, a query history and their top N document scores can be used to approximate the distribution and normalize any scoring function with minimum and maximum estimates. 
We note that the non-linear normalization could lead to improved linear combination, for example if there are score outliers, although we didn’t test this. As for Reciprocal Rank Fusion, we wanted to understand the accuracy of a linear combination of BM25 and Elastic Learned Sparse Encoder — this time, though, in the best possible scenario. In this scenario, we optimize one weight α per data set to obtain the ideal NDCG@10 using linear combination. We used 300 queries to calibrate — we found this was sufficient to estimate the optimal weight for all data sets. In production, this scenario is realistically difficult to achieve because it needs both accurate min-max normalization and a representative annotated data set to adjust the weight. This would also need to be refreshed if the documents and queries drift significantly. Nonetheless, bounding the best case performance is still useful to have a sense of whether the effort might be worthwhile. The results are displayed in Table 4. This approach gives a 6% improvement in average NDCG@10 over Elastic Learned Sparse Encoder alone and 24% improvement over BM25 alone. Conclusion We showed it is possible to combine different retrieval approaches to improve their performance and in particular lexical and semantic retrieval complement one another. One approach we explored was Reciprocal Rank Fusion. This is a simple method that often yields good results without requiring any annotations nor prior knowledge of the score distribution. Furthermore, we found its performance characteristics were remarkably stable across models and data sets, so we feel confident that the results we observed will generalize to other data sets. Another approach is Weighted Sum of Scores, which is more difficult to set up, but in our experiments yielded very good ranking with the right setup. To use this approach, scores should be normalized, which for BM25 requires score distributions for typical queries, furthermore some annotated data should be used for training the method weights. In our final planned blog in this series, we will introduce the work we have been doing around inference and index performance as we move toward GA for the text_expansion feature. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3 : Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. 
TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Hybrid retrieval Reciprocal Rank Fusion Weighted Sum of Scores Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack: Hybrid retrieval - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/improving-information-retrieval-elastic-stack-hybrid", + "meta_description": "In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Stateless: Data safety in a stateless world We discuss the data durability guarantees in stateless including how we fence new writes and deletes with a safety check which prevents stale nodes from acknowledging new writes or deletes Elastic Cloud Serverless HA By: Henning Andersen On September 6, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. In recent blog posts, we announced the stateless architecture that underpins our Elastic Cloud Serverless offering. By offloading durability guarantees and replication to an object store (e.g., Amazon S3), we gain many advantages and simplifications. Historically, Elasticsearch has relied upon local disk persistence for data safety and handling stale or isolated nodes. In this blog, we will discuss the data durability guarantees in stateless including how we fence new writes and deletes with a safety check which prevents stale nodes from unsafely acknowledging these operations. In the following blog post, we will cover the basics of the durability promise and how Elasticsearch uses an operation log (translog) to be able to quickly and safely acknowledge writes to clients. Next we will dive into the problem, introduce concepts that help us, and finally explain the additional safety check that makes us able to confidently acknowledge writes to clients. 
Durability promise and translog When clients write data to Elasticsearch, for instance using the _bulk API, Elasticsearch will provide an HTTP response code for the request. Elasticsearch will only provide a successful HTTP response code (200/201) when data has been safely stored. We use an operation log (called translog) where requests are appended and stored before acknowledging the write. The translog allows us to replay operations that have not been successfully persisted to the underlying Lucene index (for instance if a node crashed after we acknowledged the write to the client). For more information on the translog and Lucene indices, see this section in our recent blog post on thin indexing shards , where we explain how we now store Lucene indices and the translog in the object store. Not knowing is the worst - the problem(s) The master allocates a shard to an indexing node, that then owns indexing incoming data into that shard. However, we must account for scenarios where this node falls out of communication with the master and/or rest of the cluster. In such cases, the master will (after timeouts) assume the node is no longer operational and reassign affected shards to other nodes. The prior assignment would now be considered stale. A stale node may still be operational attempting to index and persist data it receives. In this scenario, with potentially two owners of a shard trying to acknowledging writes but out of communication with each other, we have two problems to solve: Avoiding file overwrites in the object store Ensuring that acknowledged writes are not lost Primary terms for stateless- an increasing number to the rescue Elasticsearch has for many years utilized something we call primary terms. Whenever a primary shard is assigned to a node, it is given a primary term for the allocation. If a primary shard fails or goes from unassigned to assigned, the master will increment the primary term before reassigning the primary shard. This gives a strict order of primary shard assignments and ownership, higher primary terms were assigned after lower primary terms. For stateless, we utilize primary terms in the path of index files we write to the object store to ensure that the first problem described above cannot happen. If a shard fails and is reassigned, we know it will have a higher primary term. A shard will only write files in the primary term specific path, thus there is no chance of an older shard assignment and a newer shard assignment writing the same files. They simply write to different paths. The primary term is also used to ultimately provide the durability guarantee, more on that later. Notice that primary shard relocations do not increment the primary term, instead the two nodes involved in a primary shard relocation hand off the ownership through an explicit protocol. Coordination term and node-left generation in stateless The coordination subsystem in Elasticsearch is a strongly consistent mechanism used for cluster level coordination, including cluster membership and cluster metadata (all known as cluster state). In stateless, this system also builds on top of the object store, uploading new cluster state versions. Like in stateful, it maintains an increasing number for elections called “term” (we’ll call it coordination term here to disambiguate it from the primary term described in the previous section). 
Whenever a node decides to start a new election, it will do so in a new coordination term, higher than any previous terms seen (more details on how this works in stateful in the blog post here ). In stateless, the election happens through an object store file we call the lease file. This file contains the coordination term and the node that claims that term is the elected master for the term. This file will help the safety check we are interested in here. If the coordination term is still the same, we know the elected master did not change. Just the coordination term is not enough though, since this does not necessarily change if a node leaves the cluster. In order to detect that a data node has not left the cluster, we also add the node-left generation to the lease file. This is an increasing number, incremented every time a node leaves the cluster. It resets from zero when the term changes (but we can disregard that for the story here). The lease file is written to the object store as part of persisting a new cluster state. This write happens before any actions (like shard recovery) are otherwise taken based on the new cluster state. Object store read after write semantics in stateless We use the object store to store all data in stateless and the visibility guarantees of the object store are therefore important to consider. Ultimately, the safety check builds on top of those guarantees. Following are the main object store visibility guarantees that we rely on: Read-after-write: after a successful write, any read will return the new contents. List-after-write: after a successful write, any listing matching the new file will return the file. These were not a given years ago, but are available today across AWS S3, GCP and Azure blob storage. Stateless: The safety check Having the necessary building blocks described above, we can now move on to the actual safety check and safety argumentation. While the translog guarantees durability of writes, we need to ensure that the node is still the assigned indexing node prior to acknowledging the write. The source of truth for that is in cluster state and the data node therefore needs to establish that it has a new enough cluster state in order to determine whether it is safe to acknowledge the write. We are only interested in non-graceful events like node crashes, network partitions and similar. Graceful events like shard relocations are handled through explicit hand-offs that guarantee their correctness (we'll not dive into this in this blog post). Let us consider an ungraceful event, for instance where the master node detects that a data node that holds a shard is no longer responding and it thus ejects the node from the cluster. We'll examine the safety check in this context and see how it avoids that a stale node potentially incorrectly acknowledges a write to client. The safety check adds one additional check before responding to the client: Read the lease file from the object store. If the coordination term or node-left generation has advanced past the values in the node's local cluster state, it cannot rely on the cluster state until it receives an updated version with a higher or equal coordination term and node-left generation. With a new enough cluster state, it can be used to check whether the primary term of the shard has changed. If it has changed, the write will fail. The happy path will incur no waiting here, since the term and node-left generation changes very infrequently relative to a normal write request frequency. 
The overhead of this check is thus small. Notice that the ordering is important: the translog file is successfully uploaded before the safety check. We’ll see why shortly. The ungraceful node-left event leads to an increment of the node-left generation in the lease file. Afterwards, a new node may be assigned the shard and start recovering data (this may be just one cluster state update, but the ordering of the lease file write and a node starting recovery is the only important part here and is guaranteed). The newly assigned node will then read the shard data and recover the data contained in translog. We see that we have the following ordering of events: Original data node writes translog before reading lease file Master writes lease file with incremented node-left generation before new data node starts recovering and thus before reading the translog Object store guarantees read-after-write on the lease file and translog files. There are two main situations to consider: The original data node wrote the translog file and read a lease file indicating it is still in the cluster and owner of the shard (primary term did not change). We then know that the master did not successfully update the lease file prior to the data node reading it. Therefore, the write to the translog by the original data node happens before the read of the translog by the new node assignment, guaranteeing that the operations will be available to the new node for recovery. The original data node wrote the translog file, but after possibly waiting for a new cluster state based on the information in the lease file, it is no longer the owner of the shard (making it fail the write request). We do not respond successfully to the write request, thus do not promise durability. The translog data might be available to the new node assignment during recovery, but that is fine. It is ok for a failed request to actually have persisted data durably. We thus see that any write that Elasticsearch has successfully responded to will be available for any future owners of the same shard, concluding our safety argumentation. Similarly, we can argue that a master failover case is safe. Here the coordination term rather than the node-left generation will change. We will not go through that here. This same safety check is used in a number of other critical situations: During index file deletion. When Lucene merges segments, old segments can be deleted. We add a safety check here to protect against deleting files that a newer node assignment needs. During translog file deletion. Translogs can be deleted when the index data in the object store contains all the operations. Again, we add a safety check here to protect against deleting translog files that a newer node assignment needs. Conclusion Congratulations, you made it to the end, hopefully you enjoyed the deep dive here. We described a novel mechanism for ensuring that Elasticsearch durably and safely persists writes to an object store, also in the presence of any kind of disruption causing Elasticsearch to otherwise have two nodes owning indexing into the same shard. We care deeply about such aspects and if you do too, perhaps take a look at our open job offerings . Shout out to David Turner, Francisco Fernández Castaño and Tim Brooks who did most of the real work here. 
Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Durability promise and translog Not knowing is the worst - the problem(s) Primary terms for stateless- an increasing number to the rescue Coordination term and node-left generation in stateless Object store read after write semantics in stateless Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Stateless: Data safety in a stateless world - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/data-safety-stateless-elasticsearch", + "meta_description": "Explore the data durability and safety guarantees in Elasticsearch stateless, including how we fence new writes and deletes with a safety check." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. Developer Experience Javascript How To JR By: Jeffrey Rengifo On May 15, 2025 Part of Series Elasticsearch in JavaScript the proper way Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. 
Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This is the first article of a series that covers how to use Elasticsearch with JavaScript. In this series, you’ll learn the basics of how to use Elasticsearch in a JavaScript environment and review the most relevant features and best practices to create a search app. By the end, you’ll know everything you need to run Elasticsearch using JavaScript. In this first part, we will review: Environment Frontend, backend, or serverless? Connecting the client Indexing documents Elasticsearch client Semantic mappings Bulk helper Searching data Lexical Query Semantic Query Hybrid Query You can check the source code with the examples here . What is the Elasticsearch Node.js client? The Elasticsearch Node.js client is a JavaScript library that puts the HTTP REST calls from the Elasticsearch API into JavaScript. This makes it easier to handle and to have helpers that simplify tasks like indexing documents in batches. Environment Frontend, backend, or serverless? To create our search app using the JavaScript client, we need at least two components: an Elasticsearch cluster and a JavaScript runtime to run the client. The JavaScript client supports all Elasticsearch solutions (Cloud, on-prem, and Serverless), and there are no major differences among them since the client handles all variations internally, so you don’t need to worry about which one to use. The JavaScript runtime, however, must be run from the server and not directly from the browser. This is because when calling Elasticsearch from the browser, the user can get sensitive information like the cluster API key, host, or the query itself. Elasticsearch recommends never exposing the cluster directly to the internet and using an intermediate layer that abstracts all this information so that the user can only see parameters. You can read more about this topic here . We suggest using a schema like this: In this case, the client only sends the search terms and an authentication key for your server while your server is in total control over the query and communication with Elasticsearch. Connecting the client Start by creating an API key following these steps . Following the previous example, we’ll create a simple Express server, and we’ll connect to it using a client from a Node.JS server. We’ll initialize the project with NPM and install the Elasticsearch client and Express. The latter is a library to bring up servers in Node.js. Using Express, we can interact with our backend via HTTP. Let’s initialize the project: npm init -y Install dependencies: npm install @elastic/elasticsearch express split2 dotenv Let me break it down for you: @elastic/elasticsearch : It is the official Node.js client express : It will allow us to spin a lightweight nodejs server to expose Elasticsearch split2 : Splits lines of text into a stream. Useful to process our ndjson files one line at a time dotenv : Allow us to manage environment variables using a .env file Create a .env file at the root of the project and add the following lines: This way, we can import those variables using the dotenv package. Create a server.js file: This code sets up a basic Express.js server that listens on port 3000 and connects to an Elasticsearch cluster using an API key for authentication. It includes a /ping endpoint that when accessed via a GET request, queries the Elasticsearch cluster for basic information using the .info() method of the Elasticsearch client. 
If the query is successful, it returns the cluster info in JSON format; otherwise, it returns an error message. The server also uses body-parser middleware to handle JSON request bodies. Run the file to bring up the server: node server.js The answer should look like this: And now, let’s consult the endpoint /ping to check the status of our Elasticsearch cluster. Indexing documents Once connected, we can index documents using mappings like semantic_text for semantic search and text for full-text queries. With these two field types, we can also do hybrid search . We’ll create a new load.js file to generate the mappings and upload the documents. Elasticsearch client We first need to instantiate and authenticate the client: Semantic mappings We’ll create an index with data about a veterinary hospital. We’ll store the information from the owner, the pet, and the details of the visit. The data on which we want to run full-text search, such as names and descriptions, will be stored as text. The data from categories, like the animal’s species or breed, will be stored as keywords. Additionally, we’ll copy the values of all fields into a semantic_text field to be able to run semantic search against that information too. Bulk helper Another advantage of the client is that we can use the bulk helper to index in batches. The bulk helper allows us to easily handle things like concurrence, retries, and what to do with each document that goes through the function and that succeeds or fails. An attractive feature of this helper is that you can work with streams. This function allows you to send a file line by line instead of storing the complete file in the memory and sending it to Elasticsearch in one go. To upload the data to Elasticsearch, create a file called data.ndjson in the project’s root and add the information below (alternatively, you can download the file with the dataset from here ): We use split2 to stream the file lines while the bulk helper sends them to Elasticsearch. The code above reads a .ndjson file line by line and bulk indexes each JSON object into a specified Elasticsearch index using the helpers.bulk method. It streams the file using createReadStream and split2 , sets up indexing metadata for each document, and logs any documents that fail to process. Once complete, it logs the number of successfully indexed items. Alternatively to the indexData function, you can upload the file directly via UI using Kibana, and use the upload data files UI. We run the file to upload the documents to our Elasticsearch cluster. node load.js Searching data Going back to our server.js file, we’ll create different endpoints to perform lexical, semantic, or hybrid search. In a nutshell, these types of searches are not mutually exclusive, but will depend on the kind of question you need to answer. Query type Use case Example question Lexical query The words or word roots in the question are likely to show up in the index documents. Token similarity between question and documents. I’m looking for a blue sport t-shirt. Semantic query The words in the question are not likely to show up in the documents. Conceptual similarity between question and documents. I’m looking for clothing for cold weather. Hybrid search The question contains lexical and/or semantic components. Token and semantic similarity between question and documents. I’m looking for an S size dress for a beach wedding. 
The lexical parts of the question are likely to be part of titles and descriptions, or category names, while the semantic parts are concepts related to those fields. Blue will probably be a category name or part of a description, and beach wedding is not likely to be, but can be semantically related to linen clothing. Lexical query (/search/lexic?q=) Lexical search, also called full-text search, means searching based on token similarity; that is, after an analysis, the documents that include the tokens in the search will be returned. You can check our lexical search hands-on tutorial here . We test with: nail trimming Answer: Semantic query (/search/semantic?q=) Semantic search, unlike lexical search, finds results that are similar to the meaning of the search terms through vector search. You can check our semantic search hands-on tutorial here . We test with: Who got a pedicure? Answer: Hybrid query (/search/hybrid?q=) Hybrid search allows us to combine semantic and lexical search, thus getting the best of both worlds: you get the precision of searching by token, together with the meaning proximity of semantic search. We test with “ Who got a pedicure or dental treatment?\" Response: Conclusion In this first part of our series, we explained how to set up our environment and create a server with different search endpoints to query the Elasticsearch documents following the client/server best practices. Check out part two of our series, in which you’ll learn production best practices and how to run the Elasticsearch Node.js client in Serverless environments. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is the Elasticsearch Node.js client? Environment Frontend, backend, or serverless? Connecting the client Indexing documents Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch in JavaScript the proper way, part I - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/how-to-use-elasticsearch-in-javascript-part-i", + "meta_description": "Explaining how to create a production-ready Elasticsearch backend in JavaScript." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. Developer Experience ET LP By: Elastic Team and Logan Pashby On May 6, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. “When I first typed ‘drone’ into the search and saw results for ‘unmanned aerial vehicles’ without synonyms, I was like, ‘Wow, this thing really gets it.’ That’s when it clicked—it genuinely felt like magic.” — Logan Pashby, Principal Engineer, Cypris.ai Relevance at scale: Cypris’ search story Cypris is a platform that helps R&D and innovation teams navigate a massive dataset of patents and research papers of over 500 million documents. Their mission is to make it easier to track innovation, find prior art, and understand the organizations driving new technologies. But there was a problem. To get relevant results, users had to write complex boolean queries—which was fine for expert users, but a barrier for many others. Cypris needed a way to make search more intuitive and accessible. The answer was semantic search powered by vector similarity. However, they discovered that scaling semantic search over a large corpus turned out to be a tough engineering problem. Handling 500 million high dimensional vectors wasn’t just a matter of pushing them into a system and hitting “search.” “When we first indexed all 500 million vectors, we were looking at 30- to 60-second query times in the worst case.” It would require a series of carefully considered trade-offs between model complexity, hardware resources, and indexing strategy. Logan Pashby is a Principal Engineer at Cypris, where he focuses on the platform's innovation intelligence features. With expertise in topics such as deep learning, distributed systems, and full-stack development, Logan solves complex data challenges and develops efficient search solutions for R&D and IP teams. Choosing the right model Cypris’ first attempt at vector search used 750-dimensional embeddings for every document, but they quickly realized scaling such large embeddings across 500 million documents would be unmanageable. By using the memory approximation formula without quantization, the estimated bytes of RAM required would be around 1500 GB, making it clear that they needed to adjust their strategy. “We assumed, and we hoped, that the larger the dimension of the vector, the more information we could encode. A richer embedding space should mean better search relevance.” They considered using sparse vectors like Elastic’s ELSER which avoids the fixed-dimension limitations of dense embeddings by representing documents as weighted lists of tokens instead. 
However, at the time, ELSER’s CPU-only inference seemed too slow for Cypris’s dataset. Dense vectors, on the other hand, let them leverage off-cluster GPU acceleration, which improved throughput by 10x to 50x when generating embeddings. Cypris’ setup included an external GPU based service to compute vectors which were then indexed into Elasticsearch. The team ultimately decided on lower-dimensional dense vectors that struck a balance: they were compact enough to make indexing and search feasible, yet rich enough to maintain relevance in results. Making it work with production scale data Challenges - disk space Once Cypris had vectors ready to be indexed, they faced the next hurdle: efficiently storing and searching over them in Elasticsearch . The first step was reducing disk space. “At the end of the day, vectors are just arrays of floats.... But when you have 500 million of them, the storage requirements add up quickly.” By default, vectors in Elasticsearch are stored multiple times: first in the _source field (the original JSON document), then in doc_values (columnar storage optimized for retrieval), and finally within the HNSW graph itself. Given that each 750-dimensional float32 vector takes about 3KB, storing 500 million vectors quickly becomes problematic, potentially exceeding 1.5 terabytes per storage layer. One practical optimization Cypris used was excluding vectors from the source document in Elasticsearch. This helped reduce overhead, but it turned out disk space wasn’t the biggest challenge. The bigger challenge was memory management. Did You Know? Elasticsearch allows you to optimize disk space by excluding vectors from the source document. This can significantly reduce storage costs, especially when dealing with large datasets. However, be aware that excluding vectors from the source will impact reindexing performance. For more details, check out the Elasticsearch documentation on source filtering . Challenges - RAM explosion k-nearest neighbor (kNN) search in Elasticsearch relies on HNSW graphs, which perform best when fully loaded into RAM. With 500 million high-dimensional vectors, there were significant memory demands on the system. “Trying to fit all of those vectors in memory at query time was not an easy thing to do,” Logan adds. Cypris had to juggle multiple memory requirements: the vectors and their HNSW graphs needed to reside in off-heap memory for fast search performance, while the JVM heap had to remain available for other operations. On top of that, they still needed to support traditional keyword search, and the associated Elasticsearch inverted index would need to stay in memory as well. Managing memory with dimensionality reduction, quantization, and segments Cypris explored multiple approaches to better manage memory and storage; here are three that worked well: Lower-dimensional vectors : The Cypris team swapped to using a smaller model that reduced vector sizes, thereby lowering resource requirements. BBQ (Better Binary Quantization) : Cypris was considering int8 quantization, but when Elastic released BBQ, Cypris adopted it quickly. “We tested it out and it didn’t have a huge hit to relevance and was significantly cheaper. So we implemented it right away”, says Logan . BBQ immediately reduced the size of their vector indexes by around 20% ! Did You Know? Elasticsearch’s Better Binary Quantization (BBQ) can reduce the size of vector indexes by ~20%, with minimal impact on search relevance. 
BBQ reduces both disk usage—by shrinking index size—and memory usage, since smaller vectors take up less space in RAM during searches. It’s especially helpful when scaling KNN search with HNSW graphs, where keeping everything in memory is critical for performance. Explore how BBQ can optimize your search infrastructure in the Elasticsearch documentation on vector search. Segment and shard tuning: Cypris also optimized how Elasticsearch segments and shards were managed. HNSW graphs are built per segment, so searching dense vectors means querying across all segments in a shard. As Logan explains: “HNSW graphs are independent within each segment and each dense vector field search involves finding the nearest neighbors in every segment, making the total cost dependent on the number of segments.” Fewer segments generally mean faster searches—but aggressively merging them can slow down indexing. Since Cypris ingests new documents daily, they regularly force-merge segments to keep them slightly below the default 5GB threshold, preserving automatic merging and tombstone garbage collection. To balance search speed with indexing throughput, force-merging occurs during low-traffic periods, and shard sizes are maintained within a healthy range (below 50GB) to optimize performance without sacrificing ingestion speed. More vectors, faster searches, happy users With these optimizations, Cypris brought query times down from 30–60 seconds to 5–10 seconds . They are also seeing 60–70% of their user queries shift from the previous boolean search experience to the new semantic search interface. But the team is not stopping here! The goal is to achieve sub-second queries to support fast, iterative search and get most of their users to shift to semantic search. Cypris’ product handles 500M docs (or about 7TB+ data), providing real-time AI search and retrieval, and supports 30% quarterly company growth. The product significantly accelerated search use cases, cutting report generation from weeks down to minutes. What did the Cypris team learn? … and what’s next? 500 million vectors don’t scale themselves Handling 500 million vectors isn’t just a storage problem or a search problem—it’s both. Cypris had to balance search relevance, hardware resources, and indexing performance at every step. Did you know Elasticsearch's _search API includes a profile feature that allows you to analyze the execution time of search queries. This can help identify bottlenecks and optimize query performance. By enabling profiling, you can gain insights into how different components of your query are processed. Learn more about using the profile feature in the Elasticsearch search profiling documentation . With search, there’s always a trade-off BBQ was a major win, but it didn’t eliminate the need to rethink sharding, memory allocation, and indexing strategy. Reducing the number of segments improved search speed, but made indexing slower. Excluding vectors from the source reduced disk space but complicated reindexing, as Elasticsearch doesn’t retain the original vector data needed to efficiently recreate the index. Every optimization came with a cost that had to be carefully weighed. Prioritize your users, not the model Cypris didn’t chase the largest models or highest dimension vectors. They focused on what made sense for their users, and working backwards. “Figure out what relevance means for your data,” Logan advises. 
“And work backward from there.” Cypris is now expanding to other datasets, which could double the number of documents they have to index in elastic. They need to move quickly to stay competitive, “We’re a small team,” Logan says. “So everything we do has to scale—and it has to work.” To learn more, visit cypris.ai Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Jump to Relevance at scale: Cypris’ search story Choosing the right model Making it work with production scale data Challenges - disk space Challenges - RAM explosion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/building-hybrid-search-at-cypris", + "meta_description": "Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Connect Agents to Elasticsearch with Model Context Protocol Let’s use Model Context Protocol server to chat with your data in Elasticsearch. Agent How To JB JM By: Jedr Blaszyk and Joe McElroy On March 28, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
What if interacting with your data was as effortless as chatting with a colleague? Imagine simply asking, \"Show me all orders over $500 from last month\" or \"Which products received the most 5-star reviews?\" and getting instant, accurate answers, no querying required. Model Context Protocol (MCP) makes this possible. It seamlessly connects conversational AI with your databases and external APIs, transforming complex requests into natural conversations. While modern LLMs are great at understanding language, their true potential is unlocked when integrated with real-world systems. MCP bridges the gap between them, making data interaction more intuitive and efficient. In this post, we’ll explore: MCP architecture – How it works under the hood Benefits of an MCP server connected to Elasticsearch Building an Elasticsearch-powered MCP server Exciting times ahead! MCP's integration with your Elastic stack transforms how you interact with information, making complex queries as intuitive as everyday conversation. Model Context Protocol Model Context Protocol (MCP), developed by Anthropic, is an open standard that connects AI models to external data sources through secure, bidirectional channels. It solves a major AI limitation: real-time access to external systems while preserving conversation context. MCP architecture Model Context Protocol architecture consists of two key components: MCP Clients – AI assistants and chatbots that request information or execute tasks on behalf of users. MCP Servers – Data repositories, search engines, and APIs that retrieve relevant information or perform requested actions (e.g., calling external APIs). MCP Servers expose two primary capabilities to Clients: Resources - Structured data, documents, and content that can be retrieved and used as context for LLM interactions. This allows AI assistants to access relevant information from databases, search indexes, or other sources. Tools - Executable functions that enable LLMs to interact with external systems, perform computations, or take real-world actions. These tools extend AI capabilities beyond text generation, allowing assistants to trigger workflows, call APIs, or manipulate data dynamically. Prompts - Reusable prompt templates and workflows to standardize and share common LLM interactions. Sampling - Request LLM completions through the client to enable sophisticated agentic behaviors while maintaining security and privacy. MCP server + Elasticsearch Traditional Retrieval-Augmented Generation (RAG) systems retrieve documents based on user queries, but MCP takes it a step further: it enables AI agents to dynamically construct and execute tasks in real time. This allows users to ask natural language questions like: \"Show me all orders over $500 from last month.\" \"Which products received the most 5-star reviews?\" And get instant, precise answers, without writing a single query. MCP achieves this through: Dynamic tool selection – Agents intelligently choose the right tools exposed via MCP servers based on user intent. “Smarter” LLMs are generally better at selecting the right tools with the appropriate arguments based on context. Bidirectional communication – Agents and data sources exchange information fluidly, refining queries as needed (e.g. lookup index mapping first, only then construct the ES query). Multi-tool orchestration – Workflows can leverage tools from multiple MCP servers simultaneously. Persistent context – Agents remember previous interactions, maintaining continuity across conversations. 
An MCP server connected to Elasticsearch unlocks a powerful real-time retrieval architecture. AI agents can explore, query, and analyze Elasticsearch data on demand. Your data can be searchable through a simple chat interface. Beyond just retrieving data, MCP enables action. It integrates with other tools to trigger workflows, automate processes, and feed insights into analytics systems. By separating search from execution, MCP keeps AI-powered applications flexible, up-to-date, and seamlessly integrated into agentic workflows. Hands on: MCP server to chat with your Elasticsearch data To interact with Elasticsearch via an MCP server, we need at least functions to: Retrieve indices Obtain mappings Perform searches using Elasticsearch’s Query DSL Our server is written in TypeScript, and we will be using the official MCP TypeScript SDK . For setup, we recommend installing the Claude Desktop App (the free version is sufficient) since it includes a built-in MCP Client. Our MCP server essentially exposes the official JavaScript Elasticsearch client through MCP tools. Let’s start by defining the Elasticsearch client and MCP server: We will use the following MCP server tools that can interact with Elasticsearch: List Indices ( list_indices ): This tool retrieves all available Elasticsearch indices, providing details such as index name, health status, and document count. Get Mappings ( get_mappings ): This tool fetches the field mappings for a specified Elasticsearch index, helping users understand the structure and data types of stored documents. Search ( search ): This tool executes an Elasticsearch search using a provided Query DSL. It automatically enables highlights for text fields, making it easier to identify relevant search results. The full Elasticsearch MCP server implementation is available in the elastic/mcp-server-elasticsearch repo. Chat with your index Let's explore how to set up the Elasticsearch MCP server so you can ask natural language questions about your data, such as \"Find all orders over $500 from last month.\" Configure your Claude Desktop App Open the Claude Desktop App Navigate to Settings > Developer > MCP Servers Click \"Edit Config\" and add this configuration to your claude_desktop_config.json : Note: This setup utilizes the @elastic/mcp-server-elasticsearch npm package published by Elastic. If you want to develop locally, you can find more details on spinning up the Elasticsearch MCP server here . Populate your Elasticsearch index You can use our example data to populate the \"orders\" index for this demo This will allow you to try queries like \"Find all orders over $500 from last month\" Start using it Open a new conversation in the Claude Desktop App The MCP server will connect automatically Start asking questions about your Elasticsearch data! Check out this demo to see how easy it is to query your Elasticsearch data using natural language: How does it work? When asked 'Find all orders over $500 from last month,' the LLM recognizes the intent of searching the Elasticsearch index with specified constraints. To perform an effective search, the agent figures to: Figure out the index name: orders Understand the mappings of orders index Build the Query DSL compatible with index mappings and finally execute the search request This interaction can be represented as: Conclusion Model Context Protocol enhances how you interact with Elasticsearch data, enabling natural language conversations instead of complex queries. 
By bridging AI capabilities with your data, MCP creates a more intuitive and efficient workflow that maintains context throughout your interactions. The Elasticsearch MCP server is available as a public npm package ( @elastic/mcp-server-elasticsearch ), making integration straightforward for developers. With minimal setup, your team can start exploring data, triggering workflows, and gaining insights through simple conversations. Ready to experience it for yourself? Try out the Elasticsearch MCP server today and start chatting with your data. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Model Context Protocol MCP architecture MCP server + Elasticsearch Hands on: MCP server to chat with your Elasticsearch data Chat with your index Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Connect Agents to Elasticsearch with Model Context Protocol - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/model-context-protocol-elasticsearch", + "meta_description": "Learn about the Model Context Protocol (MCP), its benefits with Elasticsearch, and how to use an Elasticsearch MCP server to chat with your data." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. Integrations Python How To JR By: Jeffrey Rengifo On April 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . 
To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, you will learn how to leverage LlamaIndex Workflows with Elasticsearch to quickly build a self-filtering search application using an LLM. LlamaIndex Workflows propose a different approach to the issue of splitting tasks into different agents by introducing a steps-and-events architecture . This simplifies the design compared to similar methodologies based on DAG (Directed Acyclic Graph) like LangGraph. If you want to read more about agents in general, I recommend you read this article. Image Source: https://www.llamaindex.ai/blog/introducing-workflows-beta-a-new-way-to-create-complex-ai-applications-with-llamaindex One of the main LlamaIndex features is the capacity to easily create loops during execution. Loops can help us with autocorrect tasks since we can repeat a step until we get the expected result or reach a given number of retries. To test this feature, we’ll build a flow to generate Elasticsearch queries based on the user’s question using an LLM with an autocorrect mechanism in case the generated query is not valid. If after a given amount of attempts the LLM cannot generate a valid query, we’ll change the model and keep trying until timeout. To optimize resources, we can use the first query with a faster and cheaper model, and if the generation still fails, we can use a more expensive one. Understanding steps and events A step is an action that needs to be run via a code function. It receives an event together with a context, which can be shared by all steps. There are two types of base events: StartEvent , which is a flow-initiating event, and StopEvent, to stop the event’s execution. A Workflow is a class that contains all the steps and interactions and puts them all together. We’ll create a Workflow to receive the user’s request, expose mappings and possible fields to filter, generate the query, and then make a loop to fix an invalid query. A query could be invalid for Elasticsearch because it does not provide valid JSON or because it has syntax errors. To show you how this works, we’ll use a practical case of searching for hotel rooms with a workflow to extract values to create queries based on the user’s search. The complete example is available in this Notebook . Steps Install dependencies and import packages Prepare data Llama-index workflows Execute workflow tasks 1. Install dependencies and import packages We’ll use mistral-saba-24b and llama3-70b Groq models, so besides elasticsearch and llama-index , we’ll need the llama-index-llms-groq package to handle the interaction with the LLMs. Groq is an inference service that allows us to use different open available models from providers like Meta, Mistral, and OpenAI. In this example, we’ll use its free layer . You can get the API KEY that we’ll use later here . Let’s proceed to install the required dependencies: Elasticsearch, the LlamaIndex core library, and the LlamaIndex Groq LLM’s package. We start by importing some dependencies to handle environment variables (os), and managing JSON. After that, we import the Elasticsearch client with the bulk helper to index using the bulk API. We finish by importing the Groq class from LlamaIndex to interact with the model, and the components to create our workflow. 2. Prepare data Setup keys We set the environment variables needed for Groq and Elasticsearch. The getpass library allows us to enter them via a prompt without echoing them. 
Elasticsearch client The Elasticsearch client handles the connection with Elasticsearch and allows us to interact with Elasticsearch using the Python library. Ingesting data to Elasticsearch We are going to create an index with hotel rooms as an example: Mappings We’ll use text-type fields for the properties where we want to run full-text queries; “keyword” for those where we want to apply filters or sorting, and “byte/integer” for numbers. Ingesting documents to Elasticsearch Let’s ingest some hotel rooms and amenities so users can ask questions that we can turn into Elasticsearch queries against the documents. We parse the JSON documents into a bulk Elasticsearch request. 3. LlamaIndex Workflows We need to create a class with the functions required to send Elasticsearch mapping to the LLM, run the query, and handle errors. Workflow prompts The EXTRACTION_PROMPT will provide the user’s question and index the mappings to the LLM so it can return an Elasticsearch query. Then, the REFLECTION_PROMPT will help the LLM make corrections in case of errors by providing the output from the EXTRACTION_PROMPT , plus the error caused by the query. Workflow events We created classes to handle extraction and query validation events: Workflow Now, let’s put everything together. We first need to set the maximum number of attempts to change the model to 3. Then, we will do an extraction using the model configured in the workflow. We validate if the event is StartEvent ; if so, we capture the model and question (passage). Afterward, we run the validation step, that is, trying to run the extracted query in Elasticsearch. If there are no errors, we generate a StopEvent and stop the flow. Otherwise, we issue a ValidationErrorEvent and repeat step 1, providing the error to try to correct it and return to the validation step. If there is no valid query after 3 attempts, we change the model and repeat the process until we reach the timeout parameter of 60s running time. 4. Execute workflow tasks We will make the following search: Rooms with smart TV, wifi, jacuzzi and price per night less than 300 . We’ll start using the mistral-saba-24b model and switch to llama3-70b-8192 , if needed, following our flow. Results (Formatted for readability) === EXTRACT STEP === MODEL: mistral-saba-24b OUTPUT: Step extract produced event ExtractionDone Running step validate === VALIDATE STEP === Max retries for model mistral-saba-24b reached, changing model Elasticsearch results: Step validate produced event ValidationErrorEvent Running step extract === EXTRACT STEP === MODEL: llama3-70b-8192 OUTPUT: Step extract produced event ExtractionDone Running step validate === VALIDATE STEP === Elasticsearch results: Step validate produced event StopEvent In the example above, the query failed because the mistral-saba-24b model returned it in markdown format, adding ```json at the beginning and ``` at the end. In contrast, the llama3-70b-8192 model directly returned the query using the JSON format. Based on our needs, we can capture, validate, and test different errors or build fallback mechanisms after a number of attempts. Conclusion The LlamaIndex workflows offer an interesting alternative to develop agentic flows using events and steps. With only a few lines of code, we managed to create a system that is able to autocorrect with interchangeable models. How could we improve this flow? 
Along with the mappings, we can send to the LLM possible exact values for the filters, reducing the number of no result queries because of misspelled filters. To do so, we can run a terms aggregation on the features and show the results to the LLM. Adding code corrections to common issues—like the Markdown issue we had—to improve the success rate. Adding a way to handle valid queries that yield no results. For example, remove one of the filters and try again to make suggestions to the user. A LLM could be helpful in choosing which filters to remove based on the context. Adding more context to the prompt, like user preferences or previous searches, so that we can provide customized suggestions together with the Elasticsearch results. Would you like to try one of these? Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Understanding steps and events Steps 1. Install dependencies and import packages 2. Prepare data Setup keys Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using LlamaIndex Workflows with Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/llamaindex-workflows-with-elasticsearch", + "meta_description": "Learn how to create an Elasticsearch-based step for your LlamaIndex workflow." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Engineering a new Kibana dashboard layout to support collapsible sections & more Building collapsible dashboard sections in Kibana required overhauling an embeddable system and creating a custom layout engine. These updates improve state management, hierarchy, and performance while setting the stage for new advanced dashboard features. Developer Experience TS HM NR By: Teresa Alvarez Soler , Hannah Mudge and Nathaniel Reese On January 22, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We are developing collapsible sections to hide and show panels in your Kibana dashboards to help organize content and improve performance. It’s a classic software development tale: sometimes to go forward you have to go…down? Read about how building an in-demand feature that seemed straightforward can sometimes lead you to bigger simplifications than you ever intended! 😅 Kibana dashboard collapsible sections A little bit of background: Dashboards in Kibana can contain many visualizations that Site Reliability Engineers (SREs) use to keep systems running or Security Analysts use in their investigations. These dashboards can be lengthy and slow to load. Users want to better organize dashboard content to avoid performance pitfalls and make them easier to scan. Today, the best way to accomplish this is to split dashboard content into multiple dashboards and then link them using the dashboard links panel to facilitate navigation. This unfortunately doesn’t let you see things side-by-side and makes updates and dashboard maintenance require a lot of effort from the dashboard author. To solve this need, we are developing collapsible sections to hide and show panels in your Kibana dashboards –these sections help organize content and don’t load content that is collapsed to improve performance. These new sections will allow you to group dashboard panels and data visualizations that are thematically related making it easier to find the information you are looking for. Most importantly, you can easily hide and expand these sections allowing you to load only the data that you need. This will help you create side-by-side comparisons for your charts and streamline dashboard performance. Planning the engineering approach At the onset when looking at what our customers wanted the feature seemed like a business-as-usual-sized engineering effort: Dashboards contain panels (more about those in a moment) and they are to be organized into sections and the product requirements ask that we only render them when a section is open. There’s also a drag and drop system to lay out a dashboard and it needs to account for these sections and handle a variety of moving-things-between-sections sort of use cases. Seems well in hand as an enhancement to existing code, right? Well unfortunately after a short proof of concept, we found the answer is no. It’s not that simple. Kibana uses an “embeddable” framework and this framework lacks the qualities needed to not render certain embedded objects on a dashboard. Let's take a look at why… What is an “embeddable”? Even though \"embeddable\" does not appear in the navigation menu alongside \"Discover\" and \"Dashboard\", you interact with embeddables throughout Kibana. 
The histogram in Discover, each panel in a Dashboard, a panel’s context menu, a Lens chart in Observability, or a Map in Security - all made possible with embeddables. Embeddables are React components that provide an API to deeply integrate with Kibana. This API allows them to be persisted and restored by any page, gives them access to the current search context, allows them to define editing UI, and is extensible so engineers can define how components interact with one another. They live in a registry, which separates their behaviours from where the code is written. Because of this, many engineers can work on different embeddables at the same time without getting in each other’s way. The need for a new embeddable system The legacy embeddable system we were working on at the time dates back to 2018. embeddable functionality is exposed through a custom user experience component abstraction. At the time, Kibana was transitioning from Angular 1 to React, so the embeddable system was designed to be framework agnostic which could smooth a theoretical transition away from React. While the architecture was required at the time, Kibana has changed a lot since then, and a move away from React is unlikely. Now, the inflexible and agnostic Embeddable architecture is a growing point of friction. Some pain points are: Complex state management: All state in an embeddable goes through one of two observables (input, output) in order to be inherited, set, or read. This requires consumers to set up complex two-way sync pipes. Limited inheritance: embeddables can have exactly one parent, limiting inheritance to a single level of hierarchy. Additionally, embeddable state flows from the parent to child, with child state overriding parent state if defined. Manual rendering: embeddables need a cumbersome manual render process and a compatibility layer between the rest of Kibana, which renders via React. Collapsible sections are not possible with a single level of hierarchy. Collapsible sections require multiple levels of hierarchy to allow panels to belong to the dashboard and a collapsible section . Otherwise, you wouldn’t be able to place panels into a collapsible section. New embeddable system So, to deliver this feature, we actually had to go “down” to the embeddable system itself and modernize how we manage embeddables: We had to design a new embeddable system. Fun! But also…..scope! The new embeddable functionality is exposed through plain old JavaScript objects and can compose their functionality by implementing interfaces. For example, an embeddable can communicate data loading by implementing the PublishesDataLoading interface. This offers the following benefits: Clean state management: Each piece of state is exposed as a read-only observable. Setter methods can be exposed for mutable state. Flexible inheritance: embeddables can have a chain of parents, allowing for as many levels of hierarchy as required. Each layer retains its own state so that the decision of which state to use can be determined at the time of consumption. With a system that tolerates the inheritance we need, collapsible sections can now be built. However, like any good refactor there’s a bit of a catch: embeddables are everywhere in Kibana and to implement this change without causing regressions we needed to migrate to the new embeddable system across Elastic’s full experience–from the Alerts page in Elastic Security to the Service Inventory in Elastic Observability and nearly everything in between. 
This has taken us some time but allows for some exciting new possibilities. New layout engine The driving force behind any Dashboard is the layout engine, which is the thing that allows panels to be dragged around and resized — without it, Dashboards would be entirely static (and boring)! Currently, Kibana uses the external react-grid-layout package to drive our Dashboards, which is an open-source layout engine managed by a small group of volunteers. This layout engine has worked great for our Dashboards up to this point; however, it is unfortunately missing critical features that would make collapsible sections possible out-of-the-box: either “panels within panels” or dragging panels across two separate instances of a layout. Due to the small team behind react-grid-layout, updates to the package are infrequent — this means that, even if we started contributing directly to react-grid-layout in order to add the features we need, incorporating these changes into Kibana Dashboards would be slow and unreliable. While we briefly considered making a Kibana-specific branch of react-grid-layout in order to get updates published at a pace that matched our development, the maintenance costs and inflexibility of this ultimately led us to discard this idea. After researching alternative layout engine packages, we decided that the best path forward would be to develop our own, internal layout engine — one that was built specifically with the Kibana Dashboard use case in mind! Work on this new layout engine, which we are calling kbn-grid-layout , has already started. To our knowledge, this is the first layout engine available that makes use of the native CSS grid in order to position its panels — all other layout engines that we found in our research relied on pixel-level transforms or absolute positioning. This makes it a lot easier to understand how panels are placed on a dashboard. kbn-grid-layout uses passive event handlers for all dragging and resizing events, with an emphasis on reducing the number of re-renders to a minimum during these actions to improve performance. Because we are in control of these event handlers, this allows us to focus on the user experience much more than we previously could, and we’ve added features such as auto-scrolling when dragging near the top or bottom of the screen, and locking the height of the grid during resize events to prevent unexpected behavior that could result from the browser responding to height changes before the resize event was complete. Drag event Resize event We are currently working on refining the implementation, which includes improving the management of collapsible sections, adding keyboard support for dragging and resizing (which is not currently supported by Kibana dashboards), and much more. Not only will this new layout engine unlock the ability to add collapsible sections, it is being built with accessibility and efficiency at the forefront — which means the entire Dashboard experience should be improved once we make the final layout engine swap from react-grid-layout to kbn-grid-layout ! react-grid-layout kbn-grid-layout Check it out before the release We’re nearly out of the embeddable woods and ready to enjoy the fruits of our labors with all of our customers from weekly-releasing Elastic Serverless to our selfhosted users. Our customers will be able to design a single dashboard with many sections that can be collapsed by default allowing an investigation to only load panel content that’s needed while keeping lengthy dashboards tidy. 
If you want to provide us feedback or sign up for early testing please let us know ! We will announce when this feature is ready to be used in the next few months. Stay tuned! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote Jump to Kibana dashboard collapsible sections Planning the engineering approach What is an “embeddable”? The need for a new embeddable system New embeddable system Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Engineering a new Kibana dashboard layout to support collapsible sections & more - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/kibana-dashboard-build-layout", + "meta_description": "Explore collapsible sections in Kibana dashboards and how we engineered them to organize content and boost performance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to build autocomplete feature on search application automatically using LLM generated terms Learn how to enhance your search application with an automated autocomplete feature in Elastic Cloud using LLM-generated terms for smarter, more dynamic suggestions. Generative AI Search Relevance How To MS By: Michael Supangkat On March 5, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
Autocomplete is a crucial feature in search applications, enhancing user experience by providing real-time suggestions as users type. Traditionally, autocomplete in Elasticsearch is implemented using the completion suggester , which relies on predefined terms. This approach requires manual curation of suggestion terms and often lacks contextual relevance. By leveraging LLM-generated terms via OpenAI’s completion endpoint , we can build a more intelligent, scalable, and automated autocomplete feature. Supercharge your search autocomplete using LLM In this article, we’ll explore: Traditional method of implementing autocomplete in Elasticsearch. How integrating OpenAI’s LLM improves autocomplete suggestions. Scaling the solution using Ingest Pipeline and Inference Endpoint in Elastic Cloud . Traditional autocomplete in Elasticsearch The conventional approach to building autocomplete in Elasticsearch involves defining a completion field in the index mapping. This allows Elasticsearch to provide suggestions based on predefined terms. This would be straightforward to implement, especially if you have already built a comprehensive suggestion list for a fairly static dataset. Implementation Steps Create an index with a completion field. Manually curate suggestion terms and store them in the index. Query using a completion suggester to retrieve relevant suggestions. Example: Traditional autocomplete setup First, create a new index named products_test. In this index, we define a field called suggest of type completion, which is optimized for fast autocomplete suggestions. Insert a test document into the products_test index. The suggest field stores multiple completion suggestions. Finally, we use the completion suggester query to search for suggestions starting with \"MacB.\" The prefix \"MacB\" will match \"MacBook Air M2.\" The suggest section contains matched suggestions. Options contain an array of matching suggestions, where \"text\": \"MacBook Air M2\" is the top suggestion. While effective, this method requires manual curation, constant updates to suggestion terms and does not adapt dynamically to new products or descriptions. Enhancing autocomplete with OpenAI LLM In some use cases, datasets change frequently, which requires you to continuously update a list of valid suggestions. If new products, names, or terms emerge, you have to manually add them to the suggestion list. This is where LLM steps in, as it can dynamically generate relevant completions based on real-world knowledge and live data. By leveraging OpenAI’s completion endpoint , we can dynamically generate autocomplete suggestions based on product names and descriptions. This allows for: Automatic generation of synonyms and related terms . Context-aware suggestions derived from product descriptions. No need for manual curation , making the system more scalable. Steps to implement LLM-powered autocomplete Create an inference endpoint using OpenAI’s completion API. Set up an Elasticsearch ingest pipeline that queries OpenAI for suggestions using a pre-defined prompt using a script processor Store the generated terms in an Elasticsearch index with a completion field. Use a search request to fetch dynamic autocomplete results. All the steps above can be easily completed by copying and pasting the API requests step by step in the Kibana Dev tool. In this example, we will be using the gpt-4o-mini model. You will need to get your OpenAI API key for this step. 
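To make the traditional setup described above concrete, here is a minimal sketch using the Python Elasticsearch client (the article itself issues the same calls from Kibana Dev Tools). The products_test index, the suggest field of type completion, and the "MacB" prefix mirror the walkthrough; the sample suggestion values are illustrative assumptions.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust for your cluster or Cloud deployment

# 1. Create an index with a `completion` field for fast prefix suggestions.
es.indices.create(
    index="products_test",
    mappings={
        "properties": {
            "name": {"type": "text"},
            "suggest": {"type": "completion"},
        }
    },
)

# 2. Index a test document with manually curated suggestion terms.
es.index(
    index="products_test",
    id="1",
    document={
        "name": "MacBook Air M2",
        "suggest": ["MacBook Air M2", "Apple laptop", "MacBook"],  # illustrative values
    },
    refresh=True,
)

# 3. Query the completion suggester: the prefix "MacB" should match "MacBook Air M2".
resp = es.search(
    index="products_test",
    suggest={
        "product-suggest": {
            "prefix": "MacB",
            "completion": {"field": "suggest"},
        }
    },
)
for option in resp["suggest"]["product-suggest"][0]["options"]:
    print(option["text"])
```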
Login to your OpenAI account and navigate to https://platform.openai.com/api-keys . Next, create a new secret key or use an existing key. Creating an inference endpoint First, we create an inference endpoint. This allows us to interact seamlessly with a machine learning model (in this case OpenAI) via API, while still working within Elastic’s interface. Setting up the Elasticsearch ingest pipeline By setting up an ingest pipeline, we can process data upon indexing. In this case, the pipeline is named autocomplete-LLM-pipeline and it contains: A script processor, which defines the prompt we are sending to OpenAI to get our suggestion list. Product name and product description are included as dynamic values in the prompt. An inference processor , which refers to our OpenAI inference endpoint. This processor takes a prompt from the script processor as input, sends it to the LLM model, and stores the result in an output field called results . A split processor, which splits the text output from LLM within the results field into a comma-separated array to fit the format of a completion type field of suggest . 2 remove processors, which remove the prompt and results field after the suggest field has been populated. Indexing sample documents For this example, we are using the documents API to manually index documents from the dev tool to a temporary index called ‘products’. This is not the autocomplete index we will be using. Creating index with completion type mapping Now, we are creating the actual autocomplete index which contains the completion type field called suggest . Reindexing documents to a designated index via the ingest pipeline In this step, we are reindexing data from our products index created previously to the actual autocomplete index products_with_suggestion , through our ingest pipeline autocomplete-LLM-Pipeline . The pipeline will process the sample documents from the original index and populate the autocomplete suggest field in the destination index. Sample autocomplete suggestions As shown below, the new index (products_with_suggestion) now includes a new field called suggest , which contains an array of terms or synonyms generated by OpenAI LLM. You can run the following request to check: Results: Take note that the generated terms from LLM are not always the same even if the same prompt was used. You can check the resulting terms and see if they are suitable for your search use case. Else, you have the option to modify the prompt in your script processor to get more predictable and consistent suggestion terms. Testing the autocomplete search Now, we can test the autocomplete functionality using the completion suggester query. The example below also includes a fuzzy parameter to enhance the user experience by handling minor misspellings in the search query. You can execute the query below in the dev tool and check the suggestion results. To visualize the autocomplete results, I have implemented a simple search bar that executes a query against the autocomplete index in Elastic Cloud using our client. The search returns result based on terms in the suggestion list generated by LLM as you type. Scaling with OpenAI inference integration By using OpenAI’s completion API as an inference endpoint within Elastic Cloud , we can scale this solution efficiently: Inference endpoint allows automated and scalable LLM suggestions without having to manually create and maintain your own list. Ingest Pipeline ensures real-time enrichment of data during indexing. 
Script Processor within the ingest pipeline allows easy editing of the prompt in case there is a need to customise the nature of the suggestion list in a more specific way. Pipeline execution can also be configured directly upon ingestion as an index template for further automation. This enables the suggestion list to be built on the fly as new products are added to the index. In terms of cost efficiency, the model is only invoked during the ingestion process, meaning its usage scales with the number of documents processed rather than the search volume. This can result in significant cost savings compared to running the model at search time if you are expecting growth in users or search activity. Conclusion Traditionally, autocomplete relies on manually defined terms, which can be limiting and labour intensive. By leveraging OpenAI’s LLM-generated suggestions, we have the option to automate and enhance autocomplete functionality, improving search relevance and user experience. Furthermore, using Elastic’s ingest pipeline and inference endpoint integration ensures an automated, scalable autocomplete system. Overall, if your search use case requires a very specific set of suggestions from a well maintained and curated list, ingesting the list in bulk via our API conventionally as described in the first part of this article would still be a great and performant option. If managing and updating a suggestion list is a pain point, an LLM-based completion system removes that burden by automatically generating contextually relevant suggestions—without any manual input. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Jump to Supercharge your search autocomplete using LLM Traditional autocomplete in Elasticsearch Implementation Steps Example: Traditional autocomplete setup Enhancing autocomplete with OpenAI LLM Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
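For readers who want to see the ingest wiring from the walkthrough above in one place, here is a hedged sketch in Python. The pipeline name autocomplete-llm-pipeline and the products / products_with_suggestion indices come from the article; the inference endpoint name openai_completion, the prompt wording, the source field names, and the exact processor options (notably input_output on the inference processor) are assumptions meant to illustrate the shape of the solution, not a definitive implementation.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# The completion inference endpoint is created once, e.g. from Kibana Dev Tools:
# PUT _inference/completion/openai_completion
# { "service": "openai",
#   "service_settings": { "api_key": "<OPENAI_API_KEY>", "model_id": "gpt-4o-mini" } }

es.ingest.put_pipeline(
    id="autocomplete-llm-pipeline",
    description="Generate autocomplete suggestions with an LLM at ingest time",
    processors=[
        {   # Build the prompt from the product name and description (assumed field names).
            "script": {
                "source": """
                  ctx.prompt = 'Return a short comma-separated list of search suggestions '
                    + '(synonyms and related terms) for this product: '
                    + ctx.name + ' - ' + ctx.description;
                """
            }
        },
        {   # Send the prompt to the inference endpoint; the answer lands in `results`.
            "inference": {
                "model_id": "openai_completion",
                "input_output": {"input_field": "prompt", "output_field": "results"},
            }
        },
        {   # Turn the comma-separated answer into an array for the completion field.
            "split": {"field": "results", "separator": ",", "target_field": "suggest"}
        },
        {"remove": {"field": "prompt"}},
        {"remove": {"field": "results"}},
    ],
)

# Destination index with the completion field, then reindex through the pipeline.
es.indices.create(
    index="products_with_suggestion",
    mappings={"properties": {"suggest": {"type": "completion"}}},
)
es.reindex(
    source={"index": "products"},
    dest={"index": "products_with_suggestion", "pipeline": "autocomplete-llm-pipeline"},
)
```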
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to build autocomplete feature on search application automatically using LLM generated terms - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-autocomplete-using-llm", + "meta_description": "Learn how to enhance your search app with an automated autocomplete feature in Elastic Cloud using LLM-generated terms." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to optimize RAG retrieval in Elastisearch with DeepEval Learn how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval. Generative AI How To KV By: Kritin Vongthongsri On March 17, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. LLMs are prone to hallucinations, lack domain-specific expertise, and are limited by their context windows. Retrieval-Augmented Generation (RAG) addresses these issues by enabling LLMs to access relevant external context, thereby grounding their responses. Several RAG methods—such as GraphRAG and AdaptiveRAG—have emerged to improve retrieval accuracy. However, retrieval performance can still vary depending on the domain and specific use case of a RAG application. To optimize retrieval for a given use case, you'll need to identify the hyperparameters that yield the best quality. This includes the choice of embedding model, the number of top results (top-K), the similarity function, reranking strategies, and more. Optimizing retrieval means evaluating and iterating on these hyperparameters until you find the best performing combination. In this blog, we'll explore how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval. We’ll begin by installing Elasticsearch and DeepEval: Measuring retrieval performance To optimize your Elasticsearch retriever and benchmark each hyperparameter combination, you’ll need a method for assessing retrieval quality. Here are 3 key metrics that allow you to measure retrieval performance: contextual precision, contextual recall, and contextual relevancy. Contextual precision: The contextual precision metric checks if the most relevant information chunks are ranked higher than the less relevant ones, for a given input. Simply put, it ensures that the most useful information is first in the set of retrieved contexts. Contextual recall: The contextual recall metric measures how well the retrieved information chunks aligns with the expected output, or ideal LLM response. A higher contextual recall score indicates that the retrieval system is more effective at capturing every piece of relevant information available in your knowledge base. Contextual relevancy: Finally, the contextual relevancy metric assesses how relevant the information in the retrieval context is to the user input of your RAG application. 
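As a rough sketch, the three retrieval metrics discussed above are exposed as classes in DeepEval. The thresholds below are placeholder choices, and each metric is scored by an LLM judge, so an OpenAI API key (or another configured judge model) is assumed to be available in the environment.

```python
from deepeval.metrics import (
    ContextualPrecisionMetric,
    ContextualRecallMetric,
    ContextualRelevancyMetric,
)

# Each metric is scored by an LLM judge (e.g. OPENAI_API_KEY must be set).
contextual_precision = ContextualPrecisionMetric(threshold=0.7, include_reason=True)
contextual_recall = ContextualRecallMetric(threshold=0.7, include_reason=True)
contextual_relevancy = ContextualRelevancyMetric(threshold=0.7, include_reason=True)

metrics = [contextual_precision, contextual_recall, contextual_relevancy]
```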
… A combination of these 3 metrics is essential to ensure that the retriever fetches the right amount of information in the proper sequence, and that your LLM receives clean, well-organized data for generating accurate outputs. Ideally, you’ll want to find the combination of hyperparameters that yields the highest scores across all three metrics. However, in some cases, increasing the recall score may inevitably decrease relevance. Striking the right balance between these factors is key to achieving optimal performance. If you need custom metrics for a specific use case, G-Eval and DAG might be worth exploring. These tools allow you to define precise metrics with tailored evaluation criteria. Here are some resources that might help you better understand how these metrics are calculated: How is contextual precision calculated How is contextual recall calculated How is contextual relevancy calculated Evaluating Retrieval in RAG applications Elasticsearch hyperparameters to optimize Elasticsearch provides extensive flexibility in retrieving information for RAG pipelines, offering a wide range of hyperparameters that can be fine-tuned to optimize retrieval performance. In this section, we’ll discuss some of these hyperparameters. Before retrieval: To structure your data optimally before inserting it into your Elasticsearch vector database, you can fine-tune parameters such as chunk size and chunk overlap. Additionally, selecting the right embedding model ensures efficient and meaningful vector representations. During retrieval: Elasticsearch gives you full control over the retrieval process. You can configure the similarity function, first determining the number of candidates for the approximate search before applying KNN on the top-K candidates. Then, you select the most relevant top-K results. Moreover, you can define your retrieval strategy—whether it's semantic (leveraging vector embeddings), text-based (using query rules), or a hybrid approach that combines both methods. After retrieval: Once results are retrieved, Elasticsearch allows you to refine them further through reranking. You can select a reranking model, define a reranking window, set a minimum score threshold, and more—ensuring that only the most relevant results are prioritized. … Different hyperparameters influence certain metric scores more than others. For example, if you're seeing issues with contextual relevance, it’s likely due to a specific set of hyperparameters, including top-K. By mapping specific hyperparameters to each metric, you can iterate more efficiently and fine-tune your retrieval pipeline with greater precision. Below is a table outlining which retrieval metrics are impacted by different hyperparameters: Metric Hyperparameter Contextual Precision Reranking model, reranking window, reranking threshold Contextual Recall Retrieval strategy (text vs embedding), embedding model, candidate count, similarity function, top-K Contextual Relevancy top-K, chunk size, chunk overlap In the next section, we'll walk through how to evaluate and optimize our Elasticsearch retriever with code examples. We'll use the `\"all-MiniLM-L6-v2\"` to embed our text documents, set `top-K` to 3, and configure the number of candidates to 10. Setting up RAG with Elastic Retriever To get started, connect to your local or cloud-based Elastic cluster: Next, create an Elasticsearch index with the appropriate type mappings to store text and embeddings as dense vectors. 
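A minimal sketch of that setup with the Python client follows, assuming a local cluster, an index named rag-docs, and a 384-dimension embedding field to match all-MiniLM-L6-v2; the index and field names are illustrative.

```python
from elasticsearch import Elasticsearch

# Local cluster shown here; for Elastic Cloud pass cloud_id= and api_key= instead.
es = Elasticsearch("http://localhost:9200")

# all-MiniLM-L6-v2 produces 384-dimensional embeddings.
es.indices.create(
    index="rag-docs",
    mappings={
        "properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)
```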
To insert your document chunks into the Elastic index, first encode into vectors using an embedding model. For this example, we’re using \" all-MiniLM-L6-v2 \". Finally, define a retriever function to search from your elastic client for your RAG pipeline. Evaluating your RAG retriever With your Elasticsearch retriever set up, you can begin evaluating it as part of your RAG pipeline. The evaluation consists of 2 steps: Preparing an input query along with the expected LLM response, and using the input to generate a response from your RAG pipeline to create an LLMTestCase containing the input, actual output, expected output, and retrieval context. Evaluating the test case using a selection of retrieval metrics. Preparing a test case Here, we prepare an input asking \"How does Elasticsearch work?\" with the corresponding expected output: \"Elasticsearch uses dense vector and sparse vector similarity for semantic search.\" Let's examine the actual_output generated by our RAG pipeline: Finally, consolidate all test case parameters into a single LLM test case. Running evaluations To run evaluations on your elastic retriever, pass the test case and metrics we’ve defined earlier into the evaluate function. Optimizing the Retriever Once you’ve evaluated your test case, we can begin to analyze the results. Below are example evaluation results from the test case we created, as well as additional hypothetical queries a user might ask your RAG system. Query Contextual precision Contextual recall Contextual relevancy \"How does Elasticsearch work?\" 0.63 0.93 0.52 \"Explain Elasticsearch's indexing method.\" 0.57 0.87 0.49 \"What makes Elasticsearch efficient for search?\" 0.65 0.90 0.55 Contextual precision is suboptimal → Some retrieved contexts might be too generic or off-topic. Contextual recall is strong → Elasticsearch is retrieving enough relevant documents. Contextual relevancy is inconsistent → The quality of retrieved documents varies across queries. Improving retrieval quality As previously mentioned, each metric is influenced by specific retrieval hyperparameters. Given that contextual precision is suboptimal and contextual relevancy is inconsistent, it's clear that reranker hyperparameters, along with top-K, chunk size, and chunk overlap, need improvement. Here’s an example of how you might iterate on top-K using a simple for loop. This for loop helps identify the top-K value that produces the best metric scores. You should apply this approach to all hyperparameters that impact relevancy and precision scores in your retrieval system. This will allow you to determine the optimal combination! Tracking improvements DeepEval is open-source and great if you’re looking to evaluate your retrievers locally. However, if you're looking for a way to conduct deeper analyses and store your evaluation results, Confident AI brings your evaluations to the cloud and enables extensive experimentation with powerful analysis tools. Confident allows you to: Curate and manage your evaluation dataset effortlessly. Run evaluations locally using DeepEval metrics while pulling datasets from Confident AI. View and share testing reports to compare prompts, models, and refine your LLM application. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
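Pulling the pieces described in this article together, here is a hedged end-to-end sketch: index a couple of chunks, define the retriever function (top-K of 3, 10 candidates), build an LLMTestCase, run evaluate, and sweep top-K. The rag-docs index, the sample chunks, and the hard-coded actual_output are placeholders; in a real pipeline the actual output comes from your generation step.

```python
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer
from deepeval import evaluate
from deepeval.test_case import LLMTestCase
from deepeval.metrics import (
    ContextualPrecisionMetric,
    ContextualRecallMetric,
    ContextualRelevancyMetric,
)

es = Elasticsearch("http://localhost:9200")
model = SentenceTransformer("all-MiniLM-L6-v2")
metrics = [ContextualPrecisionMetric(), ContextualRecallMetric(), ContextualRelevancyMetric()]

# Index a couple of illustrative chunks with their embeddings.
chunks = [
    "Elasticsearch uses dense vector and sparse vector similarity for semantic search.",
    "Elasticsearch stores documents in indices that are backed by Lucene segments.",
]
for i, chunk in enumerate(chunks):
    es.index(
        index="rag-docs",
        id=str(i),
        document={"text": chunk, "embedding": model.encode(chunk).tolist()},
    )
es.indices.refresh(index="rag-docs")

def retrieve(query: str, top_k: int = 3, num_candidates: int = 10) -> list[str]:
    """kNN retrieval over the chunk embeddings (top-K=3, 10 candidates by default)."""
    resp = es.search(
        index="rag-docs",
        knn={
            "field": "embedding",
            "query_vector": model.encode(query).tolist(),
            "k": top_k,
            "num_candidates": num_candidates,
        },
        source=["text"],
    )
    return [hit["_source"]["text"] for hit in resp["hits"]["hits"]]

query = "How does Elasticsearch work?"
retrieval_context = retrieve(query)
# Placeholder: in a real pipeline this string comes from your LLM generation step.
actual_output = "Elasticsearch relies on vector similarity to power semantic search."

test_case = LLMTestCase(
    input=query,
    actual_output=actual_output,
    expected_output="Elasticsearch uses dense vector and sparse vector similarity for semantic search.",
    retrieval_context=retrieval_context,
)
evaluate(test_cases=[test_case], metrics=metrics)

# Sweep top-K to see which value scores best on precision and relevancy.
for top_k in (1, 3, 5, 10):
    case = LLMTestCase(
        input=query,
        actual_output=actual_output,
        expected_output=test_case.expected_output,
        retrieval_context=retrieve(query, top_k=top_k),
    )
    evaluate(test_cases=[case], metrics=metrics)
```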
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Measuring retrieval performance Elasticsearch hyperparameters to optimize Setting up RAG with Elastic Retriever Evaluating your RAG retriever Preparing a test case Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to optimize RAG retrieval in Elastisearch with DeepEval - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/rag-retrieval-elasticsearch-deepeval", + "meta_description": "Learn how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. AutoOps VC By: Valentin Crettaz On January 2, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Released early November on Elastic Cloud Hosted, AutoOps significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. One of the hundreds of analyses AutoOps runs every minute to check your cluster's settings, metrics, and health alerts when long running search queries are plaguing your cluster. Long running search queries can significantly impact performance, leading to high resource consumption. Let's see how this works concretely. How does it work? The beauty of AutoOps for Elastic Cloud Hosted is that there's nothing to do. 
In all regions where AutoOps is supported , an AutoOps agent is automatically attached to any new or existing deployment, and within minutes, metrics will start shipping, analysis will kick in, and events will be raised as soon as something fishy is detected. There's no need to enable slow logs and set up Filebeat to tail and index them somewhere, it just works out of the box by carefully and regularly monitoring the Task Management API. In order to know if AutoOps is enabled for a given deployment, one can simply head over to his Elastic Cloud console page and click on “Manage” deployment. If the “Open AutoOps ” button appears at the top-right of the screen, then AutoOps is enabled. When opening the Deployment view in AutoOps, we're immediately presented with a quick history of all the recent events. In the screenshot below, we can see that a \"Long running search task\" event was opened recently. Clicking on the event opens up a fly out panel showing the DSL of the slow search query that has been detected along with a whole bunch of information related to the execution context of that query. Understanding long running search tasks The screenshot below shows all the information that AutoOps was able to gather and display in the event fly out panel. We’ll now review each part in more detail. 1. The involved node First, we get a link to the node where the long-running query was detected, i.e. instance-0000000223 . That link allows us to jump directly to the Nodes view where we can find a wealth of metrics and information about that specific node. 2. The involved indices We can also see which indices the query was being run on. In the present case, we can see that the query ran on logs-apache.error-default, logs-nginx.error-default and two more indices. Clicking on those indices will send us to the Shards view which will allow us to see the detailed shards breakdown of those indices on the identified node as well as all the shards of other indices also located on that node. That view will help us detect if there are any hotspots that might be responsible for causing the slow query. 3. Potential reasons for high query latency Digging deeper, we can then see that some basic query analysis took place and AutoOps surfaced a few potential reasons why the query might be slow. In this case, we can see that: the query ran on a 30 days time interval, which might represent a big volume of data there are nested aggregations, which are known to perform poorly the response might potentially contain up to 20'000 aggregation buckets, which might be taxing on node memory There are more detection rules for queries that use regular expressions or scripts. Moreover, new detection rules will be added regularly and also put into perspective with the index mappings. 4. The query context Finally, there's some more information to glean about the context of the search query, such as: for how long it has been running, whether it is cancellable or not, all the headers that were attached to the HTTP call. In this case, we can see the trace.id header (which makes it easy to find it in APM), but also X-Opaque-Id that contains an indication of the client that sent this query. Here, we can see that the query originated from a SIEM alerting rule in Kibana, but it could also be a visualization or a dashboard, or even a user running the query in Dev Tools. Also works for ES|QL But wait, there's more! AutoOps doesn't only detect long-running DSL queries, but also ES|QL ones. 
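AutoOps does this monitoring for you, but if you want to look at the same raw signal yourself, the Task Management API can list running search tasks along with their duration, description, and headers. Here is a rough sketch; the one-minute threshold simply mirrors AutoOps' default, and the printed fields depend on what the cluster reports for each task.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

ONE_MINUTE_NANOS = 60 * 1_000_000_000

# List currently running search tasks, with descriptions included (detailed=True).
tasks = es.tasks.list(actions="*search*", detailed=True)

for node_id, node in tasks["nodes"].items():
    for task_id, task in node["tasks"].items():
        if task["running_time_in_nanos"] > ONE_MINUTE_NANOS:
            print(f"{task_id} on {node_id}: running for "
                  f"{task['running_time_in_nanos'] / 1e9:.0f}s, "
                  f"cancellable={task['cancellable']}")
            print("  query:", task.get("description", "<no description>"))
            print("  headers:", task.get("headers", {}))
            # A rogue query can be cancelled while it is still running:
            # es.tasks.cancel(task_id=task_id)
```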
On the screenshot below, we can see that a slow ES|QL query has been detected by AutoOps. All the same context information is available for ES|QL queries, except that no query analysis is currently done. As a result, AutoOps doesn’t yet provide any insights into how to improve ES|QL queries, but that will be added soon. After detecting long-running search query Since this event is raised when a long-running search query has been detected, there are a few options forward. When inspecting the query, if it looks like a rogue query or a query run from Dev Tools by a careless user, then the task can simply be cancelled if it’s still running. On the other hand, if it looks like a legitimate query and it is not running anymore, the next step should be to investigate the “ reasons for increased latency ” where AutoOps listed a few potential issues that were detected by inspecting the query. This is only done for DSL at this time, ES|QL will be supported in the future. How long is long? By default, AutoOps will raise a \"Long running search task\" event if the search query has been running for more than one minute. This is a default configuration setting that can easily be modified by clicking on the three dots icon at the top-right of the event fly out panel and then choosing “Customize” in order to change the default duration threshold. After clicking on “Customize”, a dialog window pops up and offers the possibility to modify the duration threshold (in minutes) before raising \"Long running search task\" events. If AutoOps is monitoring several clusters, there’s also the opportunity to apply the custom setting only on specific clusters and not all. Wrapping up As we can see, AutoOps helps detect long-running search queries and dig out a wealth of information about them. Make sure to leverage all that information to improve your search queries and relieve your cluster as much as possible from unbearable loads. Also note that the \"Long running search task\" event is just one out of hundreds of other insightful events that AutoOps knows to detect. If your deployment is in one of the supported regions, feel free to head over to your Elastic Cloud account and launch AutoOps to see how it makes cluster management so much simpler. Also stay tuned for future articles on other very helpful events and recommendations. Report an issue Related content AutoOps December 18, 2024 Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. MD By: Musab Dogan AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Jump to How does it work? Understanding long running search tasks 1. The involved node 2. The involved indices 3. Potential reasons for high query latency Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Leveraging AutoOps to detect long-running search queries - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/slow-search-elasticsearch-query-autoops", + "meta_description": "Learn how to detect and investigate Elasticsearch long running search queries using AutoOps to improve your search performance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch hybrid search Learn about hybrid search, the types of hybrid search queries Elasticsearch supports, and how to craft them. Vector Database How To VC By: Valentin Crettaz On February 17, 2025 Part of Series Vector search introduction and implementation Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This article is the last one in a series of three that dives into the intricacies of vector search (aka semantic search) and how it is implemented in Elasticsearch. The first part was focused on providing a general introduction to the basics of embeddings (aka vectors) and how vector search works under the hood. Armed with all the vector search knowledge learned in the first article, the second article guided you through the meanders of how to set up vector search and execute k-NN searches in Elasticsearch. In this third and last part, we will leverage what we have learned in the first two parts and build upon that knowledge by delving into how to craft powerful hybrid search queries in Elasticsearch. The need for hybrid search Before diving into the realm of hybrid search, let’s do a quick refresh of what we learned in the first article of this series regarding how lexical and semantic search differ and how they can complement each other. To sum it up very briefly, lexical search is great when you have control over your structured data and your users are more or less clear on what they are searching for. Semantic search, however, provides great support when you need to make unstructured data searchable and your users don’t really know exactly what they are searching for. It would be fantastic if there was a way to combine both in order to squeeze as much substance out of each one as possible. Enter hybrid search! In a way, we can see hybrid search as some sort of “sum” of lexical search and semantic search. However, when done right, hybrid search can be much better than just the sum of those parts, yielding far better results than either lexical or semantic search would do on their own. Running a hybrid search query usually boils down to sending a mix of at least one lexical search query and one semantic search query and then merging the results of both. The lexical search results are scored by a similarity algorithm, such as BM25 or TF-IDF , whose score scale is usually unbounded as the max score depends on the number and frequency of terms stored in the inverted index. 
In contrast, semantic search results can be scored within a closed interval, depending on the similarity function that is being used (e.g., [0; 2] for cosine similarity). In order to merge the lexical and semantic search results of a hybrid search query, both result sets need to be fused in a way that maintains the relative relevance of the retrieved documents, which is a complex problem to solve. Luckily, there are several existing methods that can be utilized; two very common ones are Convex Combination (CC) and Reciprocal Rank Fusion (RRF). Basically, Convex Combination, also called Linear Combination, seeks to combine the normalized score of lexical search results and semantic search results with respective weights α and β (where 0 ≤ α, β), such that: score(d) = α · lexical_score(d) + β · semantic_score(d). CC can be seen as a weighted average of the lexical and semantic scores. Weights between 0 and 1 are used to deboost the related query, while weights greater than 1 are used to boost it. RRF, however, doesn’t require any score calibration or normalization and simply scores the documents according to their rank in the result set, using the following formula, where k is an arbitrary constant meant to adjust the importance of lowly ranked documents: RRF(d) = Σ 1 / (k + rank_q(d)), where the sum runs over each result list q in which document d appears. Both CC and RRF have their pros and cons as highlighted in Table 1, below: Table 1: Pros and cons of CC and RRF. Pros: a good calibration of weights makes CC more effective than RRF; RRF doesn’t require any calibration, is fully unsupervised, and there’s no need to know min/max scores. Cons: CC requires a good calibration of the weights, and the optimal weights are specific to each data set; for RRF, it is not trivial to tune the value of k, and the ranking quality can be affected by increasing result set size. It is worth noting that not everyone agrees on these pros and cons depending on the assumptions being made and the data sets on which they have been tested. A good summary would be that RRF yields slightly less accurate scores than CC but has the big advantage of being “plug & play” and can be used without having to fine-tune the weights with a labeled set of queries. Elastic decided to support both the CC and RRF approaches. We’ll see how this is carried out later in this article. If you are interested in learning more about the rationale behind that choice, you can read this great article from the Elastic blog and also check out this excellent talk on RRF presented at Haystack 2023 by Elastician Philipp Krenn. Timeline After enabling brute-force kNN search on dense vectors in 7.0 back in 2019, Elasticsearch started supporting approximate nearest neighbors (ANN) search in February 2022 with the 8.0 release and hybrid search support came right behind with the 8.4 release in August 2022. Figure 1, below, shows the Elasticsearch timeline for bringing hybrid search to market: Figure 1: Hybrid search timeline for Elasticsearch The anatomy of hybrid search in Elasticsearch As we’ve briefly hinted at in our previous article, vector search support in Elasticsearch has been made possible by leveraging dense vector models (hence the dense_vector field type), which produce vectors that usually contain essentially non-zero values and represent the meaning of unstructured data in a multi-dimensional space. However, dense models are not the only way of performing semantic search. Elasticsearch also provides an alternative way that uses sparse vector models. 
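To make the two fusion methods concrete, here is a small, self-contained sketch of min-max-normalized convex combination and reciprocal rank fusion over two result lists. It is purely illustrative and is not how Elasticsearch implements either method internally; the weights and document scores are made up.

```python
def convex_combination(lexical: dict[str, float], semantic: dict[str, float],
                       alpha: float = 0.7, beta: float = 0.3) -> dict[str, float]:
    """score(d) = alpha * normalized lexical score + beta * normalized semantic score."""
    def normalize(scores: dict[str, float]) -> dict[str, float]:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (score - lo) / span for doc, score in scores.items()}

    lex, sem = normalize(lexical), normalize(semantic)
    return {d: alpha * lex.get(d, 0.0) + beta * sem.get(d, 0.0) for d in set(lex) | set(sem)}


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> dict[str, float]:
    """RRF(d) = sum over result lists of 1 / (k + rank of d in that list)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores


lexical = {"doc1": 12.3, "doc2": 8.1, "doc3": 2.4}     # unbounded BM25-style scores
semantic = {"doc2": 0.93, "doc3": 0.88, "doc4": 0.71}  # bounded cosine similarities
print(convex_combination(lexical, semantic))
print(reciprocal_rank_fusion([["doc1", "doc2", "doc3"], ["doc2", "doc3", "doc4"]]))
```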
Elastic created a sparse NLP vector model called Elastic Learned Sparse EncodeR , or ELSER for short, which is an out-of-domain (i.e., not trained on a specific domain) sparse vector model that does not require any fine-tuning. It was pre-trained on a vocabulary of approximately 30,000 terms , and as it’s a sparse model most of the vector values (i.e., more than 99.9%) are zeros. The way it works is pretty simple. At indexing time, the sparse vectors containing term/weight pairs are generated using the inference ingest processor and stored in fields of type sparse_vector , which is the sparse counterpart to the dense_vector field type. At query time, a specific DSL query also called sparse_vector replaces the original query terms with terms available in the ELSER model vocabulary that are known to be the most similar to them given their weights. Sparse or dense? Before heading over to hybrid search queries, we would like to briefly highlight the differences between sparse and dense models . Figure 2, below, shows how the piece of text “the quick brown fox” is encoded by each model. In the sparse case, the four original terms are expanded into 30 weighted terms that are closely or distantly related to them. The higher the weight of the expanded term, the more related it is to the original term. Since the ELSER vocabulary contains more than 30,000 terms, it means that the vector representing “the quick brown fox” has as many dimensions and contains only ~0.1% of the non-zero values (i.e., ~30 / 30,000), hence why we call these models sparse. Figure 2: Comparing sparse and dense model encoding In the dense case, “the quick brown fox” is encoded into a much smaller embeddings vector that captures the semantic meaning of the text. Each of the 384 vector elements contains a non-zero value that represents the similarity between the piece of text and each of the dimensions. Note that the names we have given to dimensions (i.e., is_mouse , is_brown , etc.) are purely fictional, and their purpose is just to give a concrete description of the values. Another important difference is that sparse vectors are queried via the inverted index (yes, like lexical search), whereas as we have seen in previous articles, dense vectors are indexed in specific graph-based or cluster-based data structures that can be searched using approximate nearest neighbors (ANN) algorithms . We won’t go any further into the details of how ELSER came to be, but if you’re interested in understanding how that model was born, we recommend you check out this article from the Elastic Search Labs, which explains in detail the thought process that led Elastic to develop it. If you are thinking about evaluating ELSER, it might be worth checking Elastic’s relevance workbench , which demonstrates how ELSER compares to a normal BM25 lexical search. We are also not going to dive into the process of downloading and deploying the ELSER model in this article, but you can take a moment and turn to the official documentation that explains very well how to do it. Hybrid search support Whether you are going to use dense or sparse retrieval, Elastic provides hybrid search support for both model types. The first type is a mix of a lexical search query specified in the query search option and a vector search query (or an array thereof) specified in the knn search option. 
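Here is a sketch of that first flavor with the Python client: a match query on the lexical side and a top-level knn clause on the semantic side. The articles index, the field names, and the tiny query vector are illustrative assumptions; in practice the vector comes from the same embedding model used at index time.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
query_vector = [0.12, -0.45, 0.07]  # in practice, the embedding of the query text

resp = es.search(
    index="articles",
    # Lexical side: scored by BM25.
    query={"match": {"title": {"query": "quick brown fox", "boost": 0.7}}},
    # Semantic side: approximate kNN over the dense vectors.
    knn={
        "field": "title_embedding",
        "query_vector": query_vector,
        "k": 5,
        "num_candidates": 50,
        "boost": 0.3,
    },
    size=10,
)
```

When the two clauses are combined this way, their scores are summed, and the boost values effectively play the role of the CC weights discussed earlier.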
The second one introduces a new search option called retriever (introduced in 8.14 and GA in 8.16) which also contains an array of search queries that can be of lexical (e.g., match ) or semantic (e.g., sparse_vector ) nature. If all this feels somewhat abstract to you, don’t worry, as we will shortly dive into the details to show how hybrid searches work in practice and what benefits they provide. In this article, we are going to focus on the second option using retrievers. Hybrid search with dense models Hybrid search using retrievers basically boils down to running a lexical search query mixed with an approximate k-NN search in order to improve relevance. Such a hybrid search query is shown below: As we can see above, a hybrid search query simply leverages the rrf retriever that combines a lexical search query (e.g., a match query) made with a standard retriever and a vector search query specified in the knn retriever. What this query does is first retrieve the top five vector matches at the global level, then combine them with the lexical matches, and finally return the ten best matching hits. The rrf retriever uses RRF ranking in order to combine vector and lexical matches. The RRF ranking formula can be further parametrized by two variables called rank_constant and rank_window_size (defaults to size ) that can be specified in the rrf retriever section, as shown below: This query runs the same way as the previous one, except that `rank_window_size` documents (instead of only 10) are retrieved from the vector and lexical queries and then ranked by RRF. Finally, the top documents ranked from 1 to `size` (e.g., 10) are then returned in the result set. The last thing to note about this hybrid query type is that RRF ranking requires a commercial license (Platinum or Enterprise), but if you don’t have one, you can still leverage hybrid searches with CC scoring or by using a trial license that allows you to enjoy the full feature set for one month. Hybrid search with sparse models The second hybrid search type for querying sparse models works exactly the same way as for dense vectors.. Below, we can see what such a hybrid query looks like: In the above query, we can see that the retrievers array contains one lexical match query as well as one semantic sparse_vector query that works on the ELSER sparse model that we introduced earlier. Hybrid search with dense and sparse models So far, we have seen two different ways of running a hybrid search, depending on whether a dense or sparse vector space was being searched. At this point, you might wonder whether we can mix both dense and sparse data inside the same index, and you’ll be pleased to learn that it is indeed possible. One concrete application could be that you need to search both a dense vector space with images and a sparse vector space with textual descriptions of those images. Such a query would look like this where we combine a standard retriever with a knn one: In the above payload, we can see a sparse_vector query searching for image descriptions within the ELSER sparse vector space, and in the knn retriever a vector search query searching for image embeddings (e.g., “brown fox” represented as an embedding vector) in a dense vector space. In addition, we leveraged RRF by using the rrf retriever. 
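Here is a hedged sketch of that combined query using retrievers (GA in 8.16; RRF also requires a Platinum, Enterprise, or trial license). The images index, the field names, the inference endpoint id, and the query vector are assumptions for illustration.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
query_vector = [0.12, -0.45, 0.07]  # in practice, the embedding of "brown fox"

resp = es.search(
    index="images",
    retriever={
        "rrf": {
            "retrievers": [
                # Lexical retriever over the captions.
                {"standard": {"query": {"match": {"caption": "brown fox"}}}},
                # Sparse (ELSER) retriever over the textual descriptions.
                {
                    "standard": {
                        "query": {
                            "sparse_vector": {
                                "field": "description_tokens",
                                "inference_id": "my-elser-endpoint",
                                "query": "brown fox",
                            }
                        }
                    }
                },
                # Dense retriever over the image embeddings.
                {
                    "knn": {
                        "field": "image_embedding",
                        "query_vector": query_vector,
                        "k": 5,
                        "num_candidates": 50,
                    }
                },
            ],
            "rank_window_size": 50,
            "rank_constant": 60,
        }
    },
    size=10,
)
```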
You can even add another lexical search query to the mix using another standard retriever, and it would look like this: The above payload highlights that we can leverage every possible way to specify a hybrid query containing a lexical search query, a vector search query, and a semantic search query. Limitations The main limitation to be aware of when evaluating the ELSER sparse model is that it only supports up to 512 tokens when running text inference. So, if your data contains longer text excerpts that you need to be fully searchable, you are left with three options: a) use another model that supports longer text, b) split your text into smaller segments, or c) if you are on 8.15 or above, leverage the semantic_text field type, which handles automatic chunking. Optimizations It is undeniable that vectors, whether sparse or dense, can get quite long, from a few dozen to a few thousand dimensions depending on the inference model that you’re using. Also, whether you’re running a text inference on a small sentence containing just a few words or a large body of text, the generated embeddings vector representing the meaning will always have as many dimensions as configured in the model you’re using. As a result, these vectors can take quite some space in your documents and, hence, on your disk. The most obvious optimization to cater to this issue is to configure your index mapping to remove the vector fields (i.e., both dense_vector and sparse_vector) from your source documents. By doing so, the vector values would still be indexed and searchable, but they would not be part of your source documents anymore, thus reducing their size substantially. It’s pretty simple to achieve this by configuring your mapping to exclude the vector fields from the _source, as shown in the code below: In order to show you some concrete numbers, we have run a quick experiment. We have loaded an index with the msmarco-passagetest2019-top1000 data set, which is a subset of the Microsoft MARCO Passage Ranking full data set. The 60 MB TSV file contains 182,469 text passages. Next, we have created another index containing the raw text and the embeddings vectors (dense) generated from the msmarco-MiniLM-L-12-v3 sentence-transformer model available from Hugging Face. We’ve then repeated the same experiment, but this time configuring the mapping to exclude the dense vector from the source documents. We’ve also run the same test with the ELSER sparse model, one time by storing the sparse_vector field inside the documents and one time by excluding them. Table 2, below, shows the size of each resulting index, whose names are self-explanatory. We can see that by excluding dense vector fields from the source, the index size is divided by 3, and by almost 3.5 in the sparse_vector case. Table 2: Index size (in MB): index-with-dense-vector-in-source: 376; index-without-dense-vector-in-source: 119; index-with-sparse_vector-in-source: 1,300; index-without-sparse_vector-in-source: 387. Admittedly, your mileage may vary; these figures are only indicative and will heavily depend on the nature and size of the unstructured data you will be indexing, as well as the dense or sparse models you are going to choose. A last note of caution worth mentioning concerning this optimization is that if you decide to exclude your vectors from the source, you will not be able to use your index as a source index to be reindexed into another one because your embedding vectors will not be available anymore. 
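The mapping change referred to above (excluding the vector fields from the _source) could look like the following sketch; the index and field names are illustrative, and the same excludes approach applies to both dense_vector and sparse_vector fields.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="passages",
    mappings={
        "_source": {
            # The vectors stay indexed and searchable, but are no longer stored
            # in the _source of each document, shrinking the index substantially.
            "excludes": ["text_embedding", "text_tokens"]
        },
        "properties": {
            "text": {"type": "text"},
            "text_embedding": {
                "type": "dense_vector",
                "dims": 384,
                "index": True,
                "similarity": "cosine",
            },
            "text_tokens": {"type": "sparse_vector"},
        },
    },
)
```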
However, since the index still contains the raw text data, you can use the original ingest pipeline featuring the inference processor to regenerate the embeddings vectors. Let’s conclude In this final article of our series on vector search, we have presented the different types of hybrid search queries supported by Elasticsearch. One option is to use a combination of lexical search (e.g., query ) and vector search (e.g., knn ); the other is to leverage the newly introduced retriever search option with sparse_vector queries. We first did a quick recap of the many advantages of being able to fuse lexical and semantic search results in order to increase accuracy. Along the way, we reviewed two different methods of fusing lexical and semantic search results, namely Convex Combination (CC) and Reciprocal Rank Fusion (RRF), and looked at their respective pros and cons. Then, using some illustrative examples, we showed how Elasticsearch provides hybrid search support for sparse and dense vector spaces alike, using both Convex Combination and Reciprocal Rank Fusion as scoring and ranking methods. We also briefly introduced the Elastic Learned Sparse EncodeR model (ELSER), which is their first attempt at providing an out-of-domain sparse model built on a 30,000 tokens vocabulary. Finally, we concluded by pointing out one limitation of the ELSER model, and we also explained a few ways to optimize your future hybrid search implementations. If you like what you’re reading, make sure to check out the other parts of this series: Part 1: A Quick Introduction to Vector Search Part 2: How to Set Up Vector Search in Elasticsearch Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to The need for hybrid search Timeline The anatomy of hybrid search in Elasticsearch Sparse or dense? Hybrid search support Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch hybrid search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/hybrid-search-elasticsearch", + "meta_description": "Learn about hybrid search, the types of hybrid search queries Elasticsearch supports, and how to craft them." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs. Vector Database Generative AI Integrations Elastic Cloud Hosted YG HM By: Yuvraj Gupta and Hemant Malik On May 21, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures In November, Microsoft announced their custom in-house Arm architecture-based Azure Cobalt CPUs aimed to improve performance and efficiency. As part of our strategic partnership with Microsoft, we are constantly looking for ways to deliver the fruits of innovation to our customers using Elasticsearch on Microsoft Cloud. “At Elastic, we love working with the Microsoft teams, from silicon to models,” said Shay Banon, co-founder and chief technology officer at Elastic . “The rate of progress on the Azure team is impressive, and we are excited to collaborate with them to bring these benefits to our users as fast as possible.” Performance gains for Elastic users on Azure Arm-based VMs Elastic used our macro benchmarking framework for Elasticsearch Rally with the elastic/logs track to determine the maximum indexing performance on the preview of Azure Cobalt-powered virtual machines. Our benchmarks observed up to 37% higher indexing throughput performance when using the Epsv6 VMs compared to prior generation of Arm-based VMs (Epsv5 series) on Azure. “With the introduction of the new Cobalt 100 Arm-based VMs, we aim to deliver Elastic users on Azure up to 37% greater performance compared to the previous generation,” said Paul Nash, Corporate Vice President, product, Azure Infrastructure Platform at Microsoft Corp. “Continual investments like these to deliver better and better performance represent our commitment to provide the best infrastructure powering Elastic Cloud on Azure.” AI innovations in Elastic Cloud on Azure As Microsoft Azure innovates to delivery purpose-built infrastructure for AI using Cobalt 100 Arm-based VMs, we look forward to delivering these performance and efficiency gains to our users through Elastic Cloud on Microsoft Azure. This will empower our users to harness Arm computing innovations_ _as they build their GenAI applications using the Elastic Search AI Platform . 
To learn more about the new Cobalt 100 Arm-based Azure virtual machines, refer to the Microsoft blog . Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Jump to Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Performance gains for Elastic users on Azure Arm-based VMs AI innovations in Elastic Cloud on Azure Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/azure-arm-elasticsearch-performance", + "meta_description": "Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. Search Relevance ML Research AL By: Andre Luiz On April 3, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
Filters and facets are mechanisms used to refine search results, helping users find relevant content or products more quickly. In the classical approach, rules are manually defined. For example, in a movie catalog, attributes such as genre are pre-defined for use in filters and facets. On the other hand, with AI models, new attributes can be automatically extracted from the characteristics of movies, making the process more dynamic and personalized. In this blog, we explore the pros and cons of each method, highlighting their applications and challenges. Filters and facets Before we begin, let's define what filters and facets are. Filters are predefined attributes used to restrict a set of results. In a marketplace, for example, filters are available even before a search is performed. The user can select a category, such as \"Video games\" , before searching for \"PS5\" , refining the search to a more specific subset instead of the entire database. This significantly increases the chances of obtaining more relevant results. Facets work similarly to filters but are only available after the search is performed. In other words, the search returns results, and based on them, a new list of refinement options is generated. For example, when searching for a PS5 console, facets such as storage capacity , shipping cost , and color may be displayed to help users choose the ideal product. Now that we have defined filters and facets, let's discuss the impact of the classical and Machine Learning (ML)-based approaches on their implementation and usage. Each method has advantages and challenges that influence search efficiency. Classical approach In this approach, filters and facets are manually defined based on predefined rules. This means that the attributes available for refining the search are fixed and planned in advance, considering the catalog structure and user needs. For example, in a marketplace, categories such as \"Electronics\" or \"Fashion\" may have specific filters like brand, format and price range. These rules are created statically, ensuring consistency in the search experience but requiring manual adjustments whenever new products or categories emerge. Although this approach provides predictability and control over the displayed filters and facets, it can be limited when new trends arise that demand dynamic refinement. Pros: Predictability and control: Since filters and facets are manually defined, management becomes easier. Low complexity: No need to train models. Ease of maintenance: As rules are predefined, adjustments and corrections can be made quickly. Cons : Reindexing required for new filters: Whenever a new attribute needs to be used as a filter, the entire dataset must be reindexed to ensure that documents contain this information. Lack of dynamic adaptation: Filters are static and do not automatically adjust to changes in user behavior. Implementation of filters/facets – Classical approach In Dev Tools, Kibana , we will create a demonstration of filters/facets using the classical approach . First, we define the mapping to structure the index: The brand and storage fields are set as keyword , allowing them to be used directly in aggregations ( facets ). The price field is of type float , enabling the creation of price ranges . In the next step, the product data will be indexed: Now, let's retrieve classic facets by grouping the results by brand, storage, and price range. In the query, size:0 was defined. 
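To make the classical approach concrete, a faceting request along the lines described above could look like this with the Python client; the search field ("name") is an assumption, while the brand, storage, and price fields follow the mapping discussed in the article:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Classic facets: group matching products by brand, storage, and price range.
# size=0 returns only the aggregations, not the matching documents.
resp = es.search(
    index="videogames",
    size=0,
    query={"match": {"name": "PS5"}},
    aggs={
        "brands": {"terms": {"field": "brand"}},
        "storage": {"terms": {"field": "storage"}},
        "price_ranges": {
            "range": {
                "field": "price",
                "ranges": [
                    {"to": 300},
                    {"from": 300, "to": 500},
                    {"from": 500},
                ],
            }
        },
    },
)
print(resp["aggregations"])
```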
In this scenario, the goal is to retrieve only the aggregation results without including the documents corresponding to the query. The response will include counts for Brand , Storage , and Price , helping to create filters and facets. Machine learning/AI-based approach In this approach, Machine Learning (ML) models, including Artificial Intelligence (AI) techniques, analyze data attributes to generate relevant filters and facets. Instead of relying on predefined rules, ML/AI leverages indexed data characteristics. This enables the dynamic discovery of new facets and filters. Pros : Automatic updates: New filters and facets are generated automatically, without the need for manual adjustments. Discovery of new attributes: It can identify previously unconsidered data characteristics as filters, enriching the search experience. Reduced manual effort: The team does not need to constantly define and update filtering rules as AI learns from available data. Cons: Maintenance complexity: The use of models may require pre-validation to ensure the consistency of the generated filters. Requires ML and AI expertise: The solution demands qualified professionals to fine-tune and monitor model performance. Risk of irrelevant filters: If the model is not well-calibrated, it may generate facets that are not useful for users. Cost: The use of ML and AI may require third-party services, increasing operational costs. It's worth noting that even with a well-calibrated model and a well-crafted prompt, the generated facets should still go through a review step. This validation can be manual or based on moderation rules, ensuring that the content is appropriate and safe. While not necessarily a drawback, it is an important consideration to ensure the quality and suitability of the facets before they are made available to users. Implementation of filters/facets – AI approach In this demonstration, we will use an AI model to automatically analyze product characteristics and suggest relevant attributes. With a well-structured prompt, we extract information from the catalog and transform it into filters and facets. Below, we present each step of the process. Initially, we will use the Inference API to register an endpoint for integration with an ML service. Below is an example of integration with OpenAI's service . Now, we define the pipeline to execute the prompt and obtain the new filters generated by the model. Running a simulation of this pipeline for the \"PlayStation 5\" product, with the following description: Stunning Gaming: Marvel at stunning graphics and experience the features of the new PS5. Breathtaking Immersion: Discover a deeper gaming experience with support for haptic feedback, adaptive triggers, and 3D Audio technology. Slim Design: With the PS5 Digital Edition, gamers get powerful gaming technology in a sleek, compact design. 1TB of Storage: Have your favorite games ready and waiting for you to play with 1TB of built-in SSD storage. Backward Compatibility and Game Boost: The PS5 console can play over 4,000 PS4 games. With Game Boost, you can even enjoy faster, smoother frame rates in some of the best PS4 console games. Let's observe the prompt output generated from this simulation. Now a new field, dynamic_facets , will be added to the new index to store the facets generated by the AI. Using the Reindex API , we will reindex the videogames index to videogames_1 , applying the generate_filter_ai pipeline during the process. This pipeline will automatically generate dynamic facets during indexing. 
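As a rough illustration of the simulation step described above, the pipeline can be dry-run with the simulate API before any reindexing. The pipeline id comes from the article; the document fields and response handling are illustrative assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Dry-run the facet-generating pipeline against one product before reindexing.
resp = es.ingest.simulate(
    id="generate_filter_ai",
    docs=[
        {
            "_source": {
                "name": "PlayStation 5",
                "description": (
                    "Stunning graphics, haptic feedback, adaptive triggers, "
                    "3D Audio, slim design, 1TB SSD storage, backward "
                    "compatibility with over 4,000 PS4 games."
                ),
            }
        }
    ],
)
# Inspect the facets the model proposed for this document.
print(resp["docs"][0]["doc"]["_source"].get("dynamic_facets"))
```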
Now, we will run a search and get the new filters: Results: To symbolize the implementation of the facets, below is a simple front-end: The UI code presented is here . Conclusion Both approaches to creating filters and facets have their benefits and points of concern. The classic approach, based on manual rules, offers control and lower costs but requires constant updates and does not dynamically adapt to new products or features. On the other hand, the AI ​​and Machine Learning-based approach automates facet extraction, making the search more flexible and allowing the discovery of new attributes without manual intervention. However, this approach can be more complex to implement and maintain, requiring calibration to ensure consistent results. The choice between the classic and AI-based approaches depends on the needs and complexity of the business. For simpler scenarios, where data attributes are stable and predictable, the classic approach can be more efficient and easier to maintain, avoiding unnecessary costs with infrastructure and AI models. On the other hand, the use of ML/AI to extract facets can add significant value, improving the search experience and making filtering more intelligent. The important thing is to evaluate whether automation justifies the investment or whether a more traditional solution already meets the business needs effectively. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Filters and facets Classical approach Implementation of filters/facets – Classical approach Machine learning/AI-based approach Implementation of filters/facets – AI approach Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Generating filters and facets using ML - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/filters-facets-using-ml", + "meta_description": "Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. Integrations Ingestion How To JR TM By: Jeffrey Rengifo and Tomás Murúa On February 18, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we'll cover the CLIP multimodal model, explore alternatives, and analyze their pros and cons through a practical example of a mock real estate website that allows users to search for properties using pictures as references. What is CLIP? CLIP (Contrastive Language–Image Pre-training) is a neural network created by OpenAI, trained with pairs of images and texts to solve tasks of finding similarities between text and images and categorize \"zero-shot\" images so the model was not trained with fixed tags but instead, we provide unknown classes for the model so it can classify the image we provide to it. CLIP has been the state-of-the-art model for a while and you can read more articles about it here: Implementing image search How to implement image similarity search However, over time more alternatives have come up. In this article, we'll go through the pros and cons of two alternatives to CLIP using a real estate example. Here’s a summary of the steps we’ll follow in this article: Basic configuration: CLIP and Elasticsearch For our example, we will create a small project with an interactive UI using Python. We will install some dependencies, like the Python transformers, which will grant us access to some of the models we'll use. Create a folder /clip_comparison and follow the installation instructions located here . Once you're done, install the Elasticsearch's Python client , the Cohere SDK and Streamlit : NOTE: As an option, I recommend using a Python virtual environment (venv) . This is useful if you don't want to install all of the dependencies on your machine. Streamlit is an open-source Python framework that allows you to easily get a UI with little code. We'll also create some files to save the instructions we'll use later: app.py : UI logic. /services/elasticsearch.py : Elasticsearch client initialization, queries, and bulk API call to index documents. /services/models.py : Model instances and methods to generate embeddings. index_data.py : Script to index images from a local source. /data : our dataset directory. Our App structure should look something like this: Configuring Elasticsearch Follow the next steps to store the images for the example. We'll then search for them using knn vector queries . Note: We could also store text documents but for this example, we will only search in the images. 
Index Mappings Access Kibana Dev Tools (from Kibana: Management > Dev Tools) to build the data structure using these mappings: The field type dense_vector will store the embeddings generated by the models. The field binary will store the images as base64. Note: It's not a good practice to store images in Elasticsearch as binary. We're only doing it for the practical purpose of this example. The recommendation is to use a static files repository. Now to the code. The first thing we need to do is initialize the Elasticsearch client using the cloud id and api-key . Write the following code at the beginning of the file /services/elasticsearch.py : Configuring models To configure the models, put the model instances and their methods in this file: /services/models.py . The Cohere Embed-3 model works as a web service so we need an API key to use it. You can get a free one here . The trial is limited to 5 calls per minute, and 1,000 calls per month. To configure the model and make the images searchable in Elasticsearch, follow these steps: Convert images to vectors using CLIP Store the image vectors in Elasticsearch Vectorize the image or text we want to compare to the stored images. Run a query to compare the entry of the previous step to the stored images and get the most similar ones. Configuring CLIP To configure CLIP, we need to add to the file models.py, the methods to generate the image and text embeddings. For all the models, you need to declare similar methods: one to generate embeddings from an image (clip_image_embeddings) and another one to generate embeddings from text (clip_text_embeddings). The outputs.detach().cpu().numpy().flatten().tolist() chain is a common operation to convert pytorch tensors into a more usable format: .detach(): Removes the tensor from the computation graph as we no longer need to compute gradients . .cpu(): Moves tensors from GPU to CPU as numpy only supports CPU . .numpy(): Converts tensor to numPy array. .flatten() : Converts into a 1D array. .toList() : Converts into Python List. This operation will convert a multidimensional tensor into a plain list of numbers that can be used for embeddings operations. Let's now take a look at some CLIP alternatives. Alternative 1: JinaCLIP JinaCLIP is a CLIP variant developed by Jina AI, designed specifically to improve the search of images and text in multimodal applications. It optimizes CLIP performance by adding more flexibility in the representation of images and text. Compared to the original OpenAI CLIP model, JinaCLIP performs better in text-to-text, text-to-image, image-to-text, and image-to-image tasks as we can see in the chart below: Model Text-Text Text-to-Image Image-to-Text Image-Image jina-clip-v1 0.429 0.899 0.803 0.916 openai-clip-vit-b16 0.162 0.881 0.756 0.816 %increase vs OpenAI CLIP 165% 2% 6% 12% The capacity to improve precision in different types of queries makes it a great tool for tasks that require a more precise and detailed analysis. You can read more about JinaCLIP here . To use JinaCLIP in our app and generate embeddings, we need to declare these methods: Alternative 2: Cohere Image Embeddings V3 Cohere has developed an image embedding model called Embed-3, which is a popular alternative to CLIP. The main difference is that Cohere focused on the representation of enterprise data like charts, product images, and design files. 
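As an illustration of the two embedding helpers described above, a minimal sketch might look like this; the exact checkpoint, openai/clip-vit-base-patch16, is an assumption:

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Checkpoint name is illustrative; any CLIP checkpoint from Hugging Face works.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")


def clip_image_embeddings(image_path: str) -> list[float]:
    image = Image.open(image_path)
    inputs = processor(images=image, return_tensors="pt")
    outputs = model.get_image_features(**inputs)
    # Detach from the graph, move to CPU, and flatten to a plain list of floats.
    return outputs.detach().cpu().numpy().flatten().tolist()


def clip_text_embeddings(text: str) -> list[float]:
    inputs = processor(text=[text], return_tensors="pt", padding=True)
    outputs = model.get_text_features(**inputs)
    return outputs.detach().cpu().numpy().flatten().tolist()
```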
Embed-3 uses an advanced architecture that reduces the bias risk towards text data, which is currently a disadvantage in other multimodal models like CLIP, so it can provide more precise results between text and image. You can see below a chart by Cohere showing the improved results using Embed 3 versus CLIP in this kind of data: For more info, go to Embed3. Just like we did with the previous models, let's declare the methods to use Embed 3: With the functions ready, let's index the dataset in Elasticsearch by adding the following code to the file index_data.py : Index the documents using the command: The response will show us the amount of elements indexed by index: Once the dataset has been indexed, we can create the UI. Test UI Creating the UI We are going to use Streamlit to build a UI and compare the three alternatives side-by-side. To build the UI, we'll start by adding the imports and dependencies to the file app.py : For this example, we'll use two views; one for the image search and another one to see the image dataset: Let's add the view code for Search Image: And now, the code for the Images view: We'll run the app with the command: Thanks to multimodality, we can run searches in our image database based on text (text-to-image similarity) or image (image-to-image similarity). Searching with the UI To compare the three models, we'll use a scenario in which a real estate webpage wants to improve its search experience by allowing users to search using image or text. We'll discuss the results provided by each model. We'll upload the image of a \"rustic home\": Here we have the search results. As you can see, based on the image we uploaded, each model generated different results: In addition, you can see results based on the text to find the house features: If searching for “modern”, the three models will show good results. But, JinaCLIP and Cohere will be showing the same houses in the first positions. Features Comparison Below you have a summary of the main features and prices of the three options we covered in this article: Model Created by Estimated Price Features CLIP OpenAI $0.00058 per run in Replicate (https://replicate.com/krthr/clip-embeddings) General multimodal model for text and image; suitable for a variety of applications with no specific training. JinaCLIP Jina AI $0.018 per 1M tokens in Jina (https://jina.ai/embeddings/) CLIP variant optimized for multimodal applications. Improved precision retrieving texts and images. Embed-3 Cohere $0.10 per 1M tokens, $0.0001 per data and images at Cohere (https://cohere.com/pricing) Focuses on enterprise data. Improved retrieval of complex visual data like graphs and charts. If you will search on long image descriptions, or want to do text-to-text as well as image-to-text, you should discard CLIP, because both JinaCLIP and Embed-3 are optimized for this use case. Then, JinaCLIP is a general-use model, while Cohere’s one is more focused on enterprise data like products, or charts. 
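For comparison, the Embed-3 helpers could be outlined along these lines with the Cohere Python SDK. The exact parameters for image inputs vary between SDK versions, so treat this as a sketch to verify rather than a drop-in implementation:

```python
import base64

import cohere

co = cohere.Client("<COHERE_API_KEY>")


def cohere_text_embeddings(text: str) -> list[float]:
    # input_type distinguishes queries from stored documents in Embed v3.
    resp = co.embed(
        texts=[text],
        model="embed-english-v3.0",
        input_type="search_query",
    )
    return resp.embeddings[0]


def cohere_image_embeddings(image_path: str) -> list[float]:
    # Assumption: Embed v3 accepts images as base64 data URLs via an `images`
    # parameter; check the parameter names against your SDK version.
    with open(image_path, "rb") as f:
        data_url = "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()
    resp = co.embed(
        model="embed-english-v3.0",
        input_type="image",
        images=[data_url],
    )
    return resp.embeddings[0]
```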
When testing the models on your data, make sure you cover: All modalities you are interested in: text-to-image, image-to-text, text-to-text Long and short image descriptions Similar concept matches (different images of the same type of object) Negatives Hard negative: Similar to the expected output but still wrong Easy negative: Not similar to the expected output and wrong Challenging scenarios: Different angles/perspectives Various lighting conditions Abstract concepts (\"modern\", \"cozy\", \"luxurious\") Domain-specific cases: Technical diagrams or charts (especially for Embed-3) Product variations (color, size, style) Conclusion Though CLIP is the preferred model when doing image similarity search, there are both commercial and non-commercial alternatives that can perform better in some scenarios. JinaCLIP is a robust all-in-one tool that claims to be more precise than CLIP in text-to-text embeddings. Embed-3 follows Cohere's line of catering to business clients by training models with real data using typical business docs. In our small experiment, we could see that both JinaClip and Cohere show interesting image-to-image and text-to-image results and perform very similarly to CLIP with these kinds of tasks. Elasticsearch allows you to search for embeddings, combining vector search with full-text-search, enabling you to search both for images and for the text in them. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to What is CLIP? Basic configuration: CLIP and Elasticsearch Configuring Elasticsearch Index Mappings Configuring models Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
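As a reference for the retrieval side of this walkthrough, the kNN query used to compare a query embedding against the stored image vectors might look roughly like this; the index and field names are assumptions, and clip_text_embeddings refers to the helper sketched earlier:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Embed the query text (or an uploaded image) and compare it against the
# stored image embeddings. Index and field names are illustrative.
query_vector = clip_text_embeddings("rustic home")

resp = es.search(
    index="images-clip",
    knn={
        "field": "image_embedding",
        "query_vector": query_vector,
        "k": 5,
        "num_candidates": 50,
    },
    source_excludes=["image_embedding"],  # keep the response payload small
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("image_name"))
```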
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Exploring CLIP alternatives - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/openai-clip-alternatives", + "meta_description": "Learn about OpenAI’s CLIP, its configuration, and alternatives for image-to-image and text-to-image search like JinaCLIP & Cohere Image Embeddings V3." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch vector database for native grounding in Google Cloud’s Vertex AI Platform Elasticsearch is now publicly available as the first third-party native grounding engine for Google Cloud’s Vertex AI platform and Google’s Gemini models. It enables joint users to build fully customizable GenAI experiences grounded in enterprise data, powered by the best-of-breed Search AI capabilities from Elasticsearch. Integrations VA By: Valerio Arvizzigno On April 9, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elastic is thrilled to announce that the Elasticsearch vector database is now integrated into Google Cloud’s Vertex AI platform as a natively supported information retrieval engine, empowering users to leverage the multimodal strengths of Google’s Gemini models with the advanced AI-powered semantic and hybrid search capabilities of Elasticsearch. Developers can now create their RAG applications within a unified journey, grounding their chat experiences on their private data in a low-code, flexible way. Whether you’re building AI agents for your customers and internal employees or leveraging LLMs generation within your software, the Vertex AI platform puts Elasticsearch relevance at your fingertip with minimal configuration. This integration allows easier and faster adoption for Gemini models in production use cases, driving GenAI from PoCs to real-life scenarios. In this blog, we will walk you through integrating Elasticsearch with Google Cloud’s Vertex AI platform for seamless data grounding and building fully customizable GenAI applications. Let’s discover how. Google Cloud’s Vertex AI and Gemini models grounded on your data with Elasticsearch Users leveraging Vertex AI services and tools for creating GenAI applications can now access the new “Grounding” option to bring their private data into their conversational interaction automatically. Elasticsearch is now part of this feature and could be used via both: Vertex AI LLM APIs , which directly enrich Google’s Gemini models at generation time (preferred); Grounded Generation API , used instead in the Vertex AI Agent Builder ecosystem to build agentic experiences. With this integration, Elasticsearch – the most downloaded and deployed vector database – will bring your relevant enterprise data wherever it’s needed in your internal end customer-facing chats, which is crucial for the real-life adoption of GenAI into business processes. The aforementioned APIs will allow developers to adopt this new partnered feature in their code. However, prompt engineering and testing remain crucial steps in the application development and serve as an initial discovery playground. 
To support this, Elasticsearch is designed for easy evaluation by users within the Vertex AI Studio console tool. All it takes is a few simple steps to configure the Elastic endpoints with the desired parameters (index to be searched, the number of documents to be retrieved, and the desired search template) within the “Customize Grounding” tab in the UI, as shown below (note that for it to work, you have to type in the API key with the word \"ApiKey\" in the UI and code examples below). Now you’re ready to generate with your private knowledge! Production-ready GenAI applications with ease Elastic and Google Cloud work to provide developer-first, comprehensive, and enjoyable experiences. Connecting to Elastic natively in both LLM and Grounding Generation API reduces complexity and overhead while building GAI applications on Vertex AI, avoiding unnecessary additional APIs and data orchestration while grounding in just one unified call. Let’s see how it works in both scenarios. The first example is executed with the LLM API: In the above example, with the retrieval field of the API requesting content generation to Gemini 2.0 Flash, we can contextually set a retrieval engine for the request. Setting api_spec to “ELASTIC_SEARCH” enables the usage of additional configuration parameters such as the API Key and the cluster endpoint (needed to route a request to your Elastic cluster), the index to retrieve data from, and the Search template to be used for your search logic. Similarly, the same outcome could be achieved with the Grounding Generation API, setting the groundingSpec parameter: With both approaches, the response will provide an answer with the most relevant private documents found in Elasticsearch – and the related connected data sources – to support your query. Simplicity, however, should not be confused with a lack of personalization to fulfill your specific needs and use cases. With this in mind, we designed it to allow you to perfectly adapt the search configuration to your scenario. Fully customizable search at your fingertips: search templates To provide maximum customization to your search scenario, we’ve built, in collaboration with Google Cloud,the experience on top of our well-known Search Templates . Elasticsearch search templates are an excellent tool for creating dynamic, reusable, and maintainable search queries. They allow you to predefine and reuse query structures. They are particularly useful when executing similar queries with different parameters, as they save development time and reduce the chance of errors. Templates can include placeholders for variables, making the queries dynamic and adaptable to different search requirements. While using Vertex AI APIs and Elasticsearch for grounding, you must reference a desired search template – as shown in the code snippets above – where the search logic is implemented and pushed down to Elasticsearch. Elastic power users can asynchronously manage, configure, and update the search approaches and tailor them to the specific indices, models, and data in a fully transparent way for Vertex AI users, web-app developers, or AI engineers, who only need to specify the name of the template in the grounding API. This design allows for complete customization, putting the extensive Elasticsearch retrieval features at your disposal in a Google Cloud AI environment while ensuring modularity, transparency, and ease of use for different developers, even those unfamiliar with Elastic. 
Whenever you need BM25 search, semantic search, or a hybrid approach between the two (Have you explored retrievers already? Composable retrieval techniques in a single search API call), you can define your custom logic in a search template, which Vertex AI can automatically leverage. This also applies to embeddings and reranking models you choose to manage vectors and results. Depending on your use case, you may want to host models on Elastic’s ML nodes, use a third-party service endpoint through the Inference API, or run your local model on-prem. This is doable via a search template, and we’ll see how it works in the next section. Start with reference templates, then build your own To help you get started quickly, we’ve provided a set of compatible search template samples to be used as an initial reference; you can then modify and build your custom ones upon: Semantic Search with ELSER model (sparse vectors and chunking) Semantic Search with e5 multilingual model (dense vectors and chunking) Hybrid Search with Vertex AI text-embedding model You can find them in this GitHub repo . Let’s look at one example: creating embeddings with Google Cloud’s Vertex AI APIs on a product catalog. First, we need to create the search template in Elasticsearch as shown below: In this example, we will execute KNN search on two fields within one single search: title_embedding – the vector field containing the name of the product – and description_embedding – the one containing the representation of its description. You can leverage the excludes syntax to avoid returning unnecessary fields to the LLM, which may cause noise in its processing and impact the quality of the final answer. In our example, we excluded the fields containing vectors and image urls. Vectors are created on the fly at query time on the submitted input via an inference endpoint to the Vertex AI embeddings API, googlevertexai_embeddings_004 , previously defined as follows: You can find additional information on how to use Elastic’s Inference API here . We’re now ready to test our templated search: The params fields will replace the variables we set in the template scripts in double curl brackets. Currently, Vertex AI LLM and Grounded Generation APIs can send to Elastic the following input variables: “query” - the user query to be searched “index_name” - the name of the index where to search “num_hits” - how many documents we want to retrieve in the final output Here’s a sample output: The above query is precisely what Google Cloud’s Vertex AI will run on Elasticsearch behind the scenes when referring to the previously created search template. Gemini models will use the output documents to ground its answer: when you ask “What do I need to patch my drywall?” instead of getting a generic suggestion, the chat agent will provide you with specific products! End-to-end GenAI journey with Elastic and Google Cloud Elastic partners with Google Cloud to create production-ready, end-to-end GenAI experiences and solutions. As we’ve just seen, Elastic is the first ISV to be integrated directly into the UI and SDK for the Vertex AI platform, allowing seamless, grounded Gemini models prompts and agents using our vector search features. Moreover, Elastic integrates with Vertex AI and Google AI Studio ’s embedding, reranking, and completion models to create and rank vectors without leaving the Google Cloud landscape, ensuring Responsible AI principles. By supporting multimodal approaches, we jointly facilitate applications across diverse data formats. 
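To make the template mechanics concrete, here is a rough sketch of storing and invoking such a template from Python. The field names and the inference endpoint id follow the article's description, but they are assumptions for your own data, and query_vector_builder support for Inference API endpoints should be checked against your Elasticsearch version:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Mustache template: kNN over the title and description embeddings, vectors
# built at query time via the Vertex AI inference endpoint, vector and image
# fields excluded from the documents returned to the LLM.
template_source = """
{
  "_source": {"excludes": ["title_embedding", "description_embedding", "image_url"]},
  "size": {{num_hits}},
  "knn": [
    {
      "field": "title_embedding",
      "k": {{num_hits}},
      "num_candidates": 50,
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "googlevertexai_embeddings_004",
          "model_text": "{{query}}"
        }
      }
    },
    {
      "field": "description_embedding",
      "k": {{num_hits}},
      "num_candidates": 50,
      "query_vector_builder": {
        "text_embedding": {
          "model_id": "googlevertexai_embeddings_004",
          "model_text": "{{query}}"
        }
      }
    }
  ]
}
"""
es.put_script(
    id="vertexai-grounding-template",
    script={"lang": "mustache", "source": template_source},
)

# Invoke it the same way Vertex AI would, passing query and num_hits as params.
resp = es.search_template(
    index="product-catalog",
    id="vertexai-grounding-template",
    params={"query": "What do I need to patch my drywall?", "num_hits": 3},
)
```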
You can tune, test, and export your GenAI search code via our Playground . But it’s not just about building search apps: Elastic leverages Gemini models to empower IT operations, such as in the Elastic AI Assistants, Attack Discovery, and Automatic Import features , reducing daily fatigue for security analysts and SREs on low-value tasks, and allowing them to focus on improving their business. Elastic also enables comprehensive monitoring of Vertex AI usage , tracking metrics and logs, like response times, tokens, and resources, to ensure optimal performance. Together, we manage the complete GenAI lifecycle, from data ingestion and embedding generation to grounding with hybrid search, while ensuring robust observability and security of GenAI tools with LLM-powered actions. Explore more and try it out! Are you interested in trying this out? The feature is currently GA on your Google Cloud projects! If you haven’t already, one of the easiest ways to get started with Elastic Search AI Platform and explore our capabilities is with your free Elastic Cloud trial or by subscribing through Google Cloud Marketplace . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Jump to Google Cloud’s Vertex AI and Gemini models grounded on your data with Elasticsearch Production-ready GenAI applications with ease Fully customizable search at your fingertips: search templates Start with reference templates, then build your own End-to-end GenAI journey with Elastic and Google Cloud Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch vector database for native grounding in Google Cloud’s Vertex AI Platform - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-google-cloud-vertex-ai-native-grounding", + "meta_description": "Elasticsearch is now publicly available as the first third-party native grounding engine for Google Cloud’s Vertex AI platform and Google’s Gemini models. It enables joint users to build fully customizable GenAI experiences grounded in enterprise data, powered by the best-of-breed Search AI capabilities from Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elastic Cloud adds Elasticsearch Vector Database optimized instance to Google Cloud Elasticsearch's vector search optimized profile for GCP is available. Learn more about it and how to use it in this blog. Vector Database Generative AI Elastic Cloud Hosted SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta On April 25, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elastic Cloud Vector Search optimized hardware profile is available for Google Elastic Cloud users. This hardware profile is optimized for applications that require the storage of dense or sparse embeddings for search and Generative AI use cases powered by RAG (retrieval augmented generation). This release follows the previous release of a Vector Search optimized hardware profile for AWS Elastic Cloud users in Nov 2023. GCP Vector Search optimized instances: what you need to know Elastic Cloud users benefit from having Elastic managed infrastructure across all major cloud providers (GCP, AWS and Azure) along with wide region support for GCP users. For more specific details on the instance configuration for this hardware profile, refer to our documentation for instance type: gcp.es.datahot.n2d.64x8x11 Vector Search, HNSW, and memory Elasticsearch uses the Hierarchical Navigable Small World graph (HNSW) data structure to implement its Approximate Nearest Neighbor search (ANN). Because of its layered approach, HNSW's hierarchical aspect offers excellent query latency. To be most performant, HNSW requires the vectors to be cached in the node's memory. This caching is done automatically and uses the available RAM not taken up by the Elasticsearch JVM. Because of this, memory optimizations are important steps for scalability. Consult our vector search tuning guide to determine the right setup for your vector search embeddings and whether you have adequate memory for your deployment. With this in mind, the Vector Search optimized hardware profile is configured with a smaller than standard Elasticsearch JVM heap setting. 
This provides more RAM for caching vectors on a node, allowing users to provision fewer nodes for their vector search use cases. If you’re using compression techniques like scalar quantization , the memory requirement is lowered by a factor of 4. To store quantized embeddings (available in versions Elasticsearch 8.12 and later) simply ensure that you’re storing in the correct element_type: byte . To utilize our automatic quantization of float vectors update your embeddings to use index type: int8_hnsw like in the following mapping example. In upcoming versions, Elasticsearch will provide this as the default mapping, removing the need for users to adjust their mapping. Combining this optimized hardware profile with Elasticsearch’s automatic quantization are two examples where Elastic is focused on vector search to be cost-effective while still being extremely performant. Getting Started with Elastic Cloud vector search optimized profile for GCP Start a free trial on Elastic Cloud and simply select the new Vector Search optimized profile to get started. Migrating existing Elastic Cloud deployments Migrating to this new Vector Search optimized hardware profile is a few clicks away. Simply navigate to your Elastic Cloud management UI, click to manage the specific deployment, and edit the hardware profile. In this example, we are migrating from a ‘Storage optimized’ profile to the new ‘Vector Search’ optimized profile. When choosing to do so, while there is a reduction to available storage and vCPU, what is gained is the ability to store more vectors per memory with vector search. Migrating to a new hardware profile uses the grow and shrink approach for deployment changes. This approach adds new instances, migrates data from old instances to the new ones, and then shrinks the deployment by removing the old instances. This approach allows for high availability during configuration changes even for single availability zones. The following image shows a typical architecture for a deployment running in Elastic Cloud, where vector search will be the primary use case. This example deployment uses our new Vector Search optimized hardware profile, now available in GCP. This setup includes: Two data nodes in our hot tier with our vector search profile One Kibana node One Machine Learning node One integration server One master tiebreaker By deploying these two “full-sized” data nodes with the Vector Search optimized hardware profile and while taking advantage of Elastic’s automatic dense vector scalar quantization , you can index roughly 60 million vectors, including one replica (with 768 dimensions). Conclusion Vector search is a powerful tool when building modern search applications, be it for semantic document retrieval on its own or integrating with an LLM service provider in a RAG setup . Elasticsearch provides a full-featured vector database natively integrated with a full-featured search platform. Along with improving vector search feature set and usability, Elastic continues to improve scalability. The vector search node type is the latest example, allowing users to scale their search application. Elastic is committed to providing scalable, price effective infrastructure to support enterprise grade search experiences. Customers can depend on us for reliable and easy to maintain infrastructure and cost levers like vector compression, so you benefit from the lowest possible total cost of ownership for building search experiences powered by AI. 
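For reference, the int8_hnsw mapping mentioned earlier might be sketched as follows; the index name and dimension count are illustrative:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Minimal sketch of automatic scalar quantization: float vectors are indexed
# into an int8-quantized HNSW graph, cutting the memory needed per vector.
es.indices.create(
    index="quantized-vectors",
    mappings={
        "properties": {
            "my_vector": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "int8_hnsw"},
            }
        }
    },
)
```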
Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to GCP Vector Search optimized instances: what you need to know Vector Search, HNSW, and memory Getting Started with Elastic Cloud vector search optimized profile for GCP Migrating existing Elastic Cloud deployments Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Cloud adds Elasticsearch Vector Database optimized instance to Google Cloud - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-vector-profile-gcp", + "meta_description": "Elasticsearch's vector search optimized profile for GCP is available. Learn more about it and how to use it in this blog." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Storage wins for time-series data in Elasticsearch Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency. Search Analytics How To MG KK By: Martijn Van Groningen and Kostas Krikellas On June 10, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
In this article, we describe the most impactful storage improvements incorporated in our Elasticsearch time-series data offering and provide insights into the scenarios we expect our system to perform better - and worse - with regard to storage efficiency. Background Elasticsearch has recently invested in better support for storing and querying time-series data. Storage efficiency has been a main area of focus, with many projects delivering substantial wins that can lead to savings of up to 60-80%, compared to saving the data in standard indices. In certain scenarios, our system achieves storage efficiency of less than one byte per data point, competing head-to-head with state-of-the-art, specialized TSDB systems. Let's take a look at the recent improvements in storage efficiency for time-series data. Storage improvements in Elasticsearch time-series data Synthetic source Elasticsearch stores the original JSON document body in the _source field by default. This duplication penalizes storage with diminishing returns for metrics, as they are normally inspected through aggregation queries that don’t use this field. To mitigate this, we introduced synthetic _source that reconstructs a flavor of the original _source on demand, using the data stored in the document fields. The caveats are that a limited number of field types are supported and _source synthesizing is slower than retrieving it from a stored field. Still, these restrictions are largely irrelevant for metrics datasets that mostly rely on keyword, numeric, boolean and IP fields and use aggregate queries that don’t take the _source content into account. We’re separately working on eliminating these limitations to make synthetic source applicable to any mapping. The storage wins are immediate and apparent: enabling synthetic source reduces the size of time series data stream (TSDS) indices by 40-60% (more on performance evaluation below). Synthetic source is thus used by default in TSDS since it was released (v.8.7). Specialized codecs TSDB systems make heavy use of specialized codecs that take advantage of the chronological ordering of recorded metrics to reduce the number of bytes per data point. Our offering extends the standard Lucene codecs with support for run-length encoding , delta-of-deltas (2nd derivative), GCD and XOR encoding for numeric values. Codecs are specified at the Lucene segment level, so older indices can take advantage of the latest codecs when indexing fresh data. To boost the efficiency of these compression techniques, indices get sorted by an identifier calculated over all dimension fields (ascending order), and then by timestamp (descending order, to return the latest data point per time series). This way, dimension fields (mostly keywords) get efficiently compressed with run-length encoding, while numeric values for metrics get clustered per time-series and ordered by time. Since most time-series change slowly over time, with occasional spikes, and Elasticsearch relies on Lucene’s vertically partitioned storage engine , this approach minimizes deltas between successively stored data and boosts storage efficiency. Metadata trimming The _id field is a metadata field used to uniquely identify each document in Elasticsearch. It has limited value for metrics applications, since time-series analysis relies on queries aggregating values over time, rather than inspecting individual metric values. 
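Before moving on to how TSDS handles the _id field, here is a rough sketch of an index template that ties the pieces above together: dimensions, metrics, and the time series index mode. Field names are illustrative, and synthetic _source is the TSDS default since 8.7, so it is not set explicitly:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Sketch of a time series data stream (TSDS) template: dimension fields drive
# index sorting and routing, metric fields get the time-series codecs.
es.indices.put_index_template(
    name="k8s-metrics",
    index_patterns=["metrics-k8s-*"],
    data_stream={},
    template={
        "settings": {
            "index.mode": "time_series",
            "index.routing_path": ["pod", "node"],
        },
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "pod": {"type": "keyword", "time_series_dimension": True},
                "node": {"type": "keyword", "time_series_dimension": True},
                "cpu_usage": {"type": "double", "time_series_metric": "gauge"},
                "network_rx": {"type": "long", "time_series_metric": "counter"},
            }
        },
    },
)
```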
To that end, TSDS trims the stored values but keeps the inverted index for this field to still support doc retrieval queries. This leads to 10-20% storage reduction with no loss of functionality. Lifecycle integration TSDSs can be integrated with data lifecycle management mechanisms, namely ILM and Data Stream Lifecycle . These tools automate deleting older indexes, while ILM also supports moving indices to tiers with cheaper storage (e.g. using spinning disks or archival cloud storage) as they age. Lifecycle management reduces storage costs, with no compromise on querying performance for frequently-accessed metrics and with minimal user involvement. Downsampling In many metrics applications, it’s preferable to keep finely-grained data in the short term only (e.g. per-minute data for the last week), and acceptable to increase granularity for older data to save on storage (e.g. per-hour data for the last month, per-day data for the last 2 years). Downsampling replaces raw metrics data with a statistical representation of pre-aggregated metrics over configurable time periods (e.g. hourly or daily). This improves both storage efficiency, since the size of downsampled indices is a fraction of the raw metrics indices, and querying performance, since aggregation queries scan pre-aggregated results instead of calculating them over raw data on-the-fly. Downsampling is integrated with ILM and DSL that automate its application and allow for different resolutions of downsampled data as they age. Test results for TSDS storage efficiency TSDS storage gains We track performance, including storage usage and efficiency, for TSDS through nightly benchmarks . The TSDB track (see disk usage visualization) visualizes the impact of our storage improvements. We’ll next present storage usage before TSDS was released, how it improved when TSDS was GA-ed, and what’s the current status. The TSDB track’s dataset ( k8s metrics) has nine dimension fields, with each document containing 33 fields (metrics and dimensions) on average. The index contains a day's worth of metrics over 116,633,696 documents. Indexing the TSDB track’s dataset before ES version 8.7 required 56.9GB of storage. It is interesting to break this down by metadata fields, the timestamp field, dimension fields and metric fields to gain insight into storage usage: Field name Percentage _id 5.1% _seq_no 1.4% _source 78.0% @timestamp 1.31% Dimension fields 2.4% Metric fields 5.1% Other fields 9.8% The _source metadata field is the largest contributor to the storage footprint by far. Synthetic source was one of the improvements that our metrics effort motivated to improve storage efficiency, as mentioned earlier. This is evident in ES 8.7 that uses synthetic source for TSDS by default. In this case, the storage footprint drops to 6.5GB - a 8.75x improvement in storage efficiency. Breaking this down by field type: Field name Percentage _id 18.7% _seq_no 14.1% @timestamp 12.6% Dimension fields 3.6% Metric fields 12.0% Other fields 50.4% The improvement is due to the _source no longer being stored, as well as applying index sorting to store metrics from the same time series sequentially, thus boosting the efficiency of standard Lucene codecs. Indexing the TSDB track’s dataset with ES 8.13.4 occupies 4.5GB of storage - a further 44 % improvement. 
The breakdown by field type is: Field name Percentage _id 12.2% _seq_no 20.6% @timestamp 14.0% Dimension fields 1.6% Metric fields 6.7% Other fields 58.6% This is a substantial improvement, compared to the 8.7.0 version. The main contributing factors to the latest iteration are the _id field taking up less storage space (its stored values get trimmed), while dimension fields and other numeric fields get compressed more efficiently using the latest time-series codecs. The majority of storage is now attributed to “other fields”, i.e. fields providing context similar to dimensions but are not used in calculating the identifier that’s used for index sorting, so their compression is not as efficient as with dimension fields. Downsample storage gains Downsampling trades querying resolution for storage gains, depending on the downsampling interval. Downsampling the metrics in TSDB track’s dataset (with metrics collected every 10 seconds) using a 1-minute interval results in an index of 748MB - a 6x improvement. The downside is that metrics get pre-aggregated on a per-minute granularity, so it’s no longer possible to inspect individual metric recordings or aggregate over sub-minute intervals (e.g. per 5 seconds). Most importantly, aggregation results on the pre-computed statistics (min, max, sum, count, average) are the same as if calculated over the original data, so downsampling doesn’t incur any cost in accuracy. If lower resolution can be tolerated and metrics get downsampled using an hourly interval, the resulting downsampled index will use just 56MB of storage. Note that the improvement is 13.3x , i.e. lower than 60x that one would expect from switching from a per-minute downsampling interval to a per-hour one. This is due to additional metadata that all indices require to store per segment, a constant overhead that becomes more noticeable as the index size reduces. Putting everything together The following graph shows how storage efficiency evolved across versions, as well as what additional savings downsampling can provide. Kindly note that the vertical axis is in logarithmic scale. In total, we achieved a 12.5x improvement in storage efficiency for our metrics offering over the past releases. This can even reach 1000x or better, if we trade bucketing resolution for reduced storage footprint through downsampling. Configuration hints for TSDS In this section, we explore best practices for configuring a TSDS with storage efficiency in mind. Favor many metrics per document While Elasticsearch uses vertical partitioning to store each field separately, fields are still grouped logically in docs. Since metrics share dimensions that are included in the same doc, the storage overhead for dimensions and metadata gets better amortized when we include as many metrics as possible in each indexed doc. On the flip side, storing a single metric in each doc, along with its associated dimensions, maximizes the overhead of dimensions and metadata and bloats storage. More concretely, we used synthetic datasets to quantify the impact of the number of metrics per document. When we included all metrics (20) in each indexed doc, TSDS used as little as 0.9 bytes per data point - approximating the performance of state-of-the-art, purpose-built metrics systems (0.7 bytes per data point) that lack the rich indexing and querying capabilities of Elasticsearch for unstructured data. 
Conversely, when each indexed doc had a single metric, TSDS required 20 bytes per data point , a substantial increase in the storage footprint. It therefore pays off to group together as many metrics as possible in each indexed doc, sharing the same dimensions values. Trim unnecessary dimensions The Elasticsearch architecture allows our metrics offering to scale far better than competing systems, when it comes to the number of time series per metric (i.e. the product of the dimension cardinalities) in the order of millions or more, at a manageable performance cost. Still, dimensions do take considerable space and high cardinalities reduce the efficiency of our compression techniques for TSDS. It’s therefore important to carefully consider what fields are included in indexed documents for metrics and aggressively prune dimensions to the minimum required set for dashboards and troubleshooting. One interesting example here was an Observability mapping including an IP field that turned out to contain up to 16 IP ( v4, v6) addresses of the hosting machine. It had a substantial impact on both storage footprint and indexing throughput and was hardly used. Replacing it with a machine label led to a sizable storage improvement with no loss of debuggability. Use lifecycle management ILM facilitates moving older, infrequently-accessed data to cheaper storage options, and both ILM and Data Stream Lifecycle can handle deleting metrics data as they age. This fully-automated approach reduces storage costs without changing index mappings or configuration and is thus highly encouraged. More so, it is worth considering trading metrics resolution for storage through downsampling, as data ages. This technique leads to both substantial storage wins and more responsive dashboards, assuming that the reduction in bucketing resolution is acceptable for older data - a common case in practice, as it’s fairly rare to inspect months-old data at a per-minute granularity, for instance. Next steps We’ve achieved a significant improvement in the storage footprint for metrics over the past years. We intend to apply these optimizations to additional data types beyond metrics, and specifically logs data. While some features are metrics-specific, such as downsampling, we still hope to see reductions in the order of 2-4x using a logs-specific index configuration. Despite reducing the storage overhead of metadata fields that all Elasticsearch indices require, we plan to trim them more aggressively. Good candidates are the _id and _seq_no fields. Furthermore, there are opportunities to apply more advanced indexing techniques, such as sparse indices , to timestamps and other fields supporting range queries. The downsampling mechanism has a big potential for improved querying performance, if a small storage penalty is acceptable. One idea is to support multiple downsampling resolutions (e.g. raw, per-hour and per-day) on overlapping time periods, with the query engine automatically picking the most appropriate resolution for each query. This would allow users to spec downsampling to match their dashboard time scaling and make them more responsive, as well as kick off downsampling within minutes after indexing. It would also unlock keeping raw data along with downsampled, potentially using a slower/cheaper storage layer. 
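The configuration hints above (explicit dimensions, many metrics per document, downsampling as data ages) translate into a fairly small amount of setup. Below is a minimal sketch assuming a hypothetical Kubernetes metrics data stream; the template, field, and index names are illustrative and not from the original post.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection details

# Index template for a TSDS: keep many metrics per document and few, low-cardinality
# dimensions, as recommended above. Template and field names are made up.
es.indices.put_index_template(
    name="k8s-metrics-template",
    index_patterns=["k8s-metrics-*"],
    data_stream={},  # the template creates a time series data stream
    template={
        "settings": {
            "index.mode": "time_series",            # enables TSDS index sorting and codecs
            "index.routing_path": ["k8s.pod.uid"],  # dimension(s) used for routing
        },
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "k8s.pod.uid": {"type": "keyword", "time_series_dimension": True},
                "k8s.pod.cpu.usage": {"type": "double", "time_series_metric": "gauge"},
                "k8s.pod.memory.usage": {"type": "long", "time_series_metric": "gauge"},
            }
        },
    },
)

# Older, write-blocked backing indices can then be downsampled, e.g. to 1-hour buckets:
#   POST /.ds-k8s-metrics-.../_downsample/k8s-metrics-...-1h
#   { "fixed_interval": "1h" }
```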
Try it out Sign up for Elastic Cloud , if you don’t have an account yet Configure a TSDS and use it for storing and querying metrics Explore downsampling to see if it fits your use case Enjoy the storage savings Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Background Storage improvements in Elasticsearch time-series data Synthetic source Specialized codecs Metadata trimming Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Storage wins for time-series data in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/time-series-data-elasticsearch-storage-wins", + "meta_description": "Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. AutoOps Elastic Cloud Hosted ZS OS By: Ziv Segal and Ori Shafir On November 6, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. While Elasticsearch is a powerful and scalable search engine that offers a vast selection of capabilities, many users find it challenging due to its sometimes complex administration and management experience. 
We hear you, and we're excited to share some big news! The Opster team has been hard at work making AutoOps even better, and a seamless part of the Elastic platform. AutoOps is available in select Elastic Cloud regions, and coverage is rapidly expanding! AutoOps makes Elastic Cloud easy to operate AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. With AutoOps, you will be able to: Minimize administration time with insights tailored to your Elasticsearch utilization and configuration Analyze hundreds of Elasticsearch metrics in real-time with pre-configured alerts to detect and flag issues before they become critical Get root cause analysis with drill-downs to point-in-time of issue occurrence, and resolution suggestions including in-context Elasticsearch commands Improve resource utilization by providing optimization suggestions In each of the scenarios below, let’s see examples of issues that users may come across, and how AutoOps insights (screenshots) can help right away! Real scenarios: how AutoOps makes Elasticsearch easy to operate The scenarios below provide real-world issues and how AutoOps provide root cause analysis, with drill-downs to point-in-time of issue occurrence, and recommendations on how to resolve the issue. Scenario #1: Finding a query causing severe search latency Issue: Users complain that their dashboards are slow and take a long time to load… AutoOps insight: AutoOps reports a “Long running search task” event, identifying a search running for 4 minutes with 4 nested aggregations and suggesting ways to optimize the query causing the latency. Resolution AutoOps provides a cURL command to cancel the query. By identifying and canceling the long running search task, the administrator was able to block this specific query. AutoOps monitors the Task Management API and flags long running search tasks providing an easy way to detect long running search queries and optimize them. AutoOps provides in-context Elasticsearch commands to resolve the issues, such as canceling the long running search task Scenario #2: Ineffective use of data tiering, leading to slow search and indexing Issue: Users report slow search performance and indexing. AutoOps insight: AutoOps detects multiple issues stemming from increased load due to indexing activities on warm nodes, resulting in a high indexing queue and slow searches on one of these nodes. AutoOps detects that indexing activities are occurring in warm nodes, there is a high indexing queue and slow searches were detected on one of those warm nodes. Resolution: The team updated their ILM policy to ensure that indices only move off the hot tier once no further indexing activities are expected. AutoOps detects that indexing occurred in the hot tier AutoOps detects that the Index queue is high and provides a list of recommendations for resolution AutoOps Slow search performance event - detects slow search performance on the loaded node Scenario #3: Investigating production down time Issue: An outage is reported, and CPU usage on the cluster is high momentarily AutoOps insight: AutoOps identifies the time window during which CPU utilization was high, and provides a drill-down into the point of time of the issue with a recommendation to check slow logs. Drilling down further into the node view reveals that the CPU is high every day, at about 7am. 
Resolution SRE finds a script scheduled to run daily at 7 am, by amending the script they are able to fix the issue and stabilize the cluster. AutoOps provides hyperlinks for quick drill-downs into detected issues Drill down screens provide extra context with metrics on nodes, indices and shards and templates optimizations Scenario #4: Customer Kibana dashboards are slow Issue Customers complain that Kibana dashboards are at times slower than usual AutoOps insight AutoOps detects large shards that could lead to slow search performance and recommends reindexing into smaller indices and reviewing the ILM policy. Resolution The team follows AutoOps’ recommendation to change the shards sizes, improving the dashboard’s responsiveness and cluster stability. AutoOps monitors shards sizes and alerts when and how to optimize shards AutoOps and Elastic: name a more iconic duo! By analyzing hundreds of Elasticsearch metrics, your configuration, and usage patterns, AutoOps recommends operational and monitoring insights that deliver real savings in administration time and hardware costs. Elasticsearch Performance optimizations: AutoOps tells you exactly how to keep your Elasticsearch clusters running smoothly. It offers tailored insights based on your specific usage and configuration, helping you maintain high performance. Real-time Issue Detection for Elasticsearch specific issues: AutoOps continuously analyzes hundreds of Elasticsearch metrics and provides pre-configured alerts to catch issues like ingestion bottlenecks, data structure misconfigurations, unbalanced loads, slow queries, and more - before they become bigger issues. Easy Troubleshooting: Troubleshooting can be complex, especially in larger environments. AutoOps performs root cause analysis and provides drill-downs to the exact point in time when an issue occured, and resolution paths including in-context Elasticsearch commands and best practices. Cost visibility and optimization for Elasticsearch deployments : AutoOps identifies underutilized nodes, small or large indices and shards, and suggests data tier optimizations. This can help with better resource utilization and potential hardware cost savings. Seamless Integration: AutoOps isn't just a standalone tool; it's built into Elastic Cloud and integrates with alerting and messaging frameworks (MS Teams and Slack), incident management systems (PagerDuty and OpsGenie) and other tools. You can customize AutoOps alerts and notifications for your use case. Query Optimization, Template Optimization, and lots more! Built into AutoOps is our expertise in running and managing lots of types of Elastic environments. AutoOps identifies and alerts you on expensive queries, data types that exist and if/when they should (or should not) be used, for example storing numbers as integers/longs so they are optimized for range queries. There are many other types of suggestions built in, that we hope you will find useful! When will AutoOps be available for me? We’re rolling out AutoOps in phases, starting with select Elastic Cloud Hosted regions and coverage is expanding rapidly. Next up, we’ll focus on our Elastic Cloud Serverless users. While Elastic Cloud Serverless is already making Elasticsearch easier to use, AutoOps will take it to the next level by offering advanced monitoring and optimization capabilities. And for our self-managed customers, we haven’t forgotten you. Plans are in the works to bring AutoOps your way, too! 
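Scenario #1 above hinges on spotting and cancelling a long-running search via the Task Management API. Outside of AutoOps, the same check can be scripted by hand; the sketch below assumes a local cluster and a 4-minute threshold, both arbitrary choices rather than anything prescribed by the post.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection details

# List currently running search tasks with per-task details.
tasks = es.tasks.list(actions="*search*", detailed=True)

for node_id, node in tasks["nodes"].items():
    for task_id, task in node["tasks"].items():
        running_seconds = task["running_time_in_nanos"] / 1e9
        if running_seconds > 240:  # flag searches running longer than ~4 minutes
            print(f"{task_id}: {running_seconds:.0f}s {task.get('description', '')[:120]}")
            # Cancel the offending search, as AutoOps suggests via a cURL command.
            es.tasks.cancel(task_id=task_id)
```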
Try AutoOps: the easy way to operate Elasticsearch Elasticsearch is powerful, but should also be as simple and efficient as possible for your use. With AutoOps, we’re delivering on that promise in a big way. Whether you’re striving for optimal performance or looking to cut costs, AutoOps offers insights and tools to help you. Got questions or eager to dive into AutoOps? Here are some ways to start, and happy optimizing! AutoOps home page - watch three minute video Try AutoOps using an Elastic Cloud trial account AutoOps product documentation Report an issue Related content AutoOps January 2, 2025 Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. VC By: Valentin Crettaz AutoOps December 18, 2024 Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. MD By: Musab Dogan AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne Vector Database Generative AI +2 May 21, 2024 Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs. YG HM By: Yuvraj Gupta and Hemant Malik Vector Database Generative AI +1 May 21, 2024 Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here. SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta Jump to AutoOps makes Elastic Cloud easy to operate Real scenarios: how AutoOps makes Elasticsearch easy to operate Scenario #1: Finding a query causing severe search latency Scenario #2: Ineffective use of data tiering, leading to slow search and indexing Scenario #3: Investigating production down time Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "AutoOps makes every Elasticsearch deployment simple(r) to manage - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/autoops-elasticsearch-easy-operations", + "meta_description": "Learn about AutoOps for Elasticsearch and how it simplifies cluster management with performance recommendations, resource utilization, and cost insights." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. Generative AI How To JW By: James Williams On March 26, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Azure AI Document Intelligence is a powerful tool for extracting structured data from PDFs. It can be used to effectively extract text and table data. Once the data is extracted, it can be indexed into Elastic Cloud Serverless to power RAG (Retrieval Augmented Generation). In this blog, we will demonstrate how powerful Azure AI Document Intelligence is by ingesting four recent Elastic N.V. quarterly reports. The PDFs range from 43 to 196 pages in length and each PDF contains both text and table data. We will test the retrieval of table data with the following prompt: Compare/contrast subscription revenue for Q2-2025, Q1-2025, Q4-2024 and Q3-2024? This prompt is tricky because it requires context from four different PDFs that represent this information in tabular format. Let’s walk through an end-to-end reference that consists of two main parts: Python notebook Downloads four quarters’ worth of PDF 10-Q filings for Elastic N.V. Uses Azure AI Document Intelligence to parse the text and table data from each PDF file Outputs the text and table data into a JSON file Ingests the JSON files into Elastic Cloud Serverless Elastic Cloud Serverless Creates vector embeddings for PDF text+table data Powers vector search database queries for RAG Pre-configured OpenAI connector for LLM integration A/B test interface for chatting with the 10-Q filings Prerequisites The code blocks in this notebook require API keys for Azure AI Document Intelligence and Elasticsearch. The best starting point for Azure AI Document intelligence is to create a Document Intelligence resource . For Elastic Cloud Serverless, refer to the get started guide. You will need Python 3.9+ to run these code blocks. Create an .env file Place secrets for Azure AI Document Intelligence and Elastic Cloud Serverless in a .env file. Install Python packages Create input and output folders Download PDF files Download four recent Elastic 10-Q quarterly reports. If you already have PDF files, feel free to place them in the ‘./pdf’ folder. Parse PDFs using Azure AI Document Intelligence A lot is going on in code blocks that parse the PDF files. Here’s a quick summary: Set Azure AI Document Intelligence imports and environment variables Parse PDF paragraphs using AnalyzeResult Parse PDF tables using AnalyzeResult Combine PDF paragraph and table data Bring it all together by doing 1-4 for each PDF file and store the result in JSON Set Azure AI Document Intelligence imports and environment variables The most important import is AnalyzeResult. This class represents the outcome of a document analysis and contains details about the document. The details we care about are pages, paragraphs and tables. Parse PDF paragraphs using AnalzeResult Extract the paragraph text from each page. Do not extract table data. Parse the PDF tables using AnalyzeResult Extract the table content from each page. Do not extract paragraph text. 
The most interesting side effect of this technique is that there is no need to transform table data. LLMs know how to read text that looks like: “Cell [0, 1]: table data…” . Combine PDF paragraph and table data Pre-process chunking at the page level preserves context so that we can easily validate RAG retrieval manually. Later, you will see that this pre-process chunking will not have a negative effect on the RAG output. Bring it all together Open each PDF in the ./pdf folder, parse the text and table data and save the result in a JSON file that has entries for page_number , content_text and pdf_file . The content_text field represents the page paragraphs and table data for each page. Load data into Elastic Cloud Serverless The code blocks below handle: Set imports for the Elasticsearch client and environment variables Create index in Elastic Cloud Serverless Load the JSON files from ./json directory into the pdf-chat index Set imports for the Elasticsearch client and environment variables The most important import is Elasticsearch. This class is responsible for connecting to Elastic Cloud Serverless to create and populate the pdf-chat index. Create index in Elastic Cloud Serverless This code block creates an index named “pdf_chat” that has the following mappings: page_content - For testing RAG using full-text search page_content_sparse - For testing RAG using sparse vectors page_content_dense - For testing RAG using dense vectors page_number - Useful for constructing citations pdf_file - Useful for constructing citations Notice the use of copy_to and semantic_text . The copy_to utility copies body_content to two semantic text fields. Each semantic text field maps to an ML inference endpoint, one for the sparse vector and one for the dense vector. Elastic-powered ML inference will auto-chunk each page into 250 token chunks with a 100 token overlap. Load the JSON files from ./json directory into the pdf-chat index This process will take several minutes to run because we are: Loading 402 pages of PDF data Creating sparse text embeddings for each page_content chunk Creating dense text embeddings for each page_content chunk There is one last code trick to call out. We are going to set the elastic document ID by using the following naming convention: FILENAME_PAGENUMBER . This will make it easy/breezy to see the PDF file and page number associated with citations in Playground. Elastic Cloud Serverless Elastic Cloud Serverless is an excellent choice for prototyping a new Retrieval-Augmented Generation (RAG) system because it offers fully managed, scalable infrastructure without the complexity of manual cluster management. It supports both sparse and dense vector search out of the box, allowing you to experiment with different retrieval strategies efficiently. With built-in semantic text embedding, relevance ranking, and hybrid search capabilities, Elastic Cloud Serverless accelerates iteration cycles for search powered applications. With the help of Azure AI Document Intelligence and a little Python code, we are ready to see if we can get the LLM to answer questions grounded in truth. Let’s open Playground and conduct some manual A/B testing using different query strategies. Full text search This query will return the top ten pages of content that get a full-text search hit. Full-text search came close but it was only able to provide the right answer for three out of four quarters. This is understandable because we are stuffing the LLM context with ten full pages of data. 
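For reference, the pdf-chat index described in the "Create index in Elastic Cloud Serverless" step above, with copy_to feeding one sparse and one dense semantic_text field, might be created roughly as follows. The connection details and inference endpoint IDs are assumptions; use whatever endpoints exist in your project.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://my-serverless-project.es.example.com:443", api_key="...")  # placeholders

es.indices.create(
    index="pdf-chat",
    mappings={
        "properties": {
            # Full-text field; copy_to duplicates its content into both semantic_text fields.
            "page_content": {
                "type": "text",
                "copy_to": ["page_content_sparse", "page_content_dense"],
            },
            # Sparse embeddings (ELSER-style); the inference_id values are assumptions.
            "page_content_sparse": {"type": "semantic_text", "inference_id": ".elser-2-elasticsearch"},
            # Dense embeddings (E5-style); adjust to an endpoint deployed in your project.
            "page_content_dense": {"type": "semantic_text", "inference_id": ".multilingual-e5-small-elasticsearch"},
            "page_number": {"type": "integer"},
            "pdf_file": {"type": "keyword"},
        }
    },
)

# Document ids follow the FILENAME_PAGENUMBER convention so citations are easy to trace.
es.index(
    index="pdf-chat",
    id="elastic-10q-q2-2025_12",  # hypothetical file name and page
    document={
        "page_content": "page text and table data ...",
        "page_number": 12,
        "pdf_file": "elastic-10q-q2-2025.pdf",
    },
)
```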
And, we are not leveraging semantic search. Sparse vector search This query will return the top two semantic text fragments from pages that match our query using powerful sparse vector search. Sparse vector search powered by Elastic’s ELSER does a really good job retrieving table data from all four PDF files. We can easily double check the answers by opening the PDF page number associated with each citation. Dense vector search Elastic also provides an excellent dense vector option for semantic text ( E5 ). E5 is good for multi-lingual data and it has lower inference latency for high query per second use cases. This query will return the top two semantic text fragments that match our user input. The results are the same as with sparse search but notice how similar both queries are. The only difference is the “field” name. Hybrid search ELSER is so good for this use case that we do not need hybrid search. But, if we wanted to, we could combine dense and vector search into a single query. Then, rerank the results using RRF(Reciprocal Rank Fusion) . So what did we learn? Azure AI Document Intelligence Is very capable of parsing both text and table data in PDF files. Integrates well with the elasticsearch python client . Elastic Serverless Cloud Has built-in ML inference for sparse and dense vector embeddings at ingest and query time. Has powerful RAG A/B test tooling that can be used to identify the best retrieval technique for a specific use case. There are other techniques and technologies that can be used to parse PDF files. If your organization is all-in on Azure, this approach can deliver an excellent RAG system. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Create an .env file Install Python packages Create input and output folders Download PDF files Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Parse PDF text and table data with Azure AI Document Intelligence - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/azure-ai%E2%80%93document-intelligence-parse-pdf-text-tables", + "meta_description": "Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to automate synonyms and upload using our Synonyms API Discover how LLMs can be used to identify and generate synonyms automatically, allowing terms to be programmatically loaded into the Elasticsearch synonym API. Search Relevance How To AL By: Andre Luiz On March 27, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Improving the quality of search results is essential for providing an efficient user experience. One way to optimize searches is by automatically expanding the queried terms through synonyms. This allows queries to be interpreted more broadly, covering language variations and thus improving result matching. This blog explores how large language models (LLMs) can be used to identify and generate synonyms automatically, allowing these terms to be programmatically loaded into Elasticsearch's synonym API. When to use synonyms? The use of synonyms can be a faster and more cost-effective solution compared to vector search. Its implementation is simpler as it does not require deep knowledge of embeddings or a complex vector ingestion process. Additionally, resource consumption is lower since vector search demands greater storage capacity and memory for embedding indexing and retrieval. Another important aspect is search regionalization. With synonyms, it is possible to adapt terms according to local language and customs. This is useful in situations where embeddings may fail to match regional expressions or country-specific terms. For example, some words or acronyms may have different meanings depending on the region, but are naturally treated as synonyms by local users. In Brazil, this is quite common. \"Abacaxi\" and \"ananás\" are the same fruit (pineapple), but the second term is more commonly used in some regions of the Northeast. Similarly, the well-known \"pão francês\" in the Southeast may be known as \"pão careca\" in the Northeast. How to use LLMs to generate synonyms? To obtain synonyms automatically, we can use LLMs, which analyze the context of a term and suggest appropriate variations. This approach allows for dynamically expanding synonyms, ensuring a broader and more accurate search without relying on a fixed dictionary. In this demonstration, we will use an LLM to generate synonyms for e-commerce products. Many searches return few or no results due to variations in the queried terms. With synonyms, we can solve this issue. 
For example, a search for \"smartphone\" can encompass different models of mobile phones, ensuring users find the products they are looking for. Prerequisites Before getting started, we need to set up the environment and define the required dependencies. We will use the solution provided by Elastic to run Elasticsearch and Kibana locally in Docker . The code will be written in Python, v3.9.6, with the following dependencies: Creating the product index Initially, we will create an index of products without synonym support. This will allow us to validate queries and then compare them to an index that includes synonyms. To create the index, we bulk load a product dataset using the following command in Kibana DevTools: Generating synonyms with LLM In this step, we will use an LLM to dynamically generate synonyms. To achieve this, we will integrate the OpenAI API, defining an appropriate model and prompt. The LLM will receive the product category and name, ensuring that the synonyms are contextually relevant. From the created product index, we will retrieve all items in the \"Electronics\" category and send their names to the LLM. The expected output will be something like: With the generated synonyms, we can register them in Elasticsearch using the Synonyms API. Managing synonyms with the Synonyms API The Synonyms API provides an efficient way to manage synonym sets directly within the system. Each synonym set consists of synonym rules, where a group of words is treated as equivalent in searches. Example of creating a synonym set This creates a set called \"my-synonyms-set,\" where \"hello\" and \"hi\" are treated as equivalents, as well as \"bye\" and \"goodbye.\" Implementing synonym creation for the product catalog Below is the method responsible for building a synonym set and inserting it into Elasticsearch. The synonym rules are generated based on the mapping of synonyms suggested by the LLM. Each rule has an ID, corresponding to the product name in slug format, and the list of synonyms calculated by the LLM. Below is the request payload to create the synonym set: With the synonym set created in the cluster, we can move on to the next step, which is creating a new index with synonym support using the defined set. The complete Python code with the synonyms generated by LLM and the synonym set creation defined by the Synonyms API is below: Creating an index with synonym support A new index will be created where all data from the products index will be reindexed. This index will use the synonyms_filter , which applies the products-synonyms-set created earlier. Below is the index mapping configured to use synonyms: Reindexing the products index Now, we will use the Reindex API to migrate the data from the products index to the new products_02 index, which includes synonym support. The following code was executed in Kibana DevTools: After the migration, the products_02 index will be populated and ready to validate searches using the configured synonym set. Validating search with synonyms Let's compare the search results between the two indexes. We will execute the same query on both indexes and validate whether the synonyms are being used to retrieve results. Search in the products index (without synonyms) We will use Kibana to perform searches and analyze the results. In the Analytics > Discovery menu, we will create a Data View to visualize the data from the indexes we created. Within Discovery, click on Data View and define a name and an index pattern. 
For the \" products \" index, we will use the \" products ” pattern. Then, we will repeat the process to create a new Data View for the \" products_02 \" index, using the \" products_02” pattern. With the Data Views configured, we can return to Analytics > Discovery and start the validations. Here, after selecting DataView products and performing a search for the term \"tablet\", we get no results, even though we know that there are products like \"Kindle Paperwhite\" and \"Apple iPad Air\". Search in the products_02 index (support synonyms) When performing the same query on the \" products_synonyms \" Data View, which supports synonyms, the products were retrieved successfully. This demonstrates that the configured synonym set is working correctly, ensuring that different variations of the searched terms return the expected results. We can achieve the same result by running the same query directly in Kibana DevTools. Simply search the products_02 index using the Elasticsearch Search API: Conclusion Implementing synonyms in Elasticsearch improved the accuracy and coverage of product catalog searches. The key differentiator was the use of an LLM , which generated synonyms automatically and contextually, eliminating the need for predefined lists. The model analyzed product names and categories, ensuring relevant synonyms for e-commerce. Additionally, the Synonyms API simplified dictionary management, allowing synonym sets to be modified dynamically. With this approach, search became more flexible and adaptable to different user query patterns. This process can be continually improved with new data and model adjustments, ensuring an increasingly efficient research experience. References Run Elasticsearch locally https://www.elastic.co/guide/en/elasticsearch/reference/current/run-elasticsearch-locally.html Synonyms API https://www.elastic.co/guide/en/elasticsearch/reference/current/synonyms-apis.html Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to When to use synonyms? How to use LLMs to generate synonyms? Prerequisites Creating the product index Generating synonyms with LLM Show more Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to automate synonyms and upload using our Synonyms API - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-synonyms-automate", + "meta_description": "Discover how LLMs can be used to identify and generate synonyms automatically, allowing terms to be programmatically loaded into the Elasticsearch synonym API." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Ingest autoscaling in Elasticsearch Learn more about how Elasticsearch autoscales to address ingestion load. Elastic Cloud Serverless PS HA FC By: Pooya Salehi , Henning Andersen and Francisco Fernández Castaño On July 29, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Sizing an Elasticsearch cluster correctly is not easy. The optimal size of the cluster depends on the workload that the cluster is experiencing, which may change over time. Autoscaling adapts the cluster size to the workload automatically without human intervention. It avoids over-provisioning resources for the cluster only to accommodate peak usage and it also prevents degrading cluster performance in case of under-provisioning. We rely on this mechanism to free users of our Elastic Cloud Serverless offering from having to make sizing decisions for the indexing tier . Ingest autoscaling requires continuously estimating the resources required to handle the incoming workload, and provisioning and de-provisioning these resources in a timely manner. In this blog post we explore ingest autoscaling in Elasticsearch, covering the following: How ingest autoscaling works in Elasticsearch Which metrics we use to quantify the indexing workload the cluster experiences in order to estimate resources required to handle that workload How these metrics drive the autoscaling decisions. Ingest autoscaling overview Ingest autoscaling in Elasticsearch is driven by a set of metrics that is exposed by Elasticsearch itself. These metrics reflect the ingestion load and the memory requirement of the indexing tier. Elasticsearch provides an autoscaling metrics API that serves these metrics which allows an external component to monitor these metrics and make decisions whether the cluster size needs to change (see Figure 1). In the Elastic Cloud Serverless service, there is an autoscaler component which is a Kubernetes Controller. The autoscaler polls the Elasticsearch autoscaling metrics API periodically and calculates the desired cluster size based on these metrics. If the desired cluster size is different from the current one, the autoscaler changes the cluster size to consolidate the available resources in the cluster towards the desired resources. 
This change is both in terms of the number of Elasticsearch nodes in the cluster and the CPU, memory and disk available to each node. Figure 1 : ingestion autoscaling overview An important consideration for ingest autoscaling is that when the cluster receives a spike in the indexing load the autoscaling process can take some time until it effectively adapts the cluster size. While we try to keep this reaction time as low as possible, it cannot be instantaneous. Therefore, while the cluster is scaling up, the Elasticsearch cluster should be able to temporarily push back on the load it receives if the increased load is otherwise going to cause cluster instability issues. The increase in the indexing load can manifest itself in the cluster requiring more resources, i.e., CPU, memory or disk. Elasticsearch has protection mechanisms that allows nodes to push back on the indexing load if any of these resources becomes a bottleneck. To handle indexing requests Elasticsearch uses dedicated thread pools sized based on the number of cores available to the node. If the increased indexing load results in CPU or other resources becoming a bottleneck, incoming indexing requests are queued. The maximum size of this queue is limited and any request arriving at the node when the queue is full will be rejected with a 429 HTTP code. Elasticsearch also keeps track of the required memory to address ongoing indexing requests and rejects incoming requests (with a 429) if the indexing buffer grows beyond 10% of the available heap memory . This limits the memory used for indexing and ensures the node will not go out of memory. The Elastic Cloud Serverless offering relies on the object store as the main storage for indexed data. The local disk on the nodes are used temporarily to hold indexed data. Periodically, Elasticsearch uploads the indexed data to the object store which allows freeing up the local disk space as we rely on the object store for durability of the indexed document. Nonetheless, under high indexing load, it is possible for the node to run out of disk space before the periodic upload task gets a chance to run and free up the local disk space. To handle these cases, Elasticsearch monitors the available local disk space and if necessary throttles the indexing activity while it attempts to free up space by enforcing an upload to the object store rather than waiting for the periodic upload to take place. Note that this throttling in turn results in queueing of the incoming indexing requests. These protection mechanisms allow an Elasticsearch cluster to temporarily reject requests and provide the client with a response that indicates that the cluster is overloaded while the cluster tries to scale up. This push-back signal from Elasticsearch provides the client with a chance to react by reducing the load if possible or retrying the request which should eventually succeed if retried when the cluster is scaled up. Metrics The two metrics that are used for ingest autoscaling in Elasticsearch are ingestion load and memory. Ingestion load Ingestion load represents the number of threads that is needed to cope with the current indexing load. The autoscaling metrics API exposes a list of ingestion load values, one for each indexing node. Note that as the write thread pools (which handle indexing requests) are sized based on the number of CPU cores on the node, this essentially determines the total number of cores that is needed in the cluster to handle the indexing workload. 
The ingestion load on each indexing node consists of two components: Thread pool utilization: the average number of threads in the write thread pool processing indexing requests during that sampling period. Queued ingestion load: the estimated number of threads needed to handle queued write requests. The ingestion load of each indexing node is calculated as the sum of these two values for all three write thread pools . The total ingestion load of the Elasticsearch cluster is the sum of the ingestion load of the individual nodes. \\small node\\_ingestion\\_load = \\sum(thread\\_pool\\_utilization + queued\\_ingestion\\_load) \\newline total\\_ingestion\\_load = \\sum(node\\_ingestion\\_load) Figure 2: ingestion load components The thread pool utilization is an exponentially weighted moving average (EWMA) of the number of busy threads in the thread pool, sampled every second. The EWMA of the sampled thread pool utilization values is configured such that the sampled values of the past 10 seconds have the most effect on the thread pool utilization component of the ingestion load and samples older than 60 seconds have very negligible impact. To estimate the resources required to handle the queued indexing requests in the thread pool, we need to have an estimate for how long each queued task can take to execute. To achieve this, each thread pool also provides an EWMA of the request execution time. The request execution time for an indexing request is the (wall-clock) time taken for the request to finish once it is out of the queue and a worker thread starts executing it. As some queueing is acceptable and should be manageable by the thread pool, we try to estimate the resources needed to handle the excess queueing. We consider up to 30s worth of tasks in the queue manageable by the existing number of workers and account for an extra thread proportional to this value. For example, if the average task execution time is 200ms, we estimate that each thread is able to handle 150 indexing requests within 30s, and therefore account for one extra thread for each 150 queued items. \\small queued\\_ingestion\\_load = \\frac{queue\\_size \\times average\\_request\\_execution\\_time}{30s} Note that since the indexing nodes rely on pushing indexed data into the object store periodically, we do not need to scale the indexing tier based on the total size of the indexed data. However, the disk IO requirements of the indexing workload need to be considered for the autoscaling decisions. The ingestion load represents both CPU requirements of the indexing nodes as well as disk IO since both CPU and IO work is done by the write thread pool workers and we rely on the wall clock time to estimate the required time to handle the queued requests.
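As a small worked example of the formulas above (values are invented, apart from the 200 ms / 150-queued-requests case mentioned in the text):

```python
# Illustrative computation of the ingestion load metric described above; the numbers
# are made up apart from the 200 ms / 150-queued-requests example from the text.

MANAGEABLE_QUEUE_SECONDS = 30.0

def node_ingestion_load(thread_pool_utilization: float,
                        queue_size: int,
                        avg_request_execution_seconds: float) -> float:
    """Ingestion load contributed by one write thread pool on one node."""
    queued_load = (queue_size * avg_request_execution_seconds) / MANAGEABLE_QUEUE_SECONDS
    return thread_pool_utilization + queued_load

# One extra thread is accounted for per 150 queued requests when each takes ~200 ms:
assert round(node_ingestion_load(0.0, 150, 0.2), 3) == 1.0

# A node averaging 6.5 busy write threads with 300 queued requests at ~200 ms each
# contributes 8.5 to the tier-wide total:
print(node_ingestion_load(6.5, 300, 0.2))  # 8.5

# The total ingestion load is simply the sum over all write thread pools and nodes:
total_ingestion_load = sum([8.5, 5.2, 7.9])
```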
Each indexing node calculates its ingestion load and publishes this value to the master node periodically. The master node serves the per node ingestion load values via the autoscaling metrics API to the autoscaler. Memory The memory metrics exposed by the autoscaling metrics API are node memory and tier memory. The node memory represents the minimum memory requirement for each indexing node in the cluster. The tier memory metric represents the minimum total memory that should be available in the indexing tier. Note that these values only indicate the minimum to ensure that each node is able to handle the basic indexing workload and hold the cluster and indices metadata, while ensuring that the tier includes enough nodes to accommodate all index shards. Node memory must have a minimum of 500MB to be able to handle indexing workloads , as well as a fixed amount of memory per each index . This ensures all nodes can hold metadata for the cluster, which includes metadata for every index. Tier memory is determined by accounting for the memory overhead of the field mappings of the indices and the amount of memory needed for each open shard allocated on a node in the cluster. Currently, the per-shard memory requirement uses a fixed estimate of 6MB. We plan to refine this value. The estimate for the memory requirements for the mappings of each index is calculated by one of the data nodes that hosts a shard of the index. The calculated estimates are sent to the master node. Whenever there is a mapping change this estimate is updated and published to the master node again. The master node serves the node and total memory metrics based on these information via the autoscaling metrics API to the autoscaler. Scaling the cluster The autoscaler is responsible for monitoring the Elasticsearch cluster via the exposed metrics, calculating the desirable cluster size to adapt to the indexing workload, and updating the deployment accordingly. This is done by calculating the total required CPU and memory resources based on the ingestion load and memory metrics. The sum of all the ingestion load per node values determines the total number of CPU cores needed for the indexing tier. The calculated CPU requirement and the provided minimum node and tier memory resources are mapped to a predetermined set of cluster sizes. Each cluster size determines the number of nodes and the CPU, memory and disk size of each node. All nodes within a certain cluster size have the same hardware specification. There is a fixed ratio between CPU, memory and disk, thus always scaling all 3 resources linearly. The existing cluster sizes for the indexing tier are based on node sizes starting from 4GB/2vCPU/100GB disk to 64GB/32vCPU/1600GB disk. Once the Elasticsearch cluster scales up to the largest node size (64GB memory), any further scale-up adds new 64GB nodes, allowing a cluster to scale up to 32 nodes of 64GB. Note that this is not a hard upper bound on the number of Elasticsearch nodes in the cluster and can be increased if necessary. Every 5 seconds the autoscaler polls metrics from the master node, calculates the desirable cluster size and if it is different from the current cluster size, it updates the Elasticsearch Kubernetes Deployment accordingly. Note that the actual reconciliation of the deployment towards the desired cluster size and adding and removing the Elasticsearch nodes to achieve this is done by Kubernetes. 
In order to avoid very short-lived changes to the cluster size, we account for a 10% headroom when calculating the desired cluster size during a scale down and a scale down takes effect only if all desired cluster size calculations within the past 15 minute have indicated a scale-down. Currently, the time that it takes for an increase in the metrics to lead to the first Elasticsearch node being added to the cluster and ready to process indexing load is under 1 minute. Conclusion In this blog post, we explained how ingest autoscaling works in Elasticsearch, the different components involved, and the metrics used to quantify the resources needed to handle the indexing workload. We believe that such an autoscaling mechanism is crucial to reduce the operational overhead of an Elasticsearch cluster for the users by automatically increasing the available resources in the cluster when necessary. Furthermore, it leads to cost reduction by scaling down the cluster when the available resources in the cluster are not required anymore. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Ingest autoscaling overview Metrics Ingestion load Memory Scaling the cluster Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
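The scale-down damping described above can be illustrated with a small sketch: a scale-down is only allowed when every desired-size calculation in the trailing 15-minute window agreed that less capacity (after the ~10% headroom applied by the caller) is sufficient. The class and its names are hypothetical, not Elastic's implementation.

```python
import time
from collections import deque


class ScaleDownGate:
    """Permit a scale-down only if every observation in the past window wanted one."""

    def __init__(self, window_seconds=15 * 60):
        self.window = window_seconds
        self.history = deque()  # (timestamp, wanted_scale_down) pairs

    def observe(self, wants_scale_down, now=None):
        now = time.monotonic() if now is None else now
        self.history.append((now, wants_scale_down))
        # Drop samples that have fallen out of the 15-minute window.
        while self.history and now - self.history[0][0] > self.window:
            self.history.popleft()
        # Scale down only if every remaining sample agreed.
        return all(flag for _, flag in self.history)
```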
All Rights Reserved.", + "title": "Ingest autoscaling in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-ingest-autoscaling", + "meta_description": "Learn more about how Elasticsearch autoscales to address ingestion load." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. AutoOps MD By: Musab Dogan On December 18, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this article, we’ll dive into a real-world scenario where AutoOps was instrumental in diagnosing and addressing high CPU usage in a production Elasticsearch cluster. The incident began with a customer support ticket and ended with actionable insights to ensure smoother operations in the future. Introduction: Diagnosing High CPU usage in Elasticsearch Efficiently managing Elasticsearch clusters is crucial for maintaining application performance and reliability. When a customer experiences sudden performance bottlenecks, the ability to quickly diagnose the issue and provide actionable recommendations becomes a key differentiator. This review explores how AutoOps, a powerful monitoring and management tool, helped us identify and analyze a high CPU utilization issue affecting an Elasticsearch cluster. The article provides a step-by-step account of how AutoOps identified the root cause, along with the benefits this tool offers in streamlining the investigation process. The high CPU situation On July 14, 2024, a production cluster named “Palomino” experienced an outage. The customer reported the issue the next day, citing high CPU usage as a potential root cause. Despite the issue being non-urgent (as the outage was resolved), understanding the underlying cause remained critical for preventing recurrence. The initial request was as follows: The investigation began with one keyword in mind: high CPU usage . Using AutoOps for diagnosing high CPU usage Step 1: Analyzing AutoOps events AutoOps immediately flagged multiple “High CPU Utilization” events. Clicking on an event provided comprehensive details, including: When the event started and ended. The node experiencing the most pressure. Initial recommendations, such as enabling search slow logs. While the suggestion to enable slow logs was noted, we continued exploring for a deeper root cause. If you want to activate search slowlogs, you can use this link . Step 2: Node view analysis Focusing on the node with the highest CPU pressure (instance-0000000008), AutoOps filtered the graphs to highlight metrics specific to that node during the event window. This view confirmed significant CPU usage spikes. Step 3: Broader investigation By zooming out to analyze a larger time range, we observed that the CPU increase coincided with a rise in both search and indexing requests. Expanding the view further revealed that the issue was not limited to one node but affected all nodes in the cluster. Step 4: Identifying patterns The investigation revealed a critical pattern: a regular spike around 7:00 AM each day , triggered by simultaneous search and indexing requests. This repetitive behavior was the root cause of the high CPU utilization. 
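As a concrete follow-up to the slow log recommendation mentioned in the AutoOps events above, search slow logs are enabled per index via index settings. A minimal sketch, assuming a local unsecured cluster and a hypothetical index name:

```python
import requests

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication
INDEX = "my-index"                # hypothetical index name

settings = {
    # Log query phases slower than these thresholds at the corresponding levels.
    "index.search.slowlog.threshold.query.warn": "10s",
    "index.search.slowlog.threshold.query.info": "2s",
    "index.search.slowlog.threshold.fetch.warn": "1s",
}
resp = requests.put(f"{ES_URL}/{INDEX}/_settings", json=settings)
resp.raise_for_status()
print(resp.json())
```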
Step 5: Actionable insights AutoOps provided three critical questions to ask the customer: What is happening every day at 7:40 AM (GMT+3)? Can these requests be distributed more evenly over time to decrease pressure? Have you monitored the CPU graph (AutoOps > Node View > Host and Process > CPU) at 7:00 AM after implementing changes? Finding the root cause of problems generally takes 90% of the time, while fixing the problem takes 10%. Thanks to AutoOps, we were able to handle this 90% more easily and much faster. Hint: To find the problematic query AutoOps plays a crucial role. It will help you find where the problematic query/indexing runs, eg, on which node, shard, and index. Also, thanks to the long_running_search_task event, without any manual effort, AutoOps can identify the problematic query and create an event with a recommended approach to fine-tune the query. Benefits of using AutoOps Rapid Identification: AutoOps’ event-based monitoring pinpointed the affected node and time range within minutes. Clear Recommendations: Suggestions like enabling slow logs will help focus on troubleshooting efforts. Pattern Recognition: By correlating metrics across all nodes and timeframes, AutoOps uncovered the recurring nature of the issue. User-Friendly Views: Filtering and zooming capabilities made it easy to visualize trends and anomalies. Conclusion Thanks to AutoOps, we transformed a vague report of high CPU usage into a clear, actionable plan. By identifying a recurring pattern of activity, we provided the customer with the tools and insights to prevent similar issues in the future. If your team manages production systems, incorporating tools like AutoOps into your workflow can significantly enhance visibility and reduce the time to resolve critical issues. Report an issue Related content AutoOps January 2, 2025 Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. VC By: Valentin Crettaz AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Jump to Introduction: Diagnosing High CPU usage in Elasticsearch The high CPU situation Using AutoOps for diagnosing high CPU usage Step 1: Analyzing AutoOps events Step 2: Node view analysis Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Resolving high CPU usage issues in Elasticsearch with AutoOps - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-cpu-usage-high", + "meta_description": "Learn how to diagnose and fix the Elasticsearch high CPU usage issue. We'll use AutoOps to pinpoint & resolve the issue and gain insights for prevention." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Embeddings and reranking with Alibaba Cloud AI Service Using Alibaba Cloud AI Service features with Elastic. Generative AI Integrations How To TM By: Tomás Murúa On February 26, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, we'll cover how to integrate Alibaba Cloud AI features with Elasticsearch to improve relevance in semantic searches. Alibaba Cloud AI Search is a solution that integrates advanced AI features with Elasticsearch tools, by leveraging the Qwen LLM family to contribute with advanced models for inference and classification. In this article, we'll use descriptions of novels and plays written by the same author to test the Alibaba reranking and sparse embedding endpoints. Steps Configure Alibaba Cloud AI Create Elasticsearch mappings Index data into Elasticsearch Query data Bonus: Answering questions with completion Configure Alibaba Cloud AI Alibaba Cloud AI reranking and embeddings Open inference Alibaba Cloud offers different services. In this example, we'll use the descriptions of popular books and plays by Agatha Christie to test Alibaba Cloud embeddings and reranking endpoints in semantic search. The Alibaba Cloud AI reranking endpoint is a semantic reranking functionality. This type of reranking uses a machine learning model to reorder search results based on their semantic similarity to a query. This allows you to use out-of-the-box semantic search capabilities on existing full-text search indices. The sparse embedding endpoint is a type of embedding where most values are zero, making relevant information more prominent. Get Alibaba Cloud API Key We need a valid API Key to integrate Alibaba with Elasticsearch. To get it, follow these steps: Access the Alibaba Cloud portal from the Service Plaza section. Go to the left menu API Keys as shown below. Generate a new API Key. Configure Alibaba Endpoints We´ll first configure the sparse embedding endpoint to transform the text descriptions into semantic vectors: Embeddings endpoint: We´ll then configure the rerank endpoint to reorganize results. Rerank Endpoint: Now that the endpoints are configured, we can prepare the Elasticsearch index. Create Elasticsearch mappings Let's configure the mappings . For this, we need to organize both the texts with the descriptions as well as the model-generated vectors. We'll use the following properties: semantic_description : to store the embeddings generated by the model and run semantic searches. description : we'll use a \" text \" type to store the novels and plays’ descriptions and use them for full-text search. 
We'll include the copy_to parameter so that both the text and the semantic field are available for hybrid search: With the mappings ready, we can now index the data. Index data into Elasticsearch Here's the dataset with the descriptions that we'll use for this example. We'll index it using the Elasticsearch Bulk API . Note that the first two documents, “Black Coffee” and “The Mousetraps” are plays while the others are novels. Query data To see the different results we can get, we'll run different types of queries, starting with semantic query, then applying reranking, and finally using both. We'll use the same question \"Which novel was written by Agatha Christie?\" expecting to get the three documents that explicitly say novel, plus the one that says book. The two plays should be the last results. Semantic search We'll begin querying the semantic_text field to ask: \"Which novel was written by Agatha Christie?\" Let's see what happens: Response: In this case, the response prioritized most of the novels, but the document that says “book” appears last. We can still further refine the results with reranking. Refining results with Reranking In this case, we'll use a _inference/rerank request to assess the documents we got in the first query and improve their rank in the results. Response: The response here shows that both plays are now at the bottom of the results. Semantic search and reranking endpoint combined Using a retriever , we'll combine the semantic query and reranking in just one step: Response: The results here differ from the semantic query. We can see that the document with no exact match for \"novel\" but that says “book” ( The Murder of Roger Ackroyd) appears higher than in the first semantic search. Both plays are still the last results, just like with reranking. Bonus: Answering questions with completion With embeddings and reranking we can satisfy a search query, but still, the user will see all the search results and not the actual answer. With the examples provided, we are one step away from a RAG implementation, where we can provide the top results + the question to an LLM to get the right answer. Fortunately, Alibaba Cloud AI Service also provides an endpoint service we can use to achieve this purpose. Let’s create the endpoint Completion Endpoint: And now, send the results and question from the previous query: Query Response Conclusion Integrating Alibaba Cloud AI Search with Elasticsearch allows us to easily access completion, embedding, and reranking models to incorporate them into our search pipeline. We can use the reranking and embedding endpoints, either separately or together, with the help of a retriever. We can also introduce the completion endpoint to finish up a RAG end-to-end implementation. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
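For reference, the mapping described in the Alibaba Cloud example (a text field copied into a semantic_text field) might look like the sketch below. The index name, inference endpoint id, and credentials are assumptions.

```python
import requests

ES_URL = "https://localhost:9200"   # assumption
AUTH = ("elastic", "<password>")    # assumption

mapping = {
    "mappings": {
        "properties": {
            "description": {
                "type": "text",
                "copy_to": "semantic_description",  # make the text available for semantic search too
            },
            "semantic_description": {
                "type": "semantic_text",
                "inference_id": "alibabacloud-embeddings",  # hypothetical sparse embedding endpoint id
            },
        }
    }
}
resp = requests.put(f"{ES_URL}/books", json=mapping, auth=AUTH, verify=False)
print(resp.json())
```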
This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Configure Alibaba Cloud AI Alibaba Cloud AI reranking and embeddings Get Alibaba Cloud API Key Configure Alibaba Endpoints Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Embeddings and reranking with Alibaba Cloud AI Service - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/alibaba-cloud-ai-embeddings-reranking", + "meta_description": "Learn how to use Alibaba Cloud AI features with Elastic, including embeddings and reranking." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. Integrations AG By: Ajay Krishnan Gopalan On May 8, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, we will discuss how to implement a scalable data processing pipeline using NVIDIA NeMo Retriever extraction models, Unstructured Platform and Elasticsearch. This pipeline transforms unstructured data from a data source into structured, searchable content ready for downstream AI applications, such as RAG. Retrieval Augmented Generation (RAG) is an AI technique where Large Language Models (LLMs) are provided with external knowledge to generate responses to user queries. This allows LLM responses to be tailored to specific context, making answers more accurate and relevant. Before we get started, let’s take a look at the key components enabling this pipeline and what each brings to the table. Pipeline components NeMo Retriever extraction is a set of microservices for transforming unstructured documents into structured content and metadata. It handles document parsing, visual structure identification, and OCR processing at scale. 
The RAG NVIDIA AI Blueprint provides a starting point for how to use the NeMo Retriever microservices in a high-performance extraction pipeline. Unstructured is an ETL+ platform for orchestrating the entirety of unstructured data processing: from ingesting unstructured data from multiple data sources, converting raw, unstructured files into structured data through a configurable workflow engine, enriching data with additional transformations, all the way to uploading the results into vector stores, databases and search engines. It provides a visual UI, APIs, and scalable backend infrastructure to orchestrate document parsing, enrichment, and embedding in a single workflow. Elasticsearch is an industry-leading search and analytics engine that now includes native vector search capabilities. It can function as both a traditional text database and a vector database, enabling semantic search at scale with features like k-NN similarity search. Now that we’ve introduced the core components, let’s take a look at how they work together in a typical workflow before diving into the implementation. RAG with NeMo Retriever - Unstructured - Elasticsearch While here we only provide key highlights, you can find the full notebook here . This blog can be divided into 3 parts: Setting up the source and destination connectors Setting up the workflow with Unstructured API RAG over the processed data Unstructured workflow is represented as a DAG where the nodes, called connectors, control where the data is ingested from and where the processed results are uploaded to. These nodes are required in any workflow. A source connector configures ingestion of the raw data from a data source, and the destination connector configures the data uploading of the processed data into a vector store, search engine, or a database. For this blog, we store research papers in Amazon S3 and we want the processed data to be delivered into Elasticsearch for downstream use. This means that before we can build a data processing workflow, we need to create a source connector for Amazon S3, and a destination connector for Elasticsearch with Unstructured API. Step 1: Setting up the S3 source connector When creating a source connector, you need to give it a unique name, specify its type (e.g. S3 , or Google Drive ), and provide the configuration which typically contains the location of the source you're connecting to (e.g. S3 bucket URI, or Google Drive folder) and authentication details. Step 2: Setting up the Elasticsearch destination connector Next, let’s set up the Elasticsearch destination connector. The Elasticsearch index that you use must have a schema that is compatible with the schema of the documents that Unstructured produces for you—you can find all the details in the documentation . Step 3: Creating a workflow with Unstructured Once you have the source and destination connectors, you can create a new data processing workflow. We’ll build the workflow DAG with the following nodes: NeMo Retriever for document partitioning Unstructured’s Image Summarizer, Table Summarizer, and Named Entity Recognition nodes for content enrichment Chunker and Embedder nodes for making the content ready for similarity search Once your job for this workflow completes, the data is uploaded into Elasticsearch and we can proceed with building a basic RAG application. 
Step 4: RAG setup Let's go ahead with a simple retriever that will connect to the data, take in the user query, embed it with the same model that was used to embed the original data, and calculate cosine similarity to retrieve the top 3 documents. Then let's set up a workflow to receive a user query, fetch similar documents from Elasticsearch, and use the documents as context to answer the user’s question. Putting everything together we get: And a response: Elasticsearch provides various strategies to enhance search, including Hybrid search , a combination of approximate semantic search and keyword-based search. This approach can improve the relevance of the top documents used as context in the RAG architecture. To enable it, you need to modify the vector_store initialization as follows: Conclusion Good RAG starts with well-prepared data, and Unstructured simplifies this critical first step. By enabling partitioning with NeMo Retriever, metadata enrichment of unstructured data and efficient ingestion into Elasticsearch, it ensures that your RAG pipeline is built on a solid foundation, unlocking its full potential for all your downstream tasks. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Jump to Pipeline components RAG with NeMo Retriever - Unstructured - Elasticsearch Step 1: Setting up the S3 source connector Step 2: Setting up the Elasticsearch destination connector Step 3: Creating a workflow with Unstructured Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
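The retrieval step of the RAG setup above can be sketched as follows. This is not the notebook's code: the index name, vector field name, and embedding function are placeholders, and it assumes the documents were indexed with dense vectors that support kNN search.

```python
import requests

ES_URL = "http://localhost:9200"   # assumption
INDEX = "nemo-unstructured-docs"   # hypothetical index name


def embed(text):
    """Placeholder: call the same embedding model used by the Unstructured workflow."""
    raise NotImplementedError


def retrieve_top_k(question, k=3):
    body = {
        "knn": {
            "field": "embeddings",          # hypothetical vector field name
            "query_vector": embed(question),
            "k": k,
            "num_candidates": 50,
        },
        "_source": ["text"],
    }
    hits = requests.post(f"{ES_URL}/{INDEX}/_search", json=body).json()["hits"]["hits"]
    return [h["_source"]["text"] for h in hits]


question = "What does the paper conclude?"
context = "\n\n".join(retrieve_top_k(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```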
All Rights Reserved.", + "title": "Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/unstructured-data-processing-with-nvidia-nemo-retriever-unstructured-and-elasticsearch", + "meta_description": "Unstructured data processing with NV‑Ingest, Unstructured & Elastic" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. Integrations Ingestion How To JR TM By: Jeffrey Rengifo and Tomás Murúa On February 3, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Elastic connectors make it easy to index and combine data from different sources to run unified searches. With the addition of Playground you can set up a knowledge base that you can chat with and ask questions. Connectors are a type of Elastic integration that are helpful for syncing data from different sources to an Elasticsearch index. In this article, we'll see how to index a Confluence Wiki using the Elastic connector, configure an index to run semantic queries, and then use Playground to chat with your data. Steps Configure the connector Preparing the index Chat with data using Playground Configure the connector In our example, our Wiki works as a centralized repository for a hospital and contains info on: Doctors' profiles: speciality, availability, contact info. Patients' files: Medical records and other relevant data. Hospital guidelines: Policies, emergency protocols and instructions for staff. We'll index the content from our Wiki using the Elasticsearch-managed Confluence connector . The first step is to get your Atlassian API Key : Configuring the Confluence native connector You can follow the steps here to guide you through the configuration: Access your Kibana instance and go to Search > Connectors Click on add a connector and select Confluence from the list. Name the new connector \"hospital\". Then click on the create new Index button. Click on edit configurations and, for this example, we need to modify the data source for \"confluence cloud\". The required fields are: Confluence Cloud account email API Key Confluence URL label Save the configuration and go to the next step. By default, the connector will index: Pages Spaces Blog Posts Attachments To make sure to only index the wiki, you need to use an advanced filter rule to include only pages inside the space named \"Hospital Health\" identified as \"HH\". You can check out additional examples here . Now, let's run a Full Content Sync to index our wiki. Once completed, we can check the indexed documents on the tab \"Documents\". Preparing the index With what we have so far, we could run full text queries on our content. Since we want to make questions instead of looking for keywords, we now need to have semantic search. For this purpose we will use Elasticsearch ELSER model as the embeddings provider. To configure this, use the Elasticsearch's inference API . Go to Kibana Dev Tools and copy this code to start the endpoint: Now the model is loading in the background. 
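The Dev Tools snippet referenced above (which the page excerpt omits) would create an ELSER endpoint through the inference API. A hedged reconstruction, with an arbitrary endpoint id and illustrative allocation settings, sent from Python for consistency with the other examples:

```python
import requests

ES_URL = "http://localhost:9200"  # assumption: local cluster without authentication

endpoint_config = {
    "service": "elser",
    "service_settings": {"num_allocations": 1, "num_threads": 1},  # illustrative sizing
}
resp = requests.put(
    f"{ES_URL}/_inference/sparse_embedding/my-elser-endpoint",  # endpoint id is arbitrary
    json=endpoint_config,
)
print(resp.json())
```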
You might get a 502 Bad Gateway error if you haven't used the ELSER model before. To make sure the model is loading, check Machine Learning > Trained Models: Let's add a semantic_text field using the UI. Go to the connector's page, select Index mappings, and click on Add Field. Select \"Semantic text\" as field type. For this example, the reference field will be \"body\" and the field name content_semantic. Finally, select the inference endpoint we've just configured. Before clicking on \"Add field\", check that your configuration looks similar to this: Now click on \"Save mapping\": One you've ran the Full Content Sync from the UI, let's check it's ok by running a semantic query: The response should look something like this: Chat with your data using Playground What is Playground? Playground is a low code platform hosted in Kibana that allows you to easily create a RAG application and ask questions to your indices, regardless if they have embeddings. Playground not only provides a UI chat with citations and provides full control over the queries, but also handles different LLMs to synthesize the answers. You can read this article for a deeper insight and test the online demo to familiarize yourself with it. Configure Playground To begin, you only need the credentials for any of the compatible models : OpenAI (or any local model compatible with OpenAI API) Amazon Bedrock Google Gemini When you open Playground, you have the option to configure the LLM provider and select the index with the documents you want to use as knowledge base. For this example, we'll use OpenAI. You can check this link to learn how to get an API key . Let's create our OpenAI connector by clicking Connect to an LLM > OpenAI and let's fill in the fields as in the image below: To select the index we created using the Confluence connector, click on \"Add data sources\" and click on the index. NOTE: You can select more than one index, if you want. Now that we're done configuring, we can start making questions to the model. Aside from choosing to include citations with the source document in your answers, you can also control which fields to send to the LLM to use in search. The View Code window provides the python code you need to integrate this into your apps. Conclusion In this article, we learned that we can use connectors both to search for information in different sources as well as a knowledge base using Playground. We also learned to easily deploy a RAG application to chat with your data without leaving the Elastic environment. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Configure the connector Configuring the Confluence native connector Preparing the index Chat with your data using Playground Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Playground: Using Elastic connectors to chat with your data - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/playground-connectors-data-chat", + "meta_description": "Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Lucene Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Lucene January 3, 2025 Lucene Wrapped 2024 2024 has been another major year for Apache Lucene. In this blog, we’ll explore the key highlights. 
CH By: Chris Hegarty Lucene December 27, 2024 Lucene bug adventures: Fixing a corrupted index exception Sometimes, a single line of code takes days to write. Here, we get a glimpse of an engineer's pain and debugging over multiple days to fix a potential Apache Lucene index corruption. BT By: Benjamin Trent Vector Database Lucene December 4, 2024 Smokin' fast BBQ with hardware accelerated SIMD instructions How we optimized vector comparisons in BBQ with hardware accelerated SIMD (Single Instruction Multiple Data) instructions. CH By: Chris Hegarty Vector Database Lucene +1 November 18, 2024 Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. BT By: Benjamin Trent 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Lucene - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/lucene", + "meta_description": "Lucene articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Semantic reranking in Elasticsearch with retrievers Explore strategies for using semantic reranking to boost the relevance of top search results, including semantic reranking with retrievers. Vector Database Search Relevance How To AD NC By: Adam Demjen and Nick Chow On May 28, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This guide explores strategies for using semantic reranking to boost the relevance of top search results, including as direct inference call, in the context of a search experience, or as part of a simplified search flow with retrievers. Before diving into the details, let's explain what semantic reranking is and why it is important. What is semantic reranking? Semantic reranking is a method that allows us to utilize the speed and efficiency of fast retrieval methods while layering semantic search on top of it. It also lets us immediately add semantic search capabilities to existing Elasticsearch installations out there. With the advancement of machine learning-powered semantic search we have more and more tools at our disposal for finding matches quickly from millions of documents. However, like cramming for a final exam, optimizing for speed means making some tradeoffs, and that usually comes at a loss in fidelity. To offset this, we see some tools emerging and becoming increasingly available on the other side of the gradient. These are much slower, but can tell how closely a document matches a query with much more accuracy. 
To explain some key terms: reranking is the process of reordering a set of retrieved documents in order to improve search relevance. In semantic reranking this is done with the help of a reranker machine learning model, which calculates a relevance score between the input query and each document. Rerankers typically operate on the top K results, a narrowed-down window of relevant candidates fulfilling the search query, since reranking a large list of documents would be extremely costly. Why is semantic reranking important? Semantic reranking is an important refinement layer for search users for a couple of reasons. First, users are expecting more from their search, where the right result isn't in the top ten hits or in the first page, but is the top answer . It's like that old search joke - the best place to hide a secret is in the second page of search results. Except today it's even more narrow: anything below the top one, two, or maybe three results will likely get discarded. This applies even more so for RAG (Retrieval Augmented Generation) - those Generative AI use cases need a tight context window. The best document could be the 4th result, but if you're only feeding in the top three, you aren't going to get the right answer, and the model could hallucinate. On top of that, Generative AI use cases work best with an effective cutoff . You could define a minimum score or count up to which the results are considered \"good\", but this is hard to do without consistent scoring. Semantic reranking solves these problems by reordering the documents so that the most relevant ones come out on top. It provides usable, normalized and well-calibrated scores, so you can measure how closely your results match your query. So you more reliably get much more accurate top results to feed to your large language model, and you can cut off results if there's a big dropoff in score in the top K hits to prevent hallucinations. How do we perform semantic reranking? The rerank inference type Elastic recently introduced inference endpoints and related APIs . This feature allows us to use certain services, such as built-in or 3rd party machine learning models, to perform inference tasks. Supported inference tasks come in various shapes - for example a sparse_embedding task is where an ML model (such as ELSER) receives some text and generates a weighted set of terms, whereas a text_embedding task creates vector embeddings from the input. Elastic Serverless - and the upcoming 8.14 release - adds a new task type: rerank . In the first iteration rerank supports integrating with Cohere 's Rerank API. This means you can now create an inference endpoint in Elastic, supply your Cohere API key, and enjoy semantic reranking out of the box! Let's see that in action with an example taken from the Cohere blog . Assuming you have set up your rerank inference endpoint in Elastic with the Cohere Rerank v3 model, we can pass a query and an array of input text. As we can see, the short passages all relate to the word \"capital\", but not necessarily to the meaning of the location of the seat of government, which is what the query is looking for: The rerank task responds with an array of scores and document indices: The topmost entry tells us that the highest relevance score of 99.8% is the 4th document ( \"index\": 3 with zero-based indexing) of the original list, a.k.a. \"Washington, D.C. ...\" . The rest of the documents are semantically less relevant to the original query. 
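For readers who want to try the rerank task directly, a call to the inference API might look like the sketch below. The endpoint id, query, and passages are illustrative stand-ins for the Cohere example discussed above.

```python
import requests

ES_URL = "http://localhost:9200"  # assumption

body = {
    "query": "What is the capital of the United States?",
    "input": [
        "Carson City is the capital city of the American state of Nevada.",
        "Capital punishment has existed in the United States since before it was a country.",
        "Washington, D.C. is the capital of the United States.",
        "Capitalization in English grammar is the use of a capital letter at the start of a word.",
    ],
}
resp = requests.post(f"{ES_URL}/_inference/rerank/cohere-rerank-v3-model", json=body)
# The response contains, for each input, its original index and a relevance score,
# ordered so the most relevant passage comes first.
print(resp.json())
```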
This reranking inference step is an important puzzle piece of an optimized search experience, and now we are ready to place it in the puzzle board! Reranking search results today - through your application One way of harnessing the power of semantic reranking is to implement a workflow like this in a search application: A user enters a query in your app's UI. The search engine component retrieves a set of documents that match this query. This can be done using any retrieval strategy: lexical (BM25), vector search (e.g. kNN) or a method that combines the two, such as RRF. The application takes the top K documents, extracts the text field we are querying against from each document, then sends this list of texts to the rerank inference endpoint, which is configured to use Cohere. The inference endpoint passes the documents and the query to Cohere. The result is a list of scores and indices to match each score. Your app takes these scores, assigns them to the documents, and reorders them by this score in a descending order. This effectively moves the semantically most relevant documents to the top. If this flow is used in RAG to provide some sources to a generative LLM (such as summarizing an answer), then you can rest assured it will work with the right context and provide answer. This works great, but it involves many steps, data massaging, and a complex processing logic with many moving parts. Can we simplify this? Reranking search results tomorrow - with retrievers Let's spend a minute talking about retrievers. Retriever is a new type of abstraction in the _search API, which is more than just a simple query. It's a building block for an end-to-end search flow for fetching hits and potentially modifying the documents' scores and their order. Retrievers can be used in a pipeline pattern, where each retriever unit does something different in the search process. For example we can configure a first-stage retriever to fetch documents, pass the results to a second-stage retriever to combine with other results, trim the number of candidates etc. As a final stage, a retriever can update the relevance score of documents. Soon we'll be adding new reranking capabilities with retrievers, text similarity reranker retriever being the first one. This will perform reranking on top K hits by calling a rerank inference endpoint. The workflow will be simplified into a single API call that hides all the complexity! This is what the previously described multi-stage workflow looks like as a single retriever query: The text_similarity_reranker retriever is configured with the following details: Nested retriever Reranker inference configuration Additional controls, such as minimum score cutoff for eliminating irrelevant hits Below is an example text_similarity_reranker query. Let's dissect it to understand each part better! The request defines a retriever query as the root property. The outermost retriever will execute last, in this case it's a text_similarity_reranker . It specifies a standard first-stage retriever, which is responsible for fetching some documents. The standard retriever accepts an Elasticsearch query , which is a BM25 match in the example. The text similarity reranker is pointed at the text document field that contains the text for semantic reranking. The top 100 documents will be sent for reranking to the cohere-rerank-v3-model rerank inference endpoint we have configured with Cohere. Only those documents will be returned that receive at least 60% relevance score in the reranking process. 
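Putting the pieces just described together, a text_similarity_reranker request along these lines captures the whole flow in one call. The index name and query are illustrative; the endpoint id matches the Cohere rerank endpoint set up earlier.

```python
import requests

ES_URL = "http://localhost:9200"  # assumption
query = "how to cook the perfect steak"  # illustrative user query

search_body = {
    "retriever": {
        "text_similarity_reranker": {
            "retriever": {  # first-stage retriever: plain BM25 match
                "standard": {"query": {"match": {"text": query}}}
            },
            "field": "text",                          # document field sent to the reranker
            "inference_id": "cohere-rerank-v3-model", # rerank inference endpoint
            "inference_text": query,
            "rank_window_size": 100,                  # rerank the top 100 first-stage hits
            "min_score": 0.6,                         # drop hits below 60% relevance
        }
    }
}
resp = requests.post(f"{ES_URL}/my-index/_search", json=search_body)
print(resp.json()["hits"]["hits"])
```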
The response is the exact same structure as that of a search query. The _score property is the semantic relevance score from the reranking process, and _rank refers to the ranking order of documents. Semantic reranking with retrievers will be available shortly in a coming Elastic release. Conclusion Semantic reranking is an incredibly powerful tool for boosting the performance of a search experience or a RAG tool. It can be used as direct inference call, in context of a search experience, or as part of a simplified search flow with retrievers. Users can pick and choose the set of tools that work best for their use case and context. Happy reranking! 😃 Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to What is semantic reranking? Why is semantic reranking important? How do we perform semantic reranking? The Reranking search results today - through your application Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Semantic reranking in Elasticsearch with retrievers - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/semantic-reranking-with-retrievers", + "meta_description": "Explore strategies for using semantic reranking to boost the relevance of top search results, including semantic reranking with retrievers." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elastic Rerank: Elastic's new semantic re-ranker model Learn about how Elastic's new re-ranker model was trained and how it performs. ML Research TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou On November 25, 2024 Part of Series Semantic reranking & the Elastic Rerank model Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our previous blog in this series we introduced the concept of semantic reranking. In this blog we're going to discuss the re-ranker we've trained and are releasing in technical preview. Introduction One of our purposes at Elastic is lowering the bar to achieve high quality text search. Building on top of Lucene, Elasticsearch provides a rich set of scalable well optimized full-text retrieval primitives including lexical retrieval with BM25 scoring, learned sparse retrieval and a vector database. We've recently introduced the concept of retrievers to the search API, which allow for composable operations including semantic reranking. We're also working towards introducing advanced search pipelines to ES|QL . Starting with our serverless offering, we're releasing in technical preview the Elastic Rerank model. This is a cross-encoder reranking model. Over time we plan to integrate it with our full product suite and provide optimized versions to run on ML nodes in any cluster; exactly as we do for our retrieval model . We've also been working on some exciting new inference capabilities, which will be ideally suited to reranking workloads, so expect further announcements. This first version targets English text reranking and provides some significant advantages in terms of quality as a function of inference cost compared to the other models we evaluated. In this blog post we will discuss some aspects of its architecture and training. But first… How does it compare? In our last blog , we discussed how lexical retrieval with BM25 scoring or (BM25 for short) represents an attractive option in cases where the indexing costs using sparse or dense models would be very high. However, newer methodologies tend to give very significant relevance improvements compared to BM25, particularly for more complex natural language queries. As we've discussed before, the BEIR suite is a high quality and widely used benchmark for English retrieval. It is also used by the MTEB benchmark to assess retrieval quality of text embeddings. It includes various tasks, including open-domain Question Answering (QA) where BM25 typically struggles. Since BM25 represents a cost effective first stage retriever, it is interesting to understand to what extent we can use reranking to “fix up” its relevance as measured by BEIR. In our next blog, we're going to present a detailed analysis of the different high quality re-rankers we include in the table below. This includes more qualitative analysis of their behavior as well as some additional insights into their cost-relevance trade-offs. Here, we follow prior art and describe their effectiveness reranking the top 100 results from BM25 retrieval. This is fairly deep reranking and not something we would necessarily recommend for inference on CPU. However, as we'll show in our next blog, it provides a reasonable approximation of the uplift in relevance you can achieve from reranking. 
Model Parameter Count Average nDCG@10 BM25 - 0.426 MiniLM-L-12-v2 33M 0.487 mxbai-rerank-base-v1 184M 0.48 monoT5-large 770M 0.514 Cohere v3 n/a 0.529 bge-re-ranker-v2-gemma 2B 0.568 Elastic 184M 0.565 Average nDCG@10 for the BEIR reranking the top 100 BM25 retrieved documents To give a sense of the relative cost-relevance trade-off of the different models we've plotted this table below. Average nDCG@10 for the BEIR reranking the top 100 BM25 retrieved documents. Up and to the left is better For completeness we also show the individual dataset results for Elastic Rerank below. This represents an average improvement of 39% across the full suite. At the time of writing, this puts reranked BM25 around position 20 of all methods on the MTEB leaderboard . All more effective models use large embeddings, with at least 1024 dimensions, and significantly larger models (on average 30⨉ larger than Elastic Rerank). Dataset BM25 nDCG@10 Reranked nDCG@10 Improvement AguAna 0.47 0.68 44% Climate-FEVER 0.19 0.33 80% DBPedia 0.32 0.45 40% FEVER 0.69 0.89 37% FiQA-2018 0.25 0.45 76% HotpotQA 0.6 0.77 28% Natural Questions 0.33 0.62 90% NFCorpus 0.33 0.37 12% Quora 0.81 0.88 9% SCIDOCS 0.16 0.20 23% Scifact 0.69 0.77 12% Touche-2020 0.35 0.36 4% TREC-COVID 0.69 0.86 25% MS MARCO 0.23 0.42 85% CQADupstack (avg) 0.33 0.41 27% *nDCG@10 per dataset for the BEIR reranking the top 100 BM25 retrieved documents using the Elastic Rerank model. Architecture As we've discussed before , language models are typically trained in multiple steps. The first stage training takes randomly initialized model weights and trains on a variety of different unsupervised tasks such as masked token prediction. These pretrained models are then trained on further downstream tasks, such as text retrieval, in a process called fine-tuning. There is extensive empirical evidence that the pre-training process generates useful features which can be repurposed for new tasks in a process called transfer learning. The resulting models display significantly better performance and significantly reduced train time compared to training the downstream task alone. This technique underpins a lot of the post BERT successes of transformer based NLP. The exact pre-training methods and the model architecture affect downstream task performance as well. For our re-ranker we've chosen to train from a DeBERTa v3 checkpoint. This combines various successful ideas from the pre-training literature and has provided state-of-the-art performance as a function of model size on a variety of NLP benchmarks when fine-tuned. To very briefly summarize this model: DeBERTa introduced a disentangled positional and content encoding mechanism that allows it to learn more nuanced relationships between hidden representations of the content and position of other tokens in the sequence. We conjecture this is particularly important for reranking since matching words in the query and document text and comparing their semantics is presumably a key ingredient. DeBERTa v3 adopts the ELECTRA pre-training objective which, GAN style , tries to simultaneously train a model to supply effective fake tokens and learn to recognize those fakes. They also propose a small improvement to parameterisation of this process. If you're interested you can find the details here . For the first version, we trained the base variant of this model family. 
This has 184M parameters, but since its vocabulary is around 4⨉ the size of BERT's, the backbone is only 86M parameters, with 98M parameters used for the input embedding layer. This means the inference cost is comparable to BERT base. In our next blog we explore optimal strategies for budget constrained reranking. Without going into the details suffice to say we plan to train a smaller version of this model via distillation. Data sets and training Whenever you train a new task on a model there is always a risk that it forgets important information. Our first step training the Elastic Reranker was therefore to make the best attempt to extract relevance judgments from DeBERTa as it is. We use a standard pooling approach; in particular, we add a head that: Computes A ( D ( L ( h ( [ C L S ] ) ) ) A(D(L(h([CLS]))) A ( D ( L ( h ([ C L S ]))) where A A A is a GeLU activation, D D D is a dropout layer and L L L is a linear layer. In pre-training the [ C L S ] [CLS] [ C L S ] token representation, h ( [ C L S ] ) h([CLS]) h ([ C L S ]) , is used for a next sentence classification task. This is well aligned with relevance assessment so its a natural choice to use as the input to the head, Computes the weighted average of the output activations to score the query-document pair. We train the head parameters to convergence, freezing the rest of the model, on a subset of our full training data. This step updates the head to read out what useful information it can for relevance judgments from the pre-trained [ C L S ] [CLS] [ C L S ] token representation. Performing a two step fine-tune like this yielded around a 2% improvement in the final nDCG@10 on BEIR. It is typical to train ranking tasks with contrastive methods. Specifically, a query is compared to one relevant (or positive) and one or more irrelevant (or negative) documents and the model is trained to prefer the relevant one. Rather than using a purely contrastive loss, like maximizing the log probability of the positive document, a strong teacher model can be used to provide ground truth assessment of the relevance of the documents. This choice handles issues such as mislabeling of negative examples. It also provides significantly more information per query than just maximizing the log probability of the relevant document. To train our cross-encoder we use a teacher to supply a set of scores from which we compute a reference probability distribution of the positive and negative documents for each query using the softmax function as follows: P ( q , d ) = e s c o r e ( q , d ) ∑ d ′ ∈ p ∪ N e s c o r e ( q , d ′ ) P(q,d)=\\frac{e^{score(q,d)}}{\\sum_{d'\\in {p}\\cup N} e^{score(q,d')}} P ( q , d ) = ∑ d ′ ∈ p ∪ N ​ e score ( q , d ′ ) e score ( q , d ) ​ Here, q q q is the query text, p p p is the positive text, N N N is the set of negative texts, d ∈ p ∪ N d∈p∪N d ∈ p ∪ N and the s c o r e score score function is the output of the cross-encoder. We minimize the cross-entropy of our cross-encoder scores with this reference distribution. We also tried Margin-MSE loss, which worked well training ELSER, but found cross-entropy was more effective for the reranking task. This whole formulation follows because it is natural to interpret pointwise ranking as assessing the probability that each document is relevant to the query. In this case, minimizing cross-entropy amounts to fitting a probability model by maximum likelihood with the nice properties such an estimator confers. 
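In code, the objective described above is compact. The sketch below is not our actual training loop, only the shape of the loss: teacher scores over one positive and N negative documents are normalized with a softmax to form the reference distribution, and we minimize the cross-entropy of the cross-encoder (student) scores against it.

```python
# Sketch of the distillation objective, not the exact training code.
# Both tensors have shape (batch, 1 + num_negatives); column 0 is the positive document.
import torch
import torch.nn.functional as F

def distillation_loss(student_scores, teacher_scores):
    # Reference distribution over the positive and negative documents per query.
    target = F.softmax(teacher_scores, dim=-1)
    # Cross-entropy between the reference distribution and the student distribution.
    log_probs = F.log_softmax(student_scores, dim=-1)
    return -(target * log_probs).sum(dim=-1).mean()
```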
Compared to Margin-MSE, we also think we get gains because cross-entropy allows us to learn something about the relationship between all scores, since it is minimized by exactly matching the reference distribution. This is relevant because, as we discuss below, we train with more than one negative. For the teacher, we use a weighted average ensemble of a strong bi-encoder model and a strong cross-encoder model. We found the bi-encoder provides a more nuanced assessment of the negative examples, which we hypothesize is due to large batch training that contrasts millions of distinct texts per batch. However, the cross-encoder was better at differentiating the positive and negative examples. In fact, we expect there are further improvements to be made in this area. Specifically, for model selection we use a small but effective proxy to a diverse retrieval task and we plan to explore if it is beneficial to use black box optimization of our teacher on this task. The training dataset and negative sampling are critical for model quality. Our training dataset comprises a mixture of open QA datasets and datasets with natural pairs, like article heading and summary. We apply some basic cleaning and fuzzy deduplication to these. Using an open source LLM, we also generated around 180 thousand pairs of synthetic queries and passages with varying degrees of relevance. We used a multi-stage prompting strategy to ensure this dataset covers diverse topics and a variety of query types, such as keyword search, exact phrase matching and short and long natural language questions. In total, our training dataset contains around 3 million queries. It has been generally observed that quality can degrade with the reranking depth. Typically hard negative mining uses shallow sampling of retrieval results: it searches out the hardest negatives for each query. Document diversity increases with retrieval depth and we believe that typical hard negative mining therefore doesn't present the re-ranker with sufficient diversity. In particular, training must demonstrate adequate diversity in the relationship between the query and the negative documents. This flaw isn't solved by increasing overall query and document diversity; training must include negative documents from the deep tail of retrieval results. For this reason, we extract the top 128 documents for each query using multiple methods. We then sample five negatives from this pool of candidates using a probability distribution shaped by their scores. Using this many negatives per query is not typical; however, we found increasing the number of sampled negatives gave us a significant bump in final quality. A nice side effect of using a large and diverse negative set for each query is it should help model calibration. This is a process by which the model scores are mapped to a meaningful scale, such as an estimate of relevance probability. Well calibrated scores provide useful information for downstream processing or directly to the user. They also help with other tasks such as selecting a cutoff at which to drop results. We plan to release some work we've done studying calibration strategies and how effectively they can be supplied to different retrieval and reranking models in a separate blog. Training language models has traditionally required learn rate scheduling to achieve the best possible results. This is the process whereby the multiplier of the step size used for gradient descent is changed as training progresses. 
It presents some challenges: the total number of training steps must be known in advance; also it introduces multiple additional hyperparameters to tune. Some recent interesting work demonstrated that it is possible to drop learn rate scheduling if you adopt a new weight update scheme that includes averaging of the parameters along the optimization trajectory. We adopted this scheme, using AdamW as the base optimizer, and found it produced excellent results as well as being easy to tune. Summary In this blog we've introduced our new Elastic Rerank model. It is fine-tuned from the DeBERTa v3 base model on a carefully prepared data set using distillation from an ensemble of a bi-encoder and cross-encoder model. We showed that it provides state-of-the-art relevance reranking lexical retrieval results. Furthermore, it does so using dramatically fewer parameters than competitive models. In our next blog post we study its behavior in much more detail and revisit the other high quality models with which we've compared it here. As a result, we're going to provide some additional insights into model and reranking depth selection. Report an issue Related content ML Research Search Relevance October 29, 2024 What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Vector Database Search Relevance +1 May 28, 2024 Semantic reranking in Elasticsearch with retrievers Explore strategies for using semantic reranking to boost the relevance of top search results, including semantic reranking with retrievers. AD NC By: Adam Demjen and Nick Chow Vector Database How To May 14, 2024 Search relevance tuning: Balancing keyword and semantic search This blog offers practical strategies for tuning search relevance that can be complementary to semantic search. KD By: Kathleen DeRusso Elastic Cloud Serverless May 15, 2024 Building Elastic Cloud Serverless Explore the architecture of Elastic Cloud Serverless and key design and scalability decisions we made along the way of building it. JT By: Jason Tedor ES|QL Python August 20, 2024 An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. BA By: Baha Azarmi Jump to Introduction How does it compare? Architecture Data sets and training Summary Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Introducing Elastic Rerank: Elastic's new semantic re-ranker model - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-2", + "meta_description": "Learn about how Elastic's new re-ranker model was trained and how it performs." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. ML Research TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros On December 5, 2024 Part of Series Semantic reranking & the Elastic Rerank model Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this last blog post in our series we explore in detail the characteristics of various high quality re-rankers including our own Elastic Rerank model. In particular, we focus on qualitative and quantitative evaluation of retrieval quality as a function of reranking depth. We provide some high-level guidelines for how to select reranking depth and recommend reasonable defaults for the different models we tested. We employ a \"retrieve-and-rerank\" pipeline using BM25 as our first stage retriever. We focus on English language text search and use BEIR to benchmark our end-to-end accuracy. Summary Below we show that end-to-end relevance follows three broad patterns as a function of re-ranking depth: Fast increase followed by saturation Fast increase to a maximum then decay Steady decay with any amount of re-ranking For the re-rankers and datasets tested pattern 1 accounted for around 72.6% of all results, followed by pattern 2 (20.2%) and then pattern 3 (7.1%). Unsurprisingly the overall strongest re-rankers, such as Elastic Rerank, display the most consistent improvements with re-ranking depth. We propose a simple model which explains the curves we observe and show it provides a surprisingly good fit across all datasets and re-rankers we tested. This suggests that the probability of finding a positive document at a given depth in the retrieval results follows a Pareto distribution . Furthermore, we can think of the different patterns being driven by the fraction of relevant (or positive) documents the re-ranker can detect and the fraction of irrelevant (or negative) documents it mistakenly identifies as relevant. We also study effectiveness versus efficiency as a mechanism to choose the re-ranking depth and to perform model selection. In the case there is no hard efficiency constraint, as a rule-of-thumb we pick the depth that attains 90% of the maximum effectiveness. This yields a 3× improvement in compute cost compared to maximizing effectiveness, so we feel it represents a good efficiency tradeoff. For our benchmark, the 90% rule suggests one should re-rank around 100 pairs from BM25 on average, although stronger models give more benefit from deeper re-ranking. We also observe an important first stage retrieval effect. For some data sets we study the retriever recall saturates at a relatively low depth. In those scenarios we see significantly shallower maximum and 90% effectiveness depths. In the realistic scenarios there are efficiency constraints, such as the maximum permitted query latency or compute cost. 
We propose a scheme to simultaneously select the model and re-ranking depth subject to an efficiency constraint. We find that when efficiency is at a premium, deep re-ranking with small models tends to out perform shallow re-ranking with larger higher quality models . This pattern reverses as we relax the efficiency constraint. We also find that Elastic Rerank provides state-of-the-art effectiveness versus efficiency, being optimal for nearly all constraints we test. For our benchmark we found re-ranking around the top 30 results from BM25 represented a good choice when compute cost is important. The re-rankers For this investigation we evaluate a collection of re-rankers of different sizes and capabilities. More specifically: Elastic Rerank: This was trained from a DeBERTa V3 base model checkpoint. We discussed some aspects of its training in our last post . It has roughly 184M parameters (86M in the \"backbone\" + 98M in the input embedding layer). The large embedding parameter count is because the vocabulary is roughly 4× the size of BERT. bge-reranker-v2-gemma : This is an LLM based re-ranker trained on one of the Google Gemma series of models. It’s one of the strongest re-ranking models available and has around 2B parameters. monot5-large : This is a model trained on the MS MARCO passage dataset using T5-large as backbone. At the time of release it demonstrated state-of-the-art zero shot performance and it’s still one of the strongest baselines. It has around 770M parameters. mxbai-rerank-base-v1 : This model is provided by Mixedbread AI and according to the company’s release post it was trained by a) first collecting the top-10 results from search engines for a large number of queries, b) asking a LLM to judge the results for their relevance to the query and finally c) using these examples for training. The model uses the same DeBERTa architecture as the Elastic Reranker. MiniLM-L12-v2 : This is a cross-encoder model trained on the MS MARCO passage ranking task. It follows the BERT architecture with around 33M parameters. Cohere-v3 : This is an efficient and high quality commercial re-ranker provided by Cohere. No further information is available regarding the parameter count for this model. Oracle In the graphs below we include the performance of an \"oracle\" that has access to the relevance judgments ( qrels ) per dataset and thus can sort the documents by their relevance score descending. This puts any relevant document before any irrelevant document and higher relevance documents higher than lower relevance ones. These data points represent the performance of the ideal re-ranker (assuming perfect markup) and quantify the available space of improvement for the re-ranking models. It also captures the dependence of the end-to-end accuracy on the first stage retriever as the re-ranker only has visibility over the items that the retriever returns. Main patterns We use nDCG@10 as our evaluation metric, which is the standard in the BEIR benchmark, and we plot these scores as a function of the re-ranking depth. Re-ranking depth is the number of candidate documents retrieved by BM25 and subsequently sent to the re-ranker. Since we are using nDCG@10 the score is affected only when a document that is found lower in the retrieved list is placed in the top-10 list by the re-ranker. In this context, it can either increase nDCG@10 if it is relevant or it can evict a relevant document and decrease nDCG@10. 
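Since everything that follows is expressed in terms of nDCG@10, a small reference implementation may help make the metric concrete. This is an illustrative version rather than the official BEIR evaluation code; rels maps document ids to graded relevance judgments for one query.

```python
# Illustrative nDCG@10 for a single query; rels maps doc_id -> graded relevance.
import math

def dcg(gains):
    return sum(gain / math.log2(rank + 2) for rank, gain in enumerate(gains))

def ndcg_at_10(ranked_doc_ids, rels):
    gains = [rels.get(doc_id, 0) for doc_id in ranked_doc_ids[:10]]
    ideal = dcg(sorted(rels.values(), reverse=True)[:10])
    return dcg(gains) / ideal if ideal > 0 else 0.0
```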
In the following we describe the main patterns that we identified in these graphs across the different combinations of datasets and models we tested. We present them in decreasing order of frequency and provide some possible explanations of the observed behavior. \"Pareto\" curve This accounts for most of the cases that we see. It can be divided into three phases as follows: Phase A: A rapid increase which takes place mostly at smaller depths (< 100) Phase B: Further improvements at a smaller rate Phase C: A \"plateau\" in performance Below you can see runs from DBpedia and HotpotQA , where the black dashed horizontal line depicts the nDCG@10 score of BM25 Figure 1 : nDCG@10 as a function of reranking depth on DBpedia Figure 2 : nDCG@10 as a function of reranking depth on HotpotQA Discussion A monotonically increasing curve has a simple explanation: as we increase the re-ranking depth, the first-stage retriever provides a larger pool of candidates to the next stage so that the re-ranker models can identify additional relevant documents and place them high in the result list. Based on the shape of these curves, we hypothesize that the rate at which we discover positives as a function of depth follows a power law. In particular, if we assume that the re-ranker moves each positive into the top-10 list, nDCG@10 will be related to the count of positives the retriever returns in total for a given depth. Therefore, if our hypothesis is correct its functional form would be related to the cumulative density function (CDF) of a power law. In the following, we fit a scaled version of a generalized Pareto CDF to the nDCG@10 curves to test this hypothesis. Below you can see some examples of fitted curves applied to a selection of datasets ( FiQA , Natural Questions , DBpedia and HotpotQA ) using different re-rankers. Figure 3 : Curve fitting of the nDCG graph using a generalized Pareto CDF Visually it is clear that the generalized Pareto CDF is able to fit the observed curves well, which supports our hypothesis. Since we don’t match the performance of the oracle the overall behavior is consistent with the model having some false negative (FN) fraction, but a very low false positive (FP) fraction: adding more examples will occasionally shuffle an extra positive to the top, but won’t rank a negative above the positives found so far. \"Unimodal\" curve This family of graphs is characterized by the following phases: Phase A: Rapid increase until the peak Phase B: Performance decrease at a smaller rate Below you can see two examples of this pattern: one when the MiniLM-L12-v2 model is applied on to the TREC-COVID dataset and a second when the mxbai-rerank-base-v1 model is applied to the FEVER dataset. In both cases, the black dashed line represents the performance of the BM25 baseline Figure 4 : Two cases of the \"unimodal\" pattern Discussion This sort of curve would be explained by exactly the same Pareto rate of discovery of extra relevant documents. However, it also appears there is some small non-zero FP fraction. Since the rate of discovery of additional relevant documents decreases monotonically, at a certain depth the rate of discovery of relevant documents multiplied by the true positive (TP) fraction will equal the rate of discovery of irrelevant documents multiplied by the FP fraction and the nDCG@10 will have a unique maximum. Thereafter, it will decrease because in aggregate re-ranking will push relevant documents out of the top-10 set. 
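To make the curve fitting mentioned above concrete, the sketch below fits a scaled generalized Pareto CDF to an nDCG@10-versus-depth curve with scipy. The depths and scores here are invented for illustration; they are not measurements from the benchmark.

```python
# Fit a scaled generalized Pareto CDF to an nDCG@10-vs-depth curve (illustrative data).
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import genpareto

depths = np.array([10, 20, 50, 100, 200, 300, 400])
ndcg = np.array([0.46, 0.52, 0.58, 0.62, 0.645, 0.652, 0.655])  # invented curve
bm25_ndcg = 0.42  # invented first-stage score

def scaled_pareto_cdf(depth, shape, scale, gain):
    # Saturating growth: baseline plus a gain times the CDF of a generalized Pareto.
    return bm25_ndcg + gain * genpareto.cdf(depth, shape, loc=0, scale=scale)

params, _ = curve_fit(scaled_pareto_cdf, depths, ndcg, p0=[0.1, 50.0, 0.25], maxfev=10000)
print(dict(zip(['shape', 'scale', 'gain'], params)))
```

A good fit of this form corresponds to the Pareto pattern; adding a small false positive fraction to the same discovery model produces the unimodal peak described above.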
There are some likely causes for the presence of a non-zero FP rate: Incomplete markup : In other words the model surfaces items which are actually relevant, but not marked as such which penalizes the overall performance. This is something we have investigated in a previous blog . Re-ranker training : Here, we broadly refer to issues that have to do with the training of the re-ranker. One possible explanation is provided in this paper by Gao et al. where the authors emphasize the importance of tailoring a re-ranker to the retriever because there might be cases where false positives at lower positions share confounding characteristics with the true positives which ultimately \"confuses\" the re-ranker. However, we note that this pattern is more common for overall weaker re-ranking models. As we discussed in our previous blog , a potential mitigation for training issues in a zero-shot setting is to ensure that we present sufficiently diverse negatives to cover a broad set of possible confounding features. In other words, it could be the case that models which exhibit these problems haven’t mined enough deep negatives for training and thus deeper retrieval results are effectively \"out-of-domain\". Note that there are some edge cases where it’s hard to distinguish between the \"Pareto\" and \"Unimodal\" patterns. This happens when the peak in performance is achieved earlier than the maximum depth but the performance decrease is marginal. Based on the terminology used so far this would qualify as a \"Unimodal\" case. To address this, we introduce this extra rule: we label curves as \"Pareto\" if their nDCG gain at maximum depth is ≥ 95% of the maximum nDCG gain and \"Unimodal\" otherwise. Bad fit This category comprises all cases where the application of a re-ranker does not bring a performance benefit at any depth compared to BM25. On the contrary, we observe a continuous degradation as we re-rank more documents. As an example we can take ArguAna, which is a particularly challenging task in BEIR as it involves the retrieval of the best counterargument to the input. This is not a typical IR scenario and some studies even consider reporting results without it. We experimented with different re-rankers (even with some that didn’t make it into the final list) and we observed that many of them ( Cohere-v3 , bge-reranker-v2-gemma and Elastic Rerank being the only exceptions ) exhibited the same pattern. Below we show the results for monot5-large . Figure 5 : A \"bad-fit\" example Discussion We propose two possible explanations: The re-ranker could be a bad fit for the task at hand, which is sufficiently out of the training domain that its scoring is often incorrect, The re-ranker could just be worse than BM25 for the particular task. BM25 is a strong zero-shot baseline, particularly for certain query types such as keyword searches, because it relies on lexical matching with scoring tailored to the whole corpus. Overview of patterns Overall, the distribution of the patterns ( P → \"Pareto\" curve, U → \"Unimodal\" curve, B → \"Bad fit\") across all scenarios is as follows: Figure 6 : Distribution of patterns across all scenarios Regarding the \"Pareto\" pattern which is by far the most common, we note some observations from relevant works. First, this paper from Naver Labs presents results which are in line with our findings. There, the authors experiment with 3 different ( SPLADE ) retrievers and two different cross-encoders and test the pipeline on TREC-DL 19, 20 and BEIR. 
They try three different values for the re-ranking depth (50, 100 and 200) and the results show that in the majority of the cases the performance increases at a rapid pace at smaller depths (i.e. 50) and then almost saturates. A relevant result is also presented in this blog post from Vespa where the author employs a \"retrieve-and-rerank\" pipeline using BM25 and ColBERT on the MLDR dataset and finds that the nDCG metric can be improved significantly by re-ordering just the top ten documents. Finally, in this paper from Meng et al. we observe similar results when two retrieval systems (BM25 and RepLLaMA) are followed by a RankLLaMA re-ranker. The authors perform experiments on the TREC DL19 and 20 datasets investigating 8 Ranked List Truncation (RLT) methods, one of which is \"Fixed-k\" that aligns with our setup. In none of these works do the authors identify an explicit underlying process that could explain the observed nDCG curve. Since we found the behavior was consistent with our simple model across different datasets and re-rankers this feels like it warrants further investigation. Some characteristics of the other retrieval tasks that could also explain some of these results: ArguAna and Touche-2020 , both argument retrieval datasets, present the most challenging tasks for the models we consider here. An interesting related analysis can be found in this paper by Thakur et al. where the authors discuss the reduced effectiveness of neural retrieval models in Touche-2020 especially when compared to BM25. Even though the paper considers a single retrieval step we think that some of the conclusions might also apply to the \"retrieve-and-rerank\" pipeline. More concretely, the authors reveal an inherent bias of neural models towards preferring shorter passages (< 350 words) in contrast to BM25 which retrieves longer documents (>600 words) mimicking the oracle distribution better. In their study, even after \"denoising\" the dataset by removing short docs (less than 20 words) and adding post-hoc relevance judgments to tackle the small labeling rate BM25 continues to outperform all the retrieval models they tested. Scifact and FEVER are two datasets where two of the \"smaller\" models follow \"unimodal\" patterns. Both are fact verification tasks which require knowledge about the claim and reasoning over multiple documents. On Scifact it is quite important for the retriever to be able to access scientific background knowledge and make sense of specialized statistical language in order to support or refute a claim. From that perspective smaller models with less internal \"world\" knowledge might be at disadvantage. According to our previous study TREC-COVID has a large labeling rate i.e. for >90% of the retrieved documents there is a relevance judgment (either positive or negative). So, it’s the only dataset where incomplete markup is not likely a problem. BM25 provides very good ranking for Quora , which is a \"duplicate questions\" identification task. In this particular dataset, queries and documents are very short - 90% of the documents (queries) are less than 19 (14) words - and the Jaccard similarity across queries and their relevant counterparts is quite high, a bit over 43%. This could explain why certain purely semantic re-rankers can fail to add value. Understanding scores as a function of depth So far we treated a re-ranker model as though it were a classifier and discussed its performance in terms of its FN and FP rates. 
Clearly, this is a simplification since it outputs a score which captures some estimate of the relevance of each document to the query. We return to the process of creating interpretable scores for a model, which is called calibration, in a separate upcoming blog post. However, for our purposes here we would like to understand the general trends in the score as a function of depth because it provides further insight into how the nDCG@10 evolves. In the following figures we split documents by their judgment label and plot the average positive and negative document scores as a function of depth for examples from the three different patterns we identified. We also show one standard deviation confidence intervals to give some sense of the overlap of score distributions. For the Pareto pattern we see positive and negative scores follow a very similar curve as depth increases. (The negative curve is much smoother because there are many more negatives at any given depth.) They start higher, a regime which corresponds to excellent matches and very hard negatives, then largely plateau. Throughout the score distributions remain well separated, which is consistent with a FP fraction which is essentially zero. For the unimodal pattern there is a similar decay of scores with depth, but we also see noticeably more overlap in the score distributions. This would be consistent with a small but non-zero FP fraction. Finally, for the bad fit pattern we see that scores are not separated. Also there is no significant decrease in both the positive and negative scores with depth. This is consistent with the re-ranker being a bad fit for that particular retrieval task since it appears to be unable to reliably differentiate positives and negatives sampled from any depth. Figure 7 : Positive and negative scores as a function of re-ranking depth for Elastic Rerank on HotpotQA. The bars correspond to ± 1 standard deviation intervals Figure 8 : Positive and negative scores as a function of re-ranking depth for mxbai-rerank-base-v1 on FEVER. The bars correspond to ± 1 standard deviation intervals Figure 9 : Positive and negative scores as a function of re-ranking depth for monot5-large on ArguAna. The bars correspond to ± 1 standard deviation intervals Finally, note that the score curves for the unimodal pattern hints that one may be able to find a cut off score which results in a higher FN fraction but essentially zero FP fraction. If such a threshold can be found it would allow us to avoid the relevance degrading with re-ranking depth while still being able to retain a portion of the extra relevant documents the retriever surfaces. We will return to this observation in an upcoming blog post when we explore model calibration. Efficiency vs effectiveness In this section we focus on the trade-off between efficiency and effectiveness and provide some guidance on picking optimal re-ranking depths. At a high level, effectiveness refers to the overall gain in relevance we attain as we retrieve and re-rank more candidates, while efficiency focuses on minimizing the associated cost. Efficiency can be expressed in terms of different dimensions with some common choices being: Latency, which is usually tied to an SLA on the query duration. In other words, we may only be allowed a fixed upper wall time for re-scoring (query, document) pairs, and Infrastructure cost, which refers to the number of CPUs/GPUs needed to keep up with the query rate or the total compute time required to run all queries in a pay-as-you-go setting. 
We note that efficiency is also wrapped up with other considerations such as the ability to run the model at lower precision, the ability to use more efficient kernels and so on, which we do not study further. Here, we adopt a simplified setup where we focus solely on the latency dimension. Obviously, in a real-world scenario one could easily trade cost (i.e. by increasing the number of CPUs/GPUs and parallelising inference) to achieve lower latency, but for the rest of the analysis we assume fixed infrastructure. Cohere v3 is excluded from this experimentation as it is an API-based service \"Latency-free\" analysis We start our analysis by considering each (model, dataset) pair in isolation ignoring the latency dimension. We are interested in the evolution of the nDCG gain (nDCG score at depth k minus the nDCG score of BM25) and we pick two data points for further analysis: The maximum gain depth, which is the re-ranking depth where the nDCG gain is maximized, and The 90%-depth, which corresponds to the depth where we first attain 90% of the maximum gain. This can be seen as a trade-off between efficiency and effectiveness as we get most of the latter at a smaller depth. We calculate these two quantities across a selection of datasets. Dataset DBPedia HotpotQA FiQA Quora TREC-COVID Climate-FEVER Model max 90% max 90% max 90% max 90% max 90% max 90% bge-reranker-v2-gemma 300 150 400 180 390 140 350 40 110 50 290 130 monot5-large 350 100 400 100 400 130 80 20 110 60 280 60 MiniLM-L12-v2 340 160 400 120 400 80 20 20 50 50 280 50 mxbai-rerank-base-v1 290 140 90 30 400 70 0* 0* 110 50 290 120 Elastic Rerank 350 140 400 160 400 130 180 30 220 50 400 170 Cohere v3 300 100 400 130 400 130 30 20 270 50 290 70 Table 1: Max-gain and 90%-gain depths for different models and datasets. The \"0* - 0*\" entry for `mxbai-rerank-base-v1` on `Quora` indicates that the model does not provide any gain over BM25. If we group by the re-ranker model type and average, it gives us Table 2. We have omitted the ( Quora , mxbai-rerank-base-v1 ) pair as it corresponds to a bad-fit case. Model Average of maximum gain depth Average of 90%-to-max gain depth bge-reranker-v2-gemma 306.7 115 monot5-large 270 78.3 MiniLM-L12-v2 248.3 80 mxbai-rerank-base-v1 236 82 Elastic Rerank 325 113.3 Cohere v3 281.7 83.3 Table 2: Average values for the depth of maximum gain and depth for 90% of maximum gain per model. We observe that: More effective models such as Elastic Rerank and bge-reranker-v2-gemma reach a peak performance at larger depths, taking advantage of more of the available positives, while less effective models \"saturate\" faster. Obtaining 90% of the maximum gain is feasible at a much smaller depth in all scenarios: on average we have to re-rank 3× fewer pairs. A re-ranking depth of around 100 would be a reasonable choice for all the scenarios considered. Alternatively, if we group by dataset and average we get Table 3. Model Average of maximum gain depth Average of 90%-to-max gain depth DBPedia 321.7 131.7 HotpotQA 348.3 120 FiQA 398.3 113.3 Quora 132 26 TREC-COVID 145 51.7 Climate-FEVER 305 100 Table 3: Average values per dataset. There are two main groups: One group where the maximum gain depth is on average larger than 300. In this category belong DBpedia , HotpotQA , FiQA and Climate-FEVER . Another group where the maximum gain depth is significantly smaller - between 100 and 150 - containing Quora and TREC-COVID . 
We suggest that this behavior can be attributed to the performance of the first stage retrieval, in this case BM25. To support this claim, we plot the nDCG graphs of the \"oracle\" below. As we know the nDCG metric is affected by a) the recall of relevant documents and b) their position in the result list. Since the \"oracle\" has perfect information regarding the relevance of the retrieved documents, its nDCG score can be viewed as a proxy for the recall of the first-stage retriever. Figure 10 : nDCG@10 curves for the \"oracle\" across different datasets In this figure we see that for Quora and TREC-COVID the nDCG score rises quite fast to the maximum (i.e. 1.0) while in the rest of the datasets the convergence is much slower. In other words when the retriever does a good job surfacing all relevant items at shallower depths then there is no benefit in using a large re-ranking depth. \"Latency-aware\" analysis In this section we show how to perform simultaneous model and depth selection under latency constraints. To collect our statistics we use a VM with 2 NVIDIA T4 GPUs. For each dataset we measure the total re-ranking time and divide it by the number of queries in order to arrive into a single quantity that represents the time it takes to re-score 10 (query, document) pairs We assume the cost is linearly proportional to depth, that is it takes s seconds to re-rank 10 documents, 2×s to re-rank 20 documents and so on. The table below shows examples from HotpotQA and Climate-FEVER with each entry the number of seconds required to re-score 10 (query, document) pairs. Model MiniLM-L12-v2 mxbai-rerank-base-v1 Elastic Rerank monot5-large bge-reranker-v2-gemma HotpotQA 0.02417 0.07949 0.0869 0.21315 0.25214 Climate-FEVER 0.06890 0.23571 0.23307 0.63652 0.42287 Table 4: Average time to re-score 10 (query, doc) pairs on HotpotQA & Climate-FEVER Some notes: mxbai-rerank-base-v1 and Elastic Rerank have very similar running times because they use the same \"backbone\" model, DeBERTa In most datasets monot5-large and bge-reranker-v2-gemma have similar run times even though monot5-large only has 1 / 3 the parameter count. There are two possible contributing factors: For bge-reranker-v2-gemma we used bfloat16 while we kept float precision for monot5-large , and The Gemma architecture is able to better utilize the GPUs. T-shirt sizing The run times for different datasets can vary a lot due to the fact that queries and documents follow different length distributions. In order to establish a common framework we use a \"t-shirt\" approach as follows: We define the \"Small\" size as the time it takes the most efficient model (here MiniLM-L-12-v2 ) to reach 90% of its maximal gain, similar to our proposal in the previous section, We set other sizes in a relative manner, e.g. \"Medium\" and \"Large\" being 3x and 6x times the \"Small\" latency, respectively. The model and depth selection procedure is best understood graphically. We create graphs as follows: On the X-axis we plot the latency and on the Y-axis nDCG@10 The data points correspond to increments of 10 in the re-ranking depth, so more efficient models have a higher density of points in latency The vertical lines show the latency thresholds associated with the different \"t-shirt\" sizes For each model we print the maximum \"permitted\" re-ranking depth. This is the largest depth whose latency is smaller than the threshold For each \"t-shirt size\" we simply pick the model and depth which maximizes the nDCG@10. 
This is the model whose graph has the highest intercept with the corresponding threshold line. The optimal depth can be determined by interpolation. Figure 11 : nDCG@10 as a function of latency for Climate-FEVER Figure 12 : nDCG@10 as a function of latency for DBPedia Figure 13 : nDCG@10 as a function of latency for FiQA Figure 14 : nDCG@10 as a function of latency for HotpotQA Some observations: There are instances where the larger models are not eligible under the \"Small\" threshold like in the case of bge-reranker-v2-gemma and monot5-large on Climate-FEVER . MiniLM-L-12-v2 provides a great example of how a smaller model can take advantage of its efficiency to \"fill the gap\" in terms of accuracy, especially for a low latency constraint. For example, on FiQA , under the \"Small\" threshold, it achieves a better score compared to bge-reranker-v2-gemma and mxbai-rerank-base-v1 even though both models are more effective eventually. This happens because MiniLM-L-12-v2 can process many more documents (80 vs 10,20 respectively) for the same cost. It’s common for less effective models to saturate faster which makes it feasible for \"stronger\" models to surpass them even when employing a small re-ranking depth. For example, on Climate-FEVER under the \"Medium\" budget the bge-reranker-v2-gemma model can reach a maximum depth of 20, which is enough for it to place second ahead of MiniLM-L-12-v2 and mxbai-rerank-base-v1 . The Elastic Rerank model provides the optimal tradeoff between efficiency and effectiveness when considering latency values larger than a minimum threshold. The table below presents a) the maximum permitted depth and b) the relative increase in the nDCG score (compared to BM25) for the three latency constraints applied to 5 datasets for the Elastic Rerank model. T-shirt size Small Medium Large Dataset Depth nDCG increase (%) Depth nDCG increase (%) Depth nDCG increase (%) DBPedia 70 37.6 210 42.43 400 45.7 Climate-FEVER 10 31.82 40 66.72 80 77.25 FiQA 20 44.42 70 73.13 140 80.49 HotpotQA 30 21.31 100 28.28 200 31.41 Natural Questions 30 70.03 80 88.25 180 95.34 Average 32 41.04 100 59.76 200 66.04 Table 5: The maximum permitted depth & associated nDCG relative increase for the Elastic Rerank model in different scenarios We can see that a tighter budget (\"Small\" size scenario) allows only for the re-ranking of a few tens of documents, but that is enough to give a significant uplift (>40%) on the nDCG score. Conclusions In this last section we summarize the main findings and provide some guidance on how to select the optimal re-ranking depth for a given retrieval task. Selecting a threshold Selecting a proper re-ranking depth can have a large effect on the performance of the end-to-end system. Here, we considered some of the key dimensions that can guide this process. We were interested in approaches where a fixed threshold is applied across all queries, i.e. there is no variable-length candidate generation on a per query basis as for example in this work . For the re-rankers we tested we found that the majority of the gain is obtained with shallow re-ranking. In particular, on average we could achieve 90% of maximum possible nDCG@10 gain re-ranking only 1/3 the number of results. For our benchmark this translated to an average re-ranking around the top 100 documents when using BM25 as a retriever. However, there is some nuance: the better the first stage retriever the fewer the candidates you need to re-rank, conversely better re-rankers benefit more from re-ranking deeper. 
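The 90% rule of thumb used throughout this analysis is easy to apply to your own evaluation runs. A minimal sketch, assuming you have already measured nDCG@10 at a set of depths for your corpus:

```python
# Smallest re-ranking depth attaining 90% of the maximum nDCG@10 gain over BM25.
# ndcg_by_depth maps depth -> nDCG@10; bm25_ndcg is the first-stage score.
def depth_for_90pct_gain(ndcg_by_depth, bm25_ndcg, fraction=0.9):
    max_gain = max(score - bm25_ndcg for score in ndcg_by_depth.values())
    if max_gain <= 0:
        return 0  # re-ranking never improves on BM25 (the bad-fit pattern)
    for depth in sorted(ndcg_by_depth):
        if ndcg_by_depth[depth] - bm25_ndcg >= fraction * max_gain:
            return depth
    return max(ndcg_by_depth)
```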
There are also failure modes: we see effectiveness both increase to a maximum then decrease and also decrease with any re-ranking for certain models and retrieval tasks. In this context, we found more effective models are significantly less likely to ‘misbehave’ after a certain depth. There is other work that reports similar behavior. Computational budget and non-functional requirements We explored the impact of computational budget on re-ranking depth selection. In particular, we defined a procedure to choose the best re-ranking model and depth subject to a cost constraint. In this context, we found that the new Elastic Rerank model provided excellent effectiveness across a range of budgets for our benchmark. Furthermore, based on these experiments we’d suggest re-ranking the top 30 results from BM25 with the Elastic Rerank model when cost is at a premium. With this choice we were able to achieve around a 40% uplift in nDCG@10 on the QA portion of our benchmark. We also have some qualitative observations: Re-ranking deeper with a more efficient model is often the most cost effective strategy. We found `MiniLM-L12-v2 was consistently a strong contender on a budget, More efficient models usually saturate faster which means that more effective models can quickly \"pick-up\". For example, for DBpedia and HotpotQA Elastic Rerank at depth 50 is better or on par with the performance of MiniLM-L12-v2 at depth 400. Relevance dataset Ideally, model and depth selection is based on relevance judgements for your own corpus. The existence of an evaluation dataset allows you to plot the evolution of retrieval metrics, such as nDCG or recall, allowing you to make an informed decision regarding the optimal threshold under desired computational cost constraints. These datasets are usually constructed as manual annotations from domain experts or through proxy metrics based on past observations, such as Click-through Rate (CTR) on historical search results. In our previous blog we also showed how LLMs can be used to produce automated relevance judgments lists that are highly correlated with human annotations for natural language questions. In the absence of an evaluation dataset, whatever your budget, we’d recommend starting with smaller re- ranking depths as for all the model and task combinations we evaluated this achieved the majority of the gain and also avoided some of the pathologies where quality begins to degrade. In this case you can also use the general guidelines we derived from our benchmark since it covers a broad range of retrieval tasks. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. 
TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Summary The re-rankers Oracle Main patterns \"Pareto\" curve Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Exploring depth in a 'retrieve-and-rerank' pipeline - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-3", + "meta_description": "Learn how to select an optimal reranking depth for your model and dataset. We'll recommend reasonable defaults for the different models we tested. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blogs Developer insights and practical how-to articles from our experts to inspire and empower your search experience Articles Series Advanced RAG techniques In this series, we'll discuss and implement techniques that may increase RAG performance. Elasticsearch geospatial search This series covers how to use the new geospatial search features in ES|QL, including how to ingest geospatial data and how to use it in ES|QL queries. Elasticsearch in JavaScript the proper way Learn how to create a production-ready Elasticsearch backend in JavaScript, follow best practices, and run the Elasticsearch Node.js client in Serverless environments. Evaluating search relevance Blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes. GenAI for customer support This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! How to ingest data from AWS S3 into Elastic Cloud Learn about different ways you can ingest data from AWS S3 into Elastic Cloud. Improving information retrieval in the Elastic Stack This series explores steps to improve search relevance, benchmarking passage retrieval, ELSER, and hybrid retrieval. Indexing OneLake data into Elasticsearch Learn how to connect to OneLake and index documents into Elasticsearch. Then, take the configuration one step further by developing your own OneLake connector. 
Integration tests using Elasticsearch This series demonstrates improvements for integration tests using Elasticsearch and advanced techniques to further reduce execution time in Elasticsearch integration tests. Introducing LangChain4j: Building RAG apps in plain Java Introducing LangChain4j (LangChain for Java). Discover how to use it to build your RAG application in plain Java. Jira connector tutorials Learn how to integrate Elasticsearch with Jira using Elastic’s Jira native connector and explore optimization techniques. Semantic reranking & the Elastic Rerank model Introducing the concept of semantic reranking and Elastic Rerank, Elastic's new semantic re-ranker model. The ColPali model series Introducing the ColPali model, its implementation in Elasticsearch, and how to scale late interaction models for large-scale vector search. The Spotify Wrapped series Here's how to create your own Spotify Wrapped in Kibana and dive deep into your data. Using the Elasticsearch Go client for keyword search, vector search & hybrid search This series explains how to use the Elasticsearch Go client for traditional keyword search, vector search and hybrid search. Vector search introduction and implementation This series dives into the intricacies of vector search, how it is implemented in Elasticsearch, and how to run hybrid search queries in Elasticsearch. Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Blogs - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog?tab=series", + "meta_description": "Blog articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using CrewAI with Elasticsearch Learn how to create an Elasticsearch agent with CrewAI for your agent team and perform market research. Integrations How To JR By: Jeffrey Rengifo On April 8, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. CrewAI is a framework for orchestrating agents that uses role-playing for them to work together on complex tasks. If you want to read more about agents and how they work, I recommend you read this article. Image Source: https://github.com/crewAIInc/crewAI CrewAI claims to be faster and simpler than similar frameworks like LangGraph since it does not need as much boilerplate code or additional code to orchestrate agents, like Autogen. Additionally, Langchain tools are compatible with CrewAI, opening many possibilities. 
CrewAI has a variety of use cases, including research agents, stock market analysis, lead catchers, contract analysis, website generation, travel recommendations, etc. In this article, you´ll create an agent that uses Elasticsearch as a data search tool to collaborate with other agents and conduct market research on our Elasticsearch products. Based on a concept like summer clothes , an expert agent will search in Elasticsearch for the most semantically similar products, while a researcher agent will search online for websites and products. Finally, a writer agent will combine everything into a market analysis report. You can find a Notebook with the complete example here . To get the crew agent functioning, complete the following steps: Steps Install and import packages Prepare data Create Elasticsearch CrewAI tool Configure Agents Configure tasks Install and import packages We import SerperDevTool to search on the internet for websites related to our queries using the Serper API, and WebsiteSearchTool to do a RAG search within the found content. Serper provides 2,500 free queries you can claim here. Prepare data Elasticsearch client Create inference endpoint To enable semantic search capabilities, you need to create an inference endpoint using ELSER: Create mappings Now, we are going to apply the ELSER model into a single semantic_text field to enable the agent to run hybrid queries. Index data We are going to store some data about clothes so we can compare our source with the information the researcher agent can find on the internet. Create Elasticsearch CrewAI tool The CrewAI’s tool decorator simplifies turning regular Python functions into tools that agents can use. Here's how we create an Elasticsearch search tool: Import other needed tools and credentials Now, we instantiate the tools we prepared at the beginning to search on the internet and then do RAG within the found content. You also need an OpenAI API Key for the LLM communication. Configure Agents Now, you need to define the agents: Retriever : able to search in Elasticsearch using the tool created before. Researcher : uses search_tool to search on the internet. Writer : summarizes the info from the other two agents into a Markdown blog file. Configure tasks Now that you have defined the agents and tools, you need to create tasks for each agent. You will specify the different tasks to include content sources so the writer agent can quote them to make sure both the retriever and researcher agent are contributing with information. Now, you only need to instance the crew with all agents and tasks and run them: We can see the result in the new_post.md file: “**Short Report on Fashion Trends and Product Alignment** In this report, we will explore how the current fashion trends for summer 2025 align with the offerings in our store, as evidenced by product listings from [Elasticsearch]. The analysis focuses on five prominent trends and identifies specific products that reflect these aesthetics. **1. Romantic Florals with Kitsch Twist** The resurgence of floral patterns, particularly those that are intricate rather than large, embodies a whimsical approach to summer fashion. While our current inventory lacks offerings specifically featuring floral designs, there is an opportunity to curate products that align with this trend, potentially expanding into tops or dresses adorned with delicate floral patterns. [Source: Teen Vogue] **2. 
High Waisted and Baggy Silhouettes** High-waisted styles are a key trend for summer 2025, emphasizing comfort without sacrificing style. Among our offerings, the following products fit this criterion: - **Baggy Fit Cargo Shorts** ($20.99): These cargo shorts present a relaxed, generous silhouette, complementing the cultural shift towards practical fashion that allows ease of movement. - **Twill Cargo Shorts** ($20.99): These fitted options also embrace the high-waisted trend, providing versatility for various outfits. **3. Bold Colors: Turquoise and Earthy Tones** This summer promises a palette of vibrant turquoise alongside earthy tones. While our current collection does not showcase products that specifically reflect these colors, introducing pieces such as tops, dresses, or accessories in these hues could strategically cater to this emerging aesthetic. [Source: Heuritech] **4. Textured Fabrics** As textured fabrics gain popularity, we recognize an opportunity in our offerings: - **Oversized Lyocell-blend Dress** ($38.99): This dress showcases unique fabric quality with gathered seams and balloon sleeves, making it a textural delight that speaks to the trend of tactile experiences in fashion. - **Twist-Detail Crop Top** ($34.99): Featuring gathered side seams and a twist detail, it embraces the layered, visually engaging designs consumers are seeking. **5. Quiet Luxury** Quiet luxury resonates with those prioritizing quality and sustainability over fast fashion. Our offerings in this category include: - **Relaxed Fit Linen Resort Shirt** ($17.99): This piece’s breathable linen fabric and classic design underline a commitment to sustainable, timeless pieces that exemplify understated elegance. In conclusion, our current product listings from [Elasticsearch] demonstrate alignment with several key summer fashion trends for 2025. There are unique opportunities to further harness these trends by expanding our collection to include playful floral designs and vibrant colors. Additionally, leveraging the existing offerings that emphasize comfort and quality can enhance our customer appeal in the face of evolving consumer trends. We are well positioned to make strategic enhancements to our inventory, ensuring we stay ahead in the fast-evolving fashion landscape.” Conclusion CrewAI simplifies the process of instantiating an agent workflow with role-playing and supports Langchain tools, including custom tools, making their creation easier with abstractions like the tool decorator. This agent crew demonstrates the ability to execute complex tasks that combine local data sources and internet searches. If you want to continue improving this workflow, you could try creating a new agent to write the writer_agent results into Elasticsearch! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Install and import packages Prepare data Elasticsearch client Create inference endpoint Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using CrewAI with Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/using-crewai-with-elasticsearch", + "meta_description": "Learn how to create an Elasticsearch agent with CrewAI for your agent team and perform market research." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Indexing OneLake data into Elasticsearch - Part II Second part of a two-part article to index and search OneLake data into Elastic using a Custom connector. Integrations Ingestion How To GL JR By: Gustavo Llermaly and Jeffrey Rengifo On January 24, 2025 Part of Series Indexing OneLake data into Elasticsearch Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we'll put to use what we learned in part 1 to create a OneLake custom Elasticsearch connector. We have already uploaded some OneLake documents and indexed them into Elasticsearch for search. However, this only works with a one-time upload. If we want to have synchronized data, then we need to develop a more complex system. Luckily, Elastic has a connectors framework available to develop custom connectors to fit our needs: We'll now make a OneLake connector based on this article: How to create custom connectors for Elasticsearch Steps Connector bootstrapping Implementing BaseDataSource class Authentication Running the connector Configuring schedule Connector bootstrapping For context, there are two types of Elastic connectors: Elastic managed connector : Fully managed and run in Elastic Cloud Self managed connector : Self-managed. Must be hosted in your infrastructure Custom connectors fall into the “Connector Client” category, so we need to download and deploy the connectors framework. Let's begin by cloning the connectors repository: Now add the dependencies you will use at the end of the requirements/framework.txt file.
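The scraped text omits the actual package list; purely as an illustration (assuming the connector reaches OneLake through the Azure Data Lake Storage Gen2 client libraries), the lines appended to requirements/framework.txt might look like this:

```
azure-identity
azure-storage-file-datalake
```

The exact packages and version pins depend on how the connector is implemented, so treat these names as placeholders rather than a definitive list.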
With this, the repository is done and we can begin to code. Implementing BaseDataSource class You can find the full working code in this repository. We will go through the core pieces in the onelake.py file. After the imports and class declaration, we must define our __init__ method which will capture the configuration parameters. Then, you can configure the form the UI will show to fill those parameters using the get_default_configuration method, which returns a configuration dictionary. Then we configure the methods to download and extract the content from the OneLake documents. To make our connector visible to the framework, we need to declare it in the connectors/config.py file. For this, we add the following code to sources: Authentication Before testing the connector, we need to get the client_id , tenant_id , and client_secret that we'll use to access the Workspace from the connector. We will use service principals as the authentication method. An Azure service principal is an identity created for use with applications, hosted services, and automated tools to access Azure resources. The steps are: Creating an application, and gathering client_id , tenant_id , and client_secret Enabling the service principal in your workspace Adding the service principal to your workspace You can follow this tutorial step by step. Ready? Now it's time to test the connector! Running the connector With the connector ready, we can now connect to our Elasticsearch instance. Go to: Search > Content > Connectors > New connector and choose Customized Connector Choose a name to create, and then select Create and attach an index to create a new index with the same name as the connector. You can now run it using Docker or run it from source. In this example, we'll use \"Run from source\". Click on Generate Configuration and paste the content from the box into the config.yml file at the project's root. On the field service_type you must match the connector's name in connectors/config.py . In this case, replace changeme with onelake . Now you can run the connector with these commands: If the connector was correctly initialized, you should see a message like this in the console: Note: If you get a compatibility error, check your connectors/VERSION file and compare it with your Elasticsearch cluster version: Version compatibility with Elasticsearch We recommend keeping the connector version and Elasticsearch version in sync. For this article we are using Elasticsearch and connector version 8.15. If everything went fine, our local connector will communicate with our Elasticsearch cluster and we'll be able to configure it using our OneLake credentials: We'll now index the documents from OneLake. To do this, run a Full Content Sync by clicking on Sync > Full Content : Once the sync is over, you should see this in the console: At the Enterprise Search UI, you can click Documents to see the indexed documents: Configure schedule You can schedule recurring content syncs using the UI based on your needs to keep your index updated and in sync with OneLake. To configure scheduled syncs, go to Search > Content > Connectors and select your connector. Then click on scheduling : As an alternative, you can use the Update connector scheduling API , which allows CRON expressions. Conclusion In this second part, we took our configuration one step further by using the Elastic connectors framework and developing our own connector to easily communicate with our Elastic Cloud instance. A rough sketch of the shape of such a data source class follows.
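For illustration only, and not the exact code from the repository linked above, a minimal data source class capturing configuration parameters and exposing them to the UI might be shaped like this (the class name, parameter names, and defaults are assumptions):

```python
from connectors.source import BaseDataSource  # provided by the connectors framework


class OneLakeDataSource(BaseDataSource):
    '''Hypothetical OneLake connector: captures configuration and exposes it to the UI.'''

    name = 'OneLake'
    service_type = 'onelake'

    def __init__(self, configuration):
        super().__init__(configuration=configuration)
        # Values come from the configuration form filled in Kibana
        self.tenant_id = self.configuration['tenant_id']
        self.client_id = self.configuration['client_id']
        self.client_secret = self.configuration['client_secret']
        self.workspace_name = self.configuration['workspace_name']

    @classmethod
    def get_default_configuration(cls):
        # One entry per field shown in the connector configuration UI
        return {
            'tenant_id': {'label': 'Azure tenant ID', 'order': 1, 'type': 'str'},
            'client_id': {'label': 'Azure client ID', 'order': 2, 'type': 'str'},
            'client_secret': {'label': 'Azure client secret', 'order': 3, 'type': 'str', 'sensitive': True},
            'workspace_name': {'label': 'OneLake workspace', 'order': 4, 'type': 'str'},
        }
```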
Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Connector bootstrapping Implementing BaseDataSource class Authentication Running the connector Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Indexing OneLake data into Elasticsearch - Part II - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ingesting-data-with-onelake-part-ii", + "meta_description": "Learn to index & search OneLake data into Elastic with a custom connector. We’ll show you how to create a OneLake Elasticsearch connector to sync data." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch new semantic_text mapping: Simplifying semantic search Learn how to use the new semantic_text field type and semantic query for simplifying semantic search in Elasticsearch. Vector Database CD MP By: Carlos Delgado and Mike Pellegrini On June 24, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. semantic_text - You know, for semantic search! Do you want to start using semantic search for your data, but focus on your model and results instead of on the technical details? We’ve introduced the semantic_text field type that will take care of the details and infrastructure that you need. Semantic search is a sophisticated technique designed to enhance the relevance of search results by utilizing machine learning models . 
Unlike traditional keyword-based search, semantic search focuses on understanding the meaning of words and the context in which they are used. This is achieved through the application of machine learning models that provide a deeper semantic understanding of the text. These models generate vector embeddings , which are numeric representations capturing the text meaning. These embeddings are stored alongside your document data, enabling vector search techniques that take into account the word meaning and context instead of pure lexical matches. How to perform semantic search To perform semantic search, you need to go through the following steps: Choose an inference model to create embeddings, both for indexing documents and performing queries. Create your index mapping to store the inference results, so they can be efficiently searched afterwards. Set up indexing so inference results are calculated for new documents added to your index. Automatically handle long text documents , so search can be accurate and cover the entire document. Query your data to retrieve results. Configuring semantic search from the ground up can be complex. It requires setting up mappings, ingestion pipelines, and queries tailored to your chosen inference model. Each step offers opportunities for fine-tuning and optimization, but also demands careful configuration to ensure all components work together seamlessly. While this offers a great degree of control, it makes using semantic search a detailed and deliberate process, requiring you to configure separate pieces that are all related to each other and to the inference model. semantic_text simplifies this process by focusing on what matters: the inference model. Once you have selected the inference model, semantic_text will make it easy to start using semantic search by providing sensible defaults, so you can focus on your search and not on how to index, generate, or query your embeddings. Let's take a look at each of these steps, and how semantic_text simplifies this setup. Choosing an inference model The inference model will generate embeddings for your documents and queries. Different models have different tradeoffs in terms of: Accuracy and relevance of the results Scalability and performance Language and multilingual support Cost Elasticsearch supports both internal and external inference services: Internal services are deployed in the Elasticsearch cluster. You can use already included models like ELSER and E5 , or import external models into the cluster using eland . External services are deployed by model providers. Elasticsearch supports the following: Cohere Hugging Face Mistral OpenAI Azure AI Studio Azure OpenAI Google AI Studio Once you have chosen the inference model, create an inference endpoint for it. The inference endpoint identifier will be the only configuration detail that you will need to set up semantic_text . Creating your index mapping Elasticsearch will need to index the embeddings generated by the model so they can be efficiently queried later. Before semantic_text, you needed to understand the two main field types used for storing embeddings information: sparse_vector : It indexes sparse vector embeddings, like the ones generated by ELSER. Each embedding consists of pairs of tokens and weights. There is a small number of tokens generated per embedding. dense_vector : It indexes vectors of numbers, which contain the embedding information. A model produces vectors of a fixed size, called the vector dimension.
The field type to use is conditioned by the model you have chosen. If using dense vectors, you will need to configure the field to include the dimension count, the similarity function used to calculate vectors proximity, and storage customizations like quantization or the specific data type used for each element. Now, if you're using semantic_text, you define a semantic_text field mapping by just specifying the inference endpoint identifier for your model: That's it. No need for you to define other mapping options, or to understand which field type you need to use. Setting up indexing Once your index is ready to store the embeddings, it's time to generate them. Before semantic_text , to generate embeddings automatically on document ingestion you needed to set up an ingestion pipeline . Ingestion pipelines are used to automatically enrich or transform documents when ingested into an index, or when explicitly specified as part of the ingestion process. You need to use the inference processor to generate embeddings for your fields. The processor needs to be configured using: The text fields from which to generate the embeddings The output fields where the generated embeddings will be added Specific inference configuration for text embeddings or sparse embeddings, depending on the model type With semantic_text , you simply add documents to your index. semantic_text fields will automatically calculate the embeddings using the specified inference endpoint. This means there's no need to create an inference pipeline to generate the embeddings. Using bulk, index, or update APIs will do that for you automatically: Inference requests in semantic_text fields are also batched. If you have 10 documents in a bulk API request, and each document contains 2 semantic_text fields, then that request will perform a single inference request with 20 texts to your inference service in one go, instead of making 10 separate inference requests of 2 texts each. Automatically handling long text passages Part of the challenge of selecting a model is the number of tokens that the model can generate embeddings for. Models have a limited number of tokens they can process. This is referred to as the model’s context window. If the text you need to work with is longer than the model’s context window, you may truncate the text and use just part of it to generate embeddings. This is not ideal as you'll lose information; the resulting embeddings will not capture the full context of the input text. Even if you have a long context window, having a long text means a lot of content will be reduced to a single embedding, making it an inaccurate representation. Also, returning a long text will be difficult for the users to understand, as they will have to scan the text to check it's what they are looking for. Using smaller snippets would be preferable instead. Another option is to use chunking to divide long texts into smaller fragments. These smaller chunks are added to each document to provide a better representation of the complete text. You can then use a nested query to search over all the individual fragments and retrieve the documents that contain the best-scoring chunks. Before semantic_text , chunking was not done out of the box - the inference processor did not support chunking. If you needed to use chunking, you needed to do it before ingesting your documents or use the script processor to perform the chunking in Elasticsearch. Using semantic_text means that chunking will be done on your behalf when indexing. 
Long documents will be split into 250-word sections with a 100-word overlap so that each section shares 100 words with the previous section. This overlap ensures continuity and prevents vital contextual information in the input text from being lost by a hard break. If the model and inference service support batching the chunked inputs are automatically batched together into as few requests as possible, each optimally sized for the Inference Service. The resulting chunks will be stored in a nested object structure so you can check the text contained in each chunk. Querying your data Now that the documents and their embeddings are indexed in Elasticsearch, it's time to do some queries! Before semantic_text , you needed to use a different query depending on the type of embeddings the model generates (dense or sparse). A sparse vector query is needed to query sparse_vector field types, and either a knn search or a knn query can be used to search dense_vector field types. The query process can be further customized for performance and relevance. For example, sparse vector queries can define token pruning to avoid considering irrelevant tokens. Knn queries can specify the number of candidates to consider and the top k results to be returned from each shard. You don't need to deal with those details when using semantic_text . You use a single query type to search your documents: Just include the field and the query text. There’s no need to decide between sparse vector and knn queries, semantic text does this for you. Compare this with using a specific knn search with all its configuration parameters: Under the hood: How semantic_text works To understand how semantic_text works, you can create a semantic_text index and check what happens when you ingest a document. When the first document is ingested, the inference endpoint calculates the embeddings. When indexed, you will notice changes in the index mapping: Now there is additional information about the model settings. Text embedding models will also include information like the number of dimensions or the similarity function for the model. You can check the document already includes the embedding results: The field does not just contain the input text, but also a structure storing the original text, the model settings, and information for each chunk the input text has been divided into. This structure consists of an object with two elements: text : Contains the original input text inference : Inference information added by the inference endpoint, that consists of: inference_id of the inference endpoint model_settings that contain model properties chunks : Nested object that contains an element for each chunk that has been created from the input text. Each chunk contains: The text for the chunk The calculated embeddings for the chunk text Customizing semantic_text semantic_text simplifies semantic search by making default decisions about indexing and querying your data: uses sparse_vector or dense_vector field types depending on the inference model type Automatically defines the number of dimensions and similarity according to the inference results Uses int8_hnsw index type for dense vector field types to leverage scalar quantization . Uses query defaults. No token pruning is applied for sparse_vector queries, nor custom k and num_candidates are set for knn queries. Those are sensible defaults and allow you to quickly and easily start working with semantic search. 
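To make this concrete, here is a minimal sketch using the Elasticsearch Python client; the index name, field name, and inference endpoint identifier are placeholders you would replace with your own, and the defaults described above apply:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')  # placeholder connection details

# Mapping: a semantic_text field only needs the inference endpoint identifier
es.indices.create(
    index='my-semantic-index',
    mappings={
        'properties': {
            'content': {
                'type': 'semantic_text',
                'inference_id': 'my-elser-endpoint',  # placeholder endpoint ID
            }
        }
    },
)

# Indexing: embeddings and chunking are handled automatically on ingest
es.index(index='my-semantic-index', document={'content': 'Comfortable linen shirt for warm days'})

# Querying: a single semantic query, regardless of sparse or dense embeddings
resp = es.search(
    index='my-semantic-index',
    query={'semantic': {'field': 'content', 'query': 'light summer clothing'}},
)
print(resp['hits']['hits'])
```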
Over time, you may want to customize your queries and data types to optimize search relevance, index and query performance, and index storage. Query customization There are no customization options - yet - for semantic queries. If you want to customize queries against semantic_text fields, you can perform advanced semantic_text search using explicit knn and sparse vector queries. We're planning to add retrievers support for semantic_text , and adding configuration options to the semantic_text field so they won't be needed at query time. Stay tuned! Data type customization If you need deeper customization for the data indexing, you can use the sparse_vector or dense_vector field types. These field types give you full control over how embeddings are generated, indexed, and queried. You need to create an ingest pipeline with an inference processor to generate the embeddings. This tutorial walks you through the process. What's next with semantic_text ? We're just getting started with semantic_text ! There are quite a few enhancements that we will keep working on, including: Better inference error handling Customize the chunking strategy Hiding embeddings in _source by default, to avoid cluttering the search responses Inner hits support, to retrieve the relevant chunks of information for a query Filtering and retrievers support Kibana support Try it out! semantic_text is available on Elasticsearch Serverless now! It will be available soon on Elasticsearch 8.15 version for Elastic Cloud and on Elasticsearch downloads . If you already have an Elasticsearch serverless cluster, you can see a complete example for testing semantic search using semantic_text in this tutorial , or try it with this notebook . We'd love to hear about your experience with semantic_text ! Let us know what you think in the forums , or open an issue in the GitHub repository . Let's make semantic search easier together! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to semantic_text - You know, for semantic search! How to perform semantic search Choosing an inference model Creating your index mapping Setting up indexing Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch new semantic_text mapping: Simplifying semantic search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text", + "meta_description": "Learn how to use the new semantic_text field type and semantic query for simplifying semantic search in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. Integrations Ingestion How To JR By: Jeffrey Rengifo On March 7, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. BigQuery is a Google platform that allows you to centralize data from their different sources and services into one repository. It also enables you to do data analysis and use GenAI and ML tools. Below are the ways to bring data into BigQuery: Indexing data from all of these sources into Elasticsearch allows you to centralize your data sources for a better observability experience. In this article, you'll learn how to index data from BigQuery into Elasticsearch using Python, enabling you to unify data from different systems for search and analysis. You can use the example from this article in this Google Colab notebook . Steps Prepare BigQuery Configure the BigQuery Python client Index data to Elasticsearch Search data Prepare BigQuery To use BigQuery, you need to access Google Cloud Console and create a project . Once done, you'll be redirected to this view: BigQuery allows you to transfer data from Google Drive and Google Cloud Storage, and to upload local files. To upload data to BigQuery you must first create a dataset . Create one and name it \"server-logs\" so we can upload some files. For this article, we'll upload a local dataset that includes different types of articles. Check BigQuery’s official documentation to learn how to upload local files . Dataset The file we will upload to BigQuery has data from a server log with HTTP responses and their descriptions in a ndjson format. The ndjson file includes these fields: ip_address , _timestamp , http_method , endpoint , status_code , response_time and status_code_description . BigQuery will extract data from this file. Then, we'll consolidate it with Python and index it to Elasticsearch. Create a file named logs.ndjson and populate it with the following: We upload this file to the dataset we've just created (shown as \"server_logs\") and use \"logs\" as table name (shown as \"table id\"). 
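For reference, the scraped page does not include the contents of logs.ndjson; as an illustration of the shape of the data (field names follow the article, values are invented), you could generate such a file with a few lines of Python:

```python
import json

# Illustrative records only; one JSON object per line (ndjson)
records = [
    {'ip_address': '10.0.0.12', '_timestamp': '2025-03-01T12:04:05Z', 'http_method': 'GET',
     'endpoint': '/api/products', 'status_code': 200, 'response_time': 123,
     'status_code_description': 'OK'},
    {'ip_address': '10.0.0.27', '_timestamp': '2025-03-01T12:04:09Z', 'http_method': 'POST',
     'endpoint': '/api/orders', 'status_code': 500, 'response_time': 512,
     'status_code_description': 'Internal Server Error'},
]

with open('logs.ndjson', 'w') as f:
    for record in records:
        print(json.dumps(record), file=f)
```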
Once you're done, your files should look like this: Configure the BigQuery Python client Below, we'll learn how to use the BigQuery Python client and Google Colab to build an app. 1. Dependencies First, we must install the following dependencies: The google-cloud-bigquery dependency has the necessary tools to consume the BigQuery data, elasticsearch allows it to connect to Elastic and index the data, and getpass lets us enter sensitive variables without exposing them in the code. Let's import all the necessary dependencies: We also need to declare other variables and initialize the Elasticsearch client for Python: 2. Authentication To get the necessary credentials to use BigQuery, we'll use auth. Run the command line below and choose the same account you used to create the Google Cloud project: Now, let's see the data in BigQuery: This should be the result you see: With this simple code, we've extracted the data from BigQuery. We've stored it in the logs_data variable and can now use it with Elasticsearch. Index data to Elasticsearch We'll begin by defining the data structure from the Kibana Devtools console : The match_only_text field is a variant of the text field type that saves disk space by not storing the metadata to calculate scores. We use it since logs are usually time-centric, i.e. the date is more important than the match quality in the text field. Queries that use a textfield are compatible with the ones that use a match_only_text field. We'll index the files using the Elasticsearch _bulk api : Search data We can now run queries using the data from the bigquery-logs index. For this example, we'll run a search using the error descriptions from the server in the ( status_code_description field). In addition, we'll sort them by date and get the IP addresses of the errors: This is the result: Conclusion Tools like BigQuery, which help to centralize information, are very useful for data management. In addition to search, using BigQuery with Elasticsearch allows you to leverage the power of ML and data analysis to detect or analyze issues in a simpler and faster way. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Prepare BigQuery Dataset Configure the BigQuery Python client 1. Dependencies Show more Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Ingesting data with BigQuery - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/big-query-data-ingestion", + "meta_description": "Learn how to index and search Google BigQuery data in Elasticsearch using Python." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. Integrations Ingestion How To AK By: Amit Khandelwal On February 19, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we will cover how to avoid critical performance mistakes, why the Elasticsearch default solution doesn’t cut it, and important implementation considerations. All modern-day websites have autocomplete features (search as you type) on their search bar to improve the user experience (no one wants to type entire search terms…). It’s imperative that the autocomplete be faster than the standard search, as the whole point of autocomplete is to start showing the results while the user is typing. If the latency is high, it will lead to a subpar user experience. Below is an autocomplete search example on the famous question-and-answer site, Quora. This is a good example of autocomplete: when searching for “elasticsearch auto”, the following posts begin to show in their search bar: Note that in the search results, there are questions relating to the auto-scaling, auto-tag and autocomplete features of Elasticsearch. Users can further type a few more characters to refine the search results. Various approaches for autocomplete in Elasticsearch / search as you type There are multiple ways to implement the autocomplete feature which broadly fall into four main categories: Search-as-you-type Query time Completion suggester Index time 1. Search as you type It is a data type intended to facilitate autocomplete queries without prior knowledge of custom analyzer setup. Elasticsearch internally stores the various tokens (edge n-gram, shingles) of the same text, and therefore can be used for both prefix and infix completion. It can be convenient if you are not familiar with the advanced features of Elasticsearch, which the other three approaches require. Not much configuration is required in Search as you type to make it work with simple use cases and code samples. More details are available in our documentation . 2. Query time Autocomplete can be achieved by changing match queries to prefix queries .
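As a minimal sketch of this query-time approach with the Python client (index and field names are placeholders), a match query can be swapped for a prefix query, or for the match_bool_prefix query covered later in this section:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')  # placeholder connection details

# Prefix query: matches every indexed token that starts with the typed characters
es.search(index='questions', query={'prefix': {'title': {'value': 'auto'}}})

# match_bool_prefix: full matches on earlier terms, prefix match on the last term only
es.search(index='questions', query={'match_bool_prefix': {'title': 'elasticsearch auto'}})
```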
While match queries work on token (indexed) to token (search query tokens) match, prefix queries (as their name suggests) match all the tokens starting with search tokens, hence the number of documents (results) matched is high. As explained, prefix query is not an exact token match, rather it’s based on character matches in the string which is very costly and fetches a lot of documents. Elasticsearch internally uses a B+ tree kind of data structure to store its tokens. It’s useful to understand the internals of the data structure used by inverted indices and how different types of queries impact the performance and results. Elasticsearch also introduced Match boolean prefix query in ES 7.2 version. This is a combination of Match and Prefix queries and has the best of both worlds. It’s especially useful when you have multiple search terms. For example, if you have foo bar baz , then instead of running a prefix search on all the search terms (which is costly and produces fewer results), this query would prefix search only on the last term and match previous terms in any order. Doing this improves the speed and relevance of the search results. 3. Completion suggester This is useful if you are providing suggestions for search terms like on e-commerce and hotel search websites. The search bar offers query suggestions, as opposed to the suggestions appearing in the actual search results, and after selecting one of the suggestions provided by the completion suggester, it provides the search results. For the completion suggester to work, suggestions must be indexed as any other field. You can also optionally add a weight field to rank the suggestions. This approach is ideal if you have an external source of autocomplete suggestions, like search analytics. Code samples Index definition Response Indexing suggestions Response Searching Response 4. Index time Sometimes the requirements are just prefix completion or infix completion in autocomplete. It’s not uncommon to see autocomplete implementation using custom-analyzers , which involves indexing the tokens in such a way that it matches the user’s search term. If we continue with our example, we are looking at documents that consist of “elasticsearch autocomplete”, “elasticsearch auto-tag”, “elasticsearch auto scaling” and “elasticsearch automatically”. The default analyzer won’t generate any partial tokens for “autocomplete”, “autoscaling” and “automatically”, and searching “auto” wouldn’t yield any results. To overcome the above issue, edge ngram or n-gram tokenizer are used to index tokens in Elasticsearch, as explained in the documentation , and search time analyzer to get the autocomplete results. The above approach uses Match queries, which are fast as they use a string comparison (which uses hashcode), and there are comparatively less exact tokens in the index. Performance consideration Almost all the above approaches work fine on smaller data sets with lighter search loads, but when you have a massive index getting a high number of auto suggest queries, then the SLA and performance of the above queries is essential. The following bullet points should assist you in choosing the approach best suited for your needs: Ngram or edge Ngram tokens increase index size significantly, providing the limits of min and max gram according to application and capacity. Planning would save significant trouble in production. 
Allowing empty or few-character prefix queries can bring up all the documents in an index and has the potential to bring down an entire cluster . It’s always a better idea to do a prefix query only on the term (on a few fields) and limit the minimum characters in prefix queries. This can now be solved by using the match boolean prefix query as explained above. The ES-provided “search as you type” data type tokenizes the input text in various formats. As a generic ES-provided solution, it can’t address every use case, so it’s always a better idea to check all the corner cases required for your business use case. In addition, as mentioned, it tokenizes fields in multiple formats, which can increase the Elasticsearch index store size. The completion suggester requires separately indexing the suggestions and doesn’t address the use case of fetching the search results. Index time approaches are fast as there is less overhead during query time, but they involve more grunt work, like re-indexing, capacity planning and increased disk cost. Query time is easy to implement, but search queries are costly. This is very important to understand, as most of the time users need to choose one of these approaches, and understanding this trade-off can help when troubleshooting performance issues. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Various approaches for autocomplete in Elasticsearch / search as you type 1. Search as you type 2. Query time 3. Completion suggester Code samples Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V.
All Rights Reserved.", + "title": "Elasticsearch autocomplete search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-autocomplete-search", + "meta_description": "Learn about Elasticsearch autocomplete search and how to handle it with search as you type, query time, completion suggester and index time." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. ML Research Python GC KS By: Gus Carlock and Kirti Sodhi On February 5, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this article, we’ll demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging a sample text dataset, streamlining the workflow within Elastic’s ecosystem. You can follow along to create a simple clustering pipeline with this Jupyter notebook . Clustering prologue The Machine Learning App in Kibana provides a comprehensive suite of advanced capabilities, including anomaly and outlier detection, as well as classification and regression models. It supports the integration of custom models from the scikit-learn library via the eland Python client. While Kibana offers robust machine learning capabilities, it currently does not support clustering analysis in both prebuilt and custom models. Clustering algorithms are crucial for enhancing search relevance by grouping similar queries and for security, where they help identify patterns in data to detect potential threats and anomalies. Elastic provides the flexibility to leverage custom scikit-learn models, such as k-means, for tasks like clustering—for example, grouping news articles by similarity. While these algorithms aren’t officially supported, you can use the model’s cluster centers as input for the ingest pipeline to integrate these capabilities seamlessly into your Elastic workflow. In the following sections, we’ll guide you through implementing this approach. Dataset overview for clustering workflow For this proof of concept, we utilized the 20 Newsgroups dataset , a popular benchmark for text classification and clustering tasks. This dataset consists of newsgroup posts organized into 20 distinct categories, covering topics such as sports, technology, religion, and science. It is widely available through the scikit-learn library. In our experiments, we focused on a subset of 5 categories: rec.sport.baseball rec.sport.hockey comp.sys.ibm.pc.hardware talk.religion.misc sci.med These categories were chosen to ensure a mix of technical, casual, and diverse topics for effective clustering analysis. Feature extraction and generating text embeddings The text documents were cleaned by removing stop words, punctuation, and irrelevant tokens using scikit-learn’s feature_extraction utility, ensuring that the text vectors captured meaningful patterns. These features were then used to generate text embeddings using OpenAI’s language model “text-embedding-ada-002”. The model, text-embedding-ada-002 stands among the most advanced models for generating dense vector representations of text, capturing the nuanced semantic meaning inherent in textual data. 
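As a rough sketch of the data preparation described above (the linked notebook has the authoritative version; the cleaning here is deliberately simplified), loading the five categories from scikit-learn might look like this:

```python
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

categories = ['rec.sport.baseball', 'rec.sport.hockey',
              'comp.sys.ibm.pc.hardware', 'talk.religion.misc', 'sci.med']

# Strip headers/footers/quotes so the vectors capture the post content itself
newsgroups = fetch_20newsgroups(subset='train', categories=categories,
                                remove=('headers', 'footers', 'quotes'))

def clean(text):
    # Simplified cleaning: lowercase, keep alphabetic tokens, drop stop words
    tokens = [t for t in text.lower().split() if t.isalpha() and t not in ENGLISH_STOP_WORDS]
    return ' '.join(tokens)

documents = [clean(doc) for doc in newsgroups.data]
```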
We utilized the Azure OpenAI endpoint to generate the embeddings for our analysis. Instructions to use this endpoint with Elasticsearch can be found at Elasticsearch open inference API adds Azure AI Studio support . The embeddings were normalized before training the k-means clustering model to standardize vector magnitudes. Normalization is a critical preprocessing step for k-means since it calculates clusters based on Euclidean distances. Standardized embeddings eliminate magnitude discrepancies, ensuring that clustering decisions rely purely on semantic proximity, thereby enhancing the accuracy of the clustering results. We trained the k-means model using k=5 to match the dataset’s categories and extracted the cluster centers. These centers served as inputs for Kibana’s ingest pipeline, facilitating real-time clustering of incoming documents. We’ll discuss this further in the next section. Dynamic clustering with ingest pipeline’s script processor After the model is trained in Scikit-learn, an ingest pipeline is used to assign cluster numbers to each record. This ingest pipeline takes three configurable parameters: clusterCenters – a nested list with one list for each cluster center vector. For this blog, they were generated with Scikit-learn. analysisField – the field which contains dense vectorized data. normalize – normalizes the analysisField vectors. Once the ingest pipeline is added to an index or datastream, all new ingested data will be assigned a closest cluster number. The image below illustrates the end-to-end workflow of importing clustering in Kibana. The full ingest pipeline script can be generated using Python, an example is in the “Add clustering ingest pipeline” section of the notebook . We’ll dive into the specifics of the ingest pipeline below. The cluster_centers are then loaded as a nested list of floats, with one list for each cluster center. In the first part of the Painless script, two functions are defined. The first is euclideanDistance , which returns a distance between two arrayLists as a float. The second, l2NormalizeArray , scales an arrayLists so that the sum of its squared elements is equal to one. Then the inference step of k-Means is performed. For every cluster center, the distance is taken between a new incoming document vector using the ingest pipeline context (ctx) and the analysisField parameter, which selects the field containing the OpenAI text-ada-002 vector. The closestCluster number is then assigned to the document based on the closest cluster center, that is, the document which has the shortest distance. Additionally, if the normalize parameter is set to true, the L2 norm of the incoming document vector is taken before doing the distance calculation. Then the closestCluster and minDistance value to that cluster are passed back to the document through the ingest pipeline context. There are a few configurable parameters, which are described above but included here for reference. The first is the clusterCenters , a nested array of floats, with one array for each cluster center. The second is the analysisField , the field which contains the text-ada-002 vectors. Lastly, normalize which will L2 normalize the document vector. Note that the normalize parameter should only be set to True if the vectors are also normalized before training the k-Means model. Finally, once the pipeline is configured, assign an ID and put it on the cluster. Clustering results We expect the clustering results to show each category forming a distinct cluster. 
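As a compressed sketch of the training and pipeline-registration steps described above (the pipeline ID, vector field name, and the Painless source variable are illustrative; the linked notebook contains the full script):

```python
from elasticsearch import Elasticsearch
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

# embeddings: one text-embedding-ada-002 vector per document, produced earlier
normalized = normalize(embeddings)  # L2-normalize before k-means, as discussed above
kmeans = KMeans(n_clusters=5, random_state=42).fit(normalized)
cluster_centers = kmeans.cluster_centers_.tolist()

es = Elasticsearch('http://localhost:9200')  # placeholder connection details
es.ingest.put_pipeline(
    id='kmeans-clustering',  # illustrative pipeline ID
    processors=[{
        'script': {
            'source': painless_script,  # the Painless script generated in the notebook
            'params': {
                'clusterCenters': cluster_centers,
                'analysisField': 'text_embedding',  # field holding the ada-002 vector
                'normalize': True,
            },
        }
    }],
)
```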
While baseball and hockey might overlap due to their shared sports context; technical, religious, and medical categories should form separate and clearly defined clusters. When OpenAI text-ada-002 vectors are viewed with the t-SNE dimensionality reduction algorithm, they show that there is clear separation between these clusters, and that the sports topics are close together: Actual newsgroup labels; 2D t-SNE trained on OpenAI text-ada-002 vectors The location of the points indicates clear separation between the groupings, which indicates that the vectorization is capturing the semantic meaning of each article. As a result, the zero-shot classification results are excellent. Even though no labels were provided in training data to the model, with only the number of clusters provided, on in-sample data a k-means model provides greater than 94% accuracy when assigning cluster numbers: Predicted cluster labels; 2D t-SNE trained on OpenAI text-ada-002 vectors Comparing the actual newsgroup labels to the in-sample predicted labels, there is very little difference between the actual newsgroup labels and those predicted by the clustering model. This is represented by the confusion matrix: Zero-shot Classification Confusion Matrix on OpenAI text-ada-002 vectors The diagonal on the confusion matrix represents the in-sample accuracy of each category, the model is predicting the correct label more than 94% of the time for each category. Detecting outliers in the clusters The k-means model can be viewed as an approximation of a Gaussian Mixture Model (GMM) without capturing covariance, where the quantiles of distances from the nearest cluster being an approximation of the distribution quantile. This means that a k-mean model can capture an approximation of the data distribution. With this approach, a large number of clusters can be chosen, in this case 100, and a new model trained. The higher number of clusters, the more flexible the fit of the distribution. So in this case, the goal is not to learn the internal groupings of the data, but rather capture the distribution of the data overall. The distance quantiles can be computed with a query. In this case, a model was trained with 100 clusters and the 75th percentile distances were chosen as the cutoff for outliers. Starting with the same graph above showing the t-SNE representation of the actual newsgroup labels: Actual newsgroup labels; 2D t-SNE trained on OpenAI text-ada-002 vectors When adding data in from newsgroups which were not in the training set, the 2D t-SNE representation shows a good fit for the data. Here, orange datapoints are not considered outliers, while those which are dark grey are labelled as outliers: Outlier results for k=100; 2D t-SNE trained on OpenAI text-ada-002 vectors Bringing it all together In this blog, we demonstrated how to integrate custom clustering models into the Elastic Stack. We developed a workflow that imports scikit-learn clustering models, such as k-means, into the Elastic Stack, enabling clustering analysis directly within Kibana. By using the 20 Newsgroups dataset, we demonstrated how to apply this workflow to group similar documents, while also discussing the use of advanced text embedding models such as OpenAI's “text-embedding-ada-002” to create semantic representations essential for efficient clustering. The results section showcased clear cluster separation, indicating that the “text-embedding-ada-002” model captures semantic meaning effectively. 
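The distance quantiles mentioned above can be computed with an aggregation; a sketch with the Python client might look like the following, assuming the pipeline stores the distance to the nearest center in a minDistance field (the index name and percentile choice are illustrative):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch('http://localhost:9200')  # placeholder connection details

# 75th percentile of the distance to the nearest of the 100 cluster centers
resp = es.search(
    index='newsgroups',
    size=0,
    aggs={'distance_cutoff': {'percentiles': {'field': 'minDistance', 'percents': [75]}}},
)
cutoff = resp['aggregations']['distance_cutoff']['values']['75.0']
# Documents whose minDistance exceeds this cutoff are flagged as outliers
```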
The k-means model achieved over 94% accuracy in zero-shot classification, with the confusion matrix showing minimal discrepancies between predicted and actual labels, confirming its strong performance. With this workflow, Elastic users can apply clustering techniques to their own datasets, whether for grouping similar queries in search or detecting unusual patterns for security applications. The solution presented here provides an easy way to integrate advanced clustering functionality into Elastic. We hope this inspires you to explore these capabilities and apply them to your own use cases. What’s next? The clustering results above show that the Painless implementation accurately clusters similar topics, achieving 94% accuracy in performance. Moving forward, our goal is to test the pipeline on a less structured dataset with significantly more noise and a larger number of clusters. This will help evaluate its performance in more challenging scenarios. While k-means has shown decent clustering results, exploring alternatives like Gaussian Mixture Models or Mean Shift for outlier detection might yield better outcomes. These methods could also be implemented using a Painless script or an ingest pipeline. In the future, we think this workflow can be enhanced with ELSER , as we could use ELSER to first retrieve relevant features from the dataset, which would then be used for clustering, further improving the model’s performance and relevance in the analysis. Additionally, we would like to address how to properly set the correct number of clusters, and how to effectively deal with model drift. In the meantime, if you have similar experiments or use cases to share, we’d love to hear about them! Feel free to provide feedback or connect with us through our community Slack channel or discussion forums . Report an issue Related content Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey Jump to Clustering prologue Dataset overview for clustering workflow Feature extraction and generating text embeddings Dynamic clustering with ingest pipeline’s script processor Clustering results Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Implementing clustering workflows in Elastic to enhance search relevance - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-clustering-workflows", + "meta_description": "Explore clustering workflows and learn how to integrate custom clustering models into the Elastic Stack through an example." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. Vector Database Lucene TV MS By: Thomas Veasey and Mayya Sharipova On April 7, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the past, we discussed some of the challenges of having to search multiple HNSW graphs and how we were able to mitigate them. At that time we alluded to some further improvements we had planned. This post is the culmination of that work. You might ask, why use multiple graphs at all? This is a side effect of an architectural choice in Lucene: immutable segments. As with most architectural choices there are pros and cons. For example, we’ve recently GA’d Serverless Elasticsearch. In this context, we’ve gained very significant benefits from immutable segments including efficient index replication and the ability to decouple index and query compute and autoscale them independently. For vector quantization, segment merges give us the opportunity to update parameters to adapt them to data characteristics. Along these lines, we think there are other advantages that having opportunities to measure data characteristics and revisit indexing choices affords. In this post we will discuss the work we’ve been doing to significantly reduce the overhead of building multiple HNSW graphs and in particular to reduce the cost of merging graphs. Background In order to maintain a manageable number of segments Lucene periodically checks to see if it should merge segments. This amounts to checking if the current segment count exceeds a target segment count, which is determined by the base segment size and the merge policy. If the count is exceeded, Lucene merges groups of segments while the constraint is violated. This process has been described in detail elsewhere . Lucene elects to merge similar-sized segments because this achieves logarithmic growth in the write amplification. In the case of a vector index, write amplification is the number of times a vector will be inserted into a graph. Lucene will try to merge segments in groups of approximately 10. Consequently, vectors are inserted into a graph roughly $1+\frac{9}{10}\log_{10}\left(\frac{n}{n_0}\right)$ times, where $n$ is the index vector count and $n_0$ is the expected base segment vector count.
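As a quick worked example of this formula (the numbers are illustrative, not taken from the post):

```python
import math

# Illustrative numbers only: 1e9 vectors in the index, 1e6 vectors per base segment
n, n0 = 1_000_000_000, 1_000_000
write_amplification = 1 + 0.9 * math.log10(n / n0)
print(write_amplification)  # 3.7: each vector is inserted into a graph fewer than 4 times
```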
Because of the logarithmic growth, write amplification is single digits even for huge indices. However, the total time spent merging graphs is linearly proportional to the write amplification. When merging HNSW graphs we already make a small optimization: retaining the graph for the largest segment and inserting vectors from the other segments into it. This is the reason for the 9/10 factor above. Below we show how we are able to do significantly better by using information from all the graphs we are merging. Graph Merging Previously we retained the largest graph and inserted vectors from the others ignoring the graphs that contain them. The key insight we make use of below is that each graph we discard contains important proximity information about the vectors it contains. We would like to use this information to accelerate inserting, at least some, of the vectors. We focus on the problem of inserting a smaller graph G_s = (V_s, E_s) into a larger graph G_l = (V_l, E_l), since this is an atomic operation we can use to build any merge policy. The strategy is to find a subset of vertices J \\subset V_s to insert into the large graph. We then use the connectivity of these vertices in the small graph to accelerate inserting the remaining vertices V_s \\setminus J. In the following, we use N_s(u) and N_l(u) to denote the neighbors of a vertex u in the small and large graph, respectively. Schematically the process is as follows. MERGE-HNSW Inputs G_s and G_l 1 Find J \\subset V_s to insert into G_l using COMPUTE-JOIN-SET 2 Insert each vertex u \\in J into G_l 3 for u \\in V_s \\setminus J do 4 J_u \\leftarrow J \\cap N_s(u) 5 E_u \\leftarrow \\cup_{v \\in J_u} N_l(u) 6 W \\leftarrow FAST-SEARCH-LAYER(J_u, E_u) 7 neighbors \\leftarrow SELECT-NEIGHBORS-HEURISTIC(u, W) 8 J \\leftarrow J \\cup \\{u\\} We compute the set J using a procedure we discuss below (line 1). Then we insert every vertex in J into the large graph using the standard HNSW insertion procedure (line 2). For each vertex we haven’t inserted we find its neighbors that we have inserted and their neighbors in the large graph (lines 4 and 5). We use a FAST-SEARCH-LAYER procedure seeded with this set (line 6) to find the candidates for the SELECT-NEIGHBORS-HEURISTIC from the HNSW paper (line 7). In effect, we’re replacing SEARCH-LAYER to find the candidate set in the INSERT method (Algorithm 1 from the paper), which is otherwise unchanged. Finally, we add the vertex we just inserted into J (line 8). It's clear that for this to work every vertex in V_s \\setminus J must have at least one neighbor in J. 
In fact, we require that for every vertex u \\in V_s \\setminus J that |J \\cap N_s(u)| \\geq k_u for some k_u < M […] Gain(v^*) > 0 then 20 C \\leftarrow C \\cup \\{(\\text{false}, Gain(v^*), c(v^*), \\text{copy rand})\\} 21 return J We first initialize the state in lines 1-5. In each iteration of the main loop we initially extract the maximum gain vertex (line 8), breaking ties at random. Before making any change, we need to check if the vertex’s gain is stale. In particular, each time we add a vertex into J we affect the gain of other vertices: Since all its neighbors have an additional neighbor in J their gains can change (line 14) If any of its neighbors are now fully covered all their neighbors’ gains can change (lines 14-16) We recompute gains in a lazy fashion, so we only recompute the gain of a vertex if we want to insert it into J (lines 18-20). Since gains only ever decrease we can never miss a vertex we should insert. Note that we simply need to keep track of the total gain of vertices we’ve added to J to determine when to exit. Furthermore, whilst Gain_{tot} < Gain_{exit} […] Trained Models in Kibana. Once it is deployed, you can proceed to the next step. Indexing data You can download the dataset here and then import the data using Kibana. To do this, go to the homepage and click \"Upload data\". Then, upload the file and click Import. Finally, go to the Advanced tab and paste the following mappings: We are going to create an index capable of running semantic and full text queries. The semantic_text field type will take care of data chunking and embedding. Note we are indexing longDescription as semantic_text; you can use copy_to if you want to index a field as both semantic_text and text. Building the app with Blazor & Elasticsearch API key The first thing we need to do is create an API key to authenticate our requests to Elasticsearch. The API key should be read-only and allowed only to query the books-blazor index. You will see something like this: Save the value of the encoded response field as it's needed later. If you are running on Elastic Cloud, you will also need your Cloud ID. (You can find it here). Creating the Blazor project Start by installing Blazor and creating a sample project following the official instructions. Once you have created the project, the folder structure and files should look like this: The template application includes Bootstrap v5.1.0 for styling. Finish the project setup by installing the Elasticsearch .NET client: Once you finish this step, your page should look like this: Folders structure Now, we are going to organize our folders as follows: Files explained: Components/Pages/Search.razor: main page containing the search bar, the results, and the filters. Components/Pages/Search.razor.css: page styles. Components/Elasticsearch/SearchBar.razor: search bar component. Components/Elasticsearch/Results.razor: results component. Components/Elasticsearch/Facet.razor: filters component. Components/Svg/GlassIcon.razor: search icon. Components/_Imports.razor: this will import all the components. Models/Book.cs: this will store the book field schema. Models/Response.cs: this will store the response schema, including the search results, facets and total hits. 
Services/ElasticsearchService.cs: Elasticsearch service. It will handle the connection and queries to Elasticsearch. Initial Configuration Let's start with some clean-up. Delete the files: Components/Pages/Counter.razor Components/Pages/Weather.razor Components/Pages/Home.razor Components/Layout/NavMenu.razor Components/Layout/NavMenu.razor.css Check the /Components/_Imports.razor file. You should have the following imports: Integrating Elastic into the project Now, let’s import the Elasticsearch components: We are going to remove the default sidebar to have more space for our application by removing it from the /Components/Layout/MainLayout.razor file: Now let's enter the Elasticsearch credentials for the user secrets : Using this approach, .Net 8 stores sensitive data in a separate location, outside of the project folder and makes it accessible using the IConfiguration interface. These variables will be available to any .Net project that uses the same user secrets. Then, let's modify the Program.cs file to read the secrets and mount the Elasticsearch client: First, import the necessary libraries: BlazorApp.Services: contains the Elasticsearch service. Elastic.Clients.Elasticsearch: imports the Elasticsearch client .Net 8 library. Elastic.Transport: imports the Elasticsearch transport library, which allows us to use the ApiKey class to authenticate our requests. Second, insert the following code before the var app = builder.Build() line: This code will read the Elasticsearch credentials from the user secrets and create an Elasticsearch client instance. After the ElasticSearch client initialization, add the following line to register the Elasticsearch service: The next step will be to build the search logic in the /Services/ElasticsearchService.cs file: First, import the necessary libraries and models: Second, add the class ElasticsearchService , constructor and variables: Configuring search Now, let's build our search logic: BuildFilters will build the filters for the search query using the selected facets by the user. BuildHybridQuery will build a hybrid search query that combines full text and semantic search. Next, add the search method: SearchBooksAsync : will perform the search using the hybrid query and return the results included aggregations for building the facets. FormatFacets : will format the aggregations response into a dictionary. ConvertFacetDictionary : will convert the facet dictionary into a more readable format. The next step is to create the models that will represent the data returned in the hits of the Elasticsearch query that will be printed as the results in our search page. We start by creating the file /Models/Book.cs and adding the following: Then, setting up the Elastic response in the /Models/Response.cs file and adding the following: Configuring a basic UI Next, add the SearchBar component. In the file /Components/Elasticsearch/SearchBar.razor and add the following: This component contains a search bar and a button to perform the search. Blazor provides great flexibility by allowing generating HTML dynamically using C# code within the same file. Afterwards, in the file /Components/Elasticsearch/Results.razor we will build the results component that will display the search results: Finally, we will need to create facets to filter the search results. Note: Facets are filters that allow users to narrow down search results based on specific attributes or categories, such as product type, price range, or brand. 
These filters are typically presented as clickable options, often in the form of checkboxes, to help users refine their search and find relevant results more easily. In Elasticsearch context, facets are created using aggregations . We set up facets by putting the following code In the file /Components/Elasticsearch/Facet.razor : This component reads from a terms aggregation on the author , categories , and status fields, and then produces a list of filters to send back to Elasticsearch. Now, let's put everything together. In /Components/Pages/Search.razor file: Our page is working! As you can see, the page is functional but lacks styles. Let's add some CSS to make it look more organized and responsive. Let's start replacing the layout styles. In the Components/Layout/MainLayout.razor.css file: Add the styles for the search page in the Components/Pages/Search.razor.css file: Our page starts to look better: Let's give it the final touches: Create the following files: Components/Elasticsearch/Facet.razor.css Components/Elasticsearch/Results.razor.css And add the styles for Facet.razor.css : For Results.razor.css : Final result: To run the application you can use the following command: dotnet watch You did it! Now you can search for books in your Elasticsearch index by using the search bar and filter the results by author, category, and status. Performing full text and semantic search By default our app will perform a hybrid search using both full text and semantic search . You can change the search logic by creating two separate methods, one for full text and another for semantic search, and then selecting one method to build the query based on the user's input. Add the following methods to the ElasticsearchService class in the /Services/ElasticsearchService.cs file: Both methods work similarly to the BuildHybridQuery method, but they only perform full text or semantic search. You can modify the SearchBooksAsync method to use the selected search method: You can find the complete application here Conclusion Blazor is an effective framework that allows you to build web applications using C#. Elasticsearch is a powerful search engine that allows you to build search applications. Combining both, you can easily build robust search applications, leveraging the power of ESRE to create a semantic search experience in a short time. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
KB By: Kofi Bartlett Jump to What is Blazor? Why Blazor? What is ESRE? Configuring ELSER Indexing data Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Building a search app with Blazor and Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/search-app-with-esre-blazor", + "meta_description": "Learn how to build a search app using Blazor and Elasticsearch, including how to add the search bar, configure hybrid search, build search logic, and more." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Integration tests using Elasticsearch This series demonstrates improvements for integration tests using Elasticsearch and advanced techniques to further reduce execution time in Elasticsearch integration tests. Part1 Java How To October 3, 2024 Testing your Java code with mocks and real Elasticsearch Learn how to write your automated tests for Elasticsearch, using mocks and Testcontainers PP By: Piotr Przybyl Part2 How To November 13, 2024 Faster integration tests with real Elasticsearch Learn how to make your automated integration tests for Elasticsearch faster using various techniques for data initialization and performance improvement. PP By: Piotr Przybyl Part3 How To January 31, 2025 Advanced integration tests with real Elasticsearch Mastering advanced Elasticsearch integration testing: Faster, smarter, and optimized. PP By: Piotr Przybyl Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Integration tests using Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/integration-tests-using-elasticsearch", + "meta_description": "This series demonstrates improvements for integration tests using Elasticsearch and advanced techniques to further reduce execution time in Elasticsearch integration tests." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Live log and prosper: Elasticsearch newly specialized logsdb index mode Elasticsearch’s latest innovation in log management, logsdb, cuts the storage footprint of log data by up to 65%, enabling observability and security teams to expand visibility without exceeding their budget while keeping all data accessible and searchable. How To MS GK AS By: Mark Settle , George Kobar and Amena Siddiqi On December 12, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch's new index mode, logsdb, reduces log storage needs by up to 65% Today, we announce the general availability of Elasticsearch's new index mode, logsdb, which reduces the storage footprint of log data by up to 65% compared to recent versions of Elasticsearch without logsdb. This dramatic improvement enables observability and security teams to expand visibility without exceeding their budget while keeping all data immediately accessible for analysis. Logsdb optimizes the ordering of data, eliminates duplication by reconstructing non-stored field values on the fly with synthetic _source , and improves compression with advanced algorithms and codecs, leveraging columnar storage within Elasticsearch for efficient log storage and retrieval. Enhance analytics and reduce costs by improving storage efficiency with logsdb index mode Logs provide critical signals for detecting and remediating observability and security issues — and their utility is increasing as AI advancements ease the analysis of text-based data — so efficient storage and performant access matter more than ever. Unfortunately, the growing log volume generated by infrastructure and applications is driving up costs, forcing compromises that hamper analysis: limit collection, reduce retention, or relegate fresh data to siloed archive tiers. Logsdb directly addresses these challenges. With greater storage efficiency, you can collect more data and avoid the hassle of complicated data filtering. You can retain logs longer to support threat hunting, incident response, and compliance requirements. And because all data is always searchable, you can get fast insights, no matter how large your data set grows. Technical innovation behind logsdb index mode Logsdb index mode dramatically reduces the disk footprint of log data with smart index sorting, synthetic _source, and advanced compression. Implementing it can reduce log storage needs by up to 65%, compared to recent versions of Elasticsearch without logsdb. While logsdb currently uses more CPU during indexing, its efficient storage reduces overall costs for most customers. For customers who need long-term retention, we expect total cost of ownership (TCO) reductions of up to 50%. Smart index sorting improves storage efficiency by up to 30% and reduces query latency on some logging data sets by locating similar data close together. By default, it sorts indices by host.name and @timestamp. If your data has more suitable fields, you can specify them instead. Advanced compression significantly reduces storage requirements for text-heavy data like logs through Zstandard compression (Zstd), delta encoding, run-length encoding, and other smart codecs that are automatically chosen. 
Doc-values, which are stored in a columnar format optimized for compression and performance, enable efficient storage and retrieval of field values for sorting, aggregations, and scripting. Synthetic _source enables organizations to trim storage needs by another 20-40% by discarding the _source field and fully or partially reconstructing it on demand. While the feature sometimes requires more compute for indexing and retrieval, testing shows that it delivers measurable net efficiency improvements. Synthetic _source is built on nearly two years of production usage with metrics, with numerous enhancements for logs, including support for nearly all field types. Resulting storage savings are propagated through the index lifecycle phases. A storage reduction of 65% in the hot tier will result in the same reduction in the warm, cold, and frozen tiers, as well as reduce the footprint for storing snapshots in bucket storage. No visibility compromises: Retain all logs for observability and security Logs are the foundation of visibility into infrastructure and applications, providing the simplest and most essential signal for monitoring and troubleshooting. However, costs are rising as logging volumes grow. This challenge is forcing customers to implement complex filtering and management policies, delete data prematurely, and strand relevant logs in stores that require a day or longer to rehydrate before analysis. Without a complete, easily searchable, and accessible data set, finding and resolving issues is substantially more challenging. Logsdb index mode builds on breakthrough Elasticsearch capabilities like searchable snapshots and Automatic Import to address these pain points for operations and security teams: Reduce costs: Logsdb reduces the storage footprint of logs by up to 65%, enabling organizations to reduce storage expenses while retaining more data. This translates to cost savings across all storage tiers — from hot to frozen — and higher productivity for the observability and security teams who use this data. Preserve valuable data: Logsdb keeps all your log data and improves operational efficiency without relying on extra tools or complicated filters. With features like synthetic _source, preserve the value of data without storing the entire source document. Expand visibility: Logsdb provides efficient access to all data on one platform, without separate silos for observability, security, and historical data. For site reliability engineers (SREs), it accelerates problem resolution by enabling analysis of logs alongside metrics, traces, and business data. Likewise, for security operations center (SOC) teams, it accelerates investigation and remediation by eliminating blind spots. Streamline access to data: Logsdb lets SRE teams efficiently retain actionable data for troubleshooting, trending, and analysis. Similarly, SOC teams can swiftly search all of their data for investigation and threat hunting without incurring exorbitant costs. Logsdb is ready for your environment Elasticsearch logsdb index mode is generally available for Elastic Cloud Hosted and Self-Managed customers starting in version 8.17 and is enabled by default for logs in Elastic Cloud Serverless . Basic logsdb capabilities (including smart index sorting and advanced compression) are available to organizations with Standard, Gold, and Platinum licenses. 
Complete logsdb capabilities that further reduce storage requirements (including synthetic _source) are available to serverless customers and organizations with an Enterprise license. Elasticsearch logsdb in action Logsdb enables you to keep all your log data and improve operational efficiency without narrowing collection or discarding or siloing data. With capabilities like smart index sorting, advanced compression, and synthetic _source, keep and analyze the data you need within a budget that works for you. Want to experience it for yourself? Try Elastic at no cost . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Elasticsearch's new index mode, logsdb, reduces log storage needs by up to 65% Enhance analytics and reduce costs by improving storage efficiency with logsdb index mode Technical innovation behind logsdb index mode No visibility compromises: Retain all logs for observability and security Logsdb is ready for your environment Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Live log and prosper: Elasticsearch newly specialized logsdb index mode - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-logsdb-index-mode", + "meta_description": "Dive into Logsdb, Elasticsearch's new index mode. Discover Logsdb's capabilities and advantages, including how it reduces log storage needs by up to 65%." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Evaluating search relevance part 1 - The BEIR benchmark Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes. ML Research Python TP TV By: Thanos Papaoikonomou and Thomas Veasey On July 16, 2024 Part of Series Evaluating search relevance Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This is the first in a series of blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes in the context of better understanding BEIR. We will also introduce common gotchas which make evaluation less reliable. Finally, we note that LLMs provide a powerful new tool in the search engineers' arsenal and we will show by example how one can use them to help evaluate search. Understanding the BEIR benchmark in search relevance evaluation To improve any system you need to be able to measure how well it is doing. In the context of search BEIR (or equivalently the Retrieval section of the MTEB leaderboard) is considered the “holy grail” for the information retrieval community and there is no surprise in that. It’s a very well-structured benchmark with varied datasets across different tasks. More specifically, the following areas are covered: Argument retrieval (ArguAna, Touche2020) Open-domain QA (HotpotQA, Natural Questions, FiQA) Passage retrieval (MSMARCO) Duplicate question retrieval (Quora, CQADupstack) Fact-checking (FEVER, Climate-FEVER, Scifact) Biomedical information retrieval (TREC-COVID, NFCorpus, BioASQ) Entity retrieval (DBPedia) Citation prediction (SCIDOCS) It provides a single statistic, nDCG@10, related to how well a system matches the most relevant documents for each task example in the top results it returns. For a search system that a human interacts with relevance of top results is critical. However, there are many nuances to evaluating search that a single summary statistic misses. Structure of a BEIR dataset Each benchmark has three artefacts: the corpus or documents to retrieve the queries the relevance judgements for the queries (aka qrels ). Relevance judgments are provided as a score which is zero or greater. Non-zero scores indicate that the document is somewhat related to the query. Dataset Corpus size #Queries in the test set #qrels positively labeled #qrels equal to zero #duplicates in the corpus Arguana 8,674 1,406 1,406 0 96 Climate-FEVER 5,416,593 1,535 4,681 0 0 DBPedia 4,635,922 400 15,286 28,229 0 FEVER 5,416,568 6,666 7,937 0 0 FiQA-2018 57,638 648 1,706 0 0 HotpotQA 5,233,329 7,405 14,810 0 0 Natural Questions 2,681,468 3,452 4,021 0 16,781 NFCorpus 3,633 323 12,334 0 80 Quora 522,931 10,000 15,675 0 1,092 SCIDOCS 25,657 1,000 4,928 25,000 2 Scifact 5,183 300 339 0 0 Touche2020 382,545 49 932 1,982 5,357 TREC-COVID 171,332 50 24,763 41,663 0 MSMARCO 8,841,823 6,980 7,437 0 324 CQADupstack (sum) 457,199 13,145 23,703 0 0 Table 1 : Dataset statistics. The numbers were calculated on the test portion of the datasets ( dev for MSMARCO ). 
Table 1 presents some statistics for the datasets that comprise the BEIR benchmark such as the number of documents in the corpus, the number of queries in the test dataset and the number of positive/negative (query, doc) pairs in the qrels file. From a quick a look in the data we can immediately infer the following: Most of the datasets do not contain any negative relationships in the qrels file, i.e. zero scores, which would explicitly denote documents as irrelevant to the given query. The average number of document relationships per query ( #qrels / #queries ) varies from 1.0 in the case of ArguAna to 493.5 ( TREC-COVID ) but with a value < 5 for the majority of the cases. Some datasets suffer from duplicate documents in the corpus which in some cases may lead to incorrect evaluation i.e. when a document is considered relevant to a query but its duplicate is not. For example, in ArguAna we have identified 96 cases of duplicate doc pairs with only one doc per pair being marked as relevant to a query. By “expanding” the initial qrels list to also include the duplicates we have observed a relative increase of ~1% in the nDCG@10 score on average. Example of duplicate pairs in ArguAna. In the qrels file only the first appears to be relevant (as counter-argument) to query (“test-economy-epiasghbf-pro02a”) When comparing models on the MTEB leaderboard it is tempting to focus on average retrieval quality. This is a good proxy to the overall quality of the model, but it doesn't necessarily tell you how it will perform for you. Since results are reported per data set, it is worth understanding how closely the different data sets relate to your search task and rescore models using only the most relevant ones. If you want to dig deeper, you can additionally check for topic overlap with the various data set corpuses. Stratifying quality measures by topic gives a much finer-grained assessment of their specific strengths and weaknesses. One important note here is that when a document is not marked in the qrels file then by default it is considered irrelevant to the query. We dive a little further into this area and collect some evidence to shed more light on the following question: “How often is an evaluator presented with (query, document) pairs for which there is no ground truth information?\". The reason that this is important is that when only shallow markup is available (and thus not every relevant document is labeled as such) one Information Retrieval system can be judged worse than another just because it “chooses” to surface different relevant (but unmarked) documents. This is a common gotcha in creating high quality evaluation sets, particularly for large datasets. To be feasible manual labelling usually focuses on top results returned by the current system, so potentially misses relevant documents in its blind spots. Therefore, it is usually preferable to focus more resources on fuller mark up of fewer queries than broad shallow markup. Leveraging the BEIR benchmark for search relevance evaluation To initiate our analysis we implement the following scenario (see the notebook ): First, we load the corpus of each dataset into an Elasticsearch index. For each query in the test set we retrieve the top-100 documents with BM25. We rerank, the retrieved documents using a variety of SOTA reranking models. Finally, we report the “judge rate” for the top-10 documents coming from steps 2 (after retrieval) and 3 (after reranking). 
In other words, we calculate the average percentage of the top-10 documents that have a score in the qrels file. The list of reranking of models we used is the following: Cohere's rerank-english-v2.0 and rerank-english-v3.0 BGE-base mxbai-rerank-xsmall-v1 MiniLM-L-6-v2 Retrieval Reranking Dataset BM25 (%) Cohere Rerank v2 (%) Cohere Rerank v3 (%) BGE-base (%) mxbai-rerank-xsmall-v1 (%) MiniLM-L-6-v2 (%) Arguana 7.54 4.87 7.87 4.52 4.53 6.84 Climate-FEVER 5.75 6.24 8.15 9.36 7.79 7.58 DBPedia 61.18 60.78 64.15 63.9 63.5 67.62 FEVER 8.89 9.97 10.08 10.19 9.88 9.88 FiQa-2018 7.02 11.02 10.77 8.43 9.1 9.44 HotpotQA 12.59 14.5 14.76 15.1 14.02 14.42 Natural Questions 5.94 8.84 8.71 8.37 8.14 8.34 NFCorpus 31.67 32.9 33.91 30.63 32.77 32.45 Quora 12.2 10.46 13.04 11.26 12.58 12.78 SCIDOCS 8.62 9.41 9.71 8.04 8.79 8.52 Scifact 9.07 9.57 9.77 9.3 9.1 9.17 Touche2020 38.78 30.41 32.24 33.06 37.96 33.67 TREC-COVID 92.4 98.4 98.2 93.8 99.6 97.4 MSMARCO 3.97 6.00 6.03 6.07 5.47 6.11 CQADupstack (avg.) 5.47 6.32 6.87 5.89 6.22 6.16 Table 2 : Judge rate per (dataset, reranker) pairs calculated on the top-10 retrieved/reranked documents From Table 2 , with the exception of TREC-COVID (>90% coverage), DBPedia (~65%), Touche2020 and nfcorpus (~35%), we see that the majority of the datasets have a labeling rate between 5% and a little more than 10% after retrieval or reranking. This doesn’t mean that all these unmarked documents are relevant but there might be a subset of them -especially those placed in the top positions- that could be positive. With the arrival of general purpose instruction tuned language models, we have a new powerful tool which can potentially automate judging relevance. These methods are typically far too computationally expensive to be used online for search, but here we are concerned with offline evaluation. In the following we use them to explore the evidence that some of the BEIR datasets suffer from shallow markup. In order to further investigate this hypothesis we decided to focus on MSMARCO and select a subset of 100 queries along with the top-5 reranked (with Cohere v2) documents which are currently not marked as relevant. We followed two different paths of evaluation: First, we used a carefully tuned prompt (more on this in a later post) to prime the recently released Phi-3-mini-4k model to predict the relevance (or not) of a document to the query. In parallel, these cases were also manually labeled in order to also assess the agreement rate between the LLM output and human judgment. Overall, we can draw the following two conclusions † \\dag † : The agreement rate between the LLM responses and human judgments was close to 80% which seems good enough as a starting point in that direction. In 57.6% of the cases (based on human judgment) the returned documents were found to be actually relevant to the query. To state this in a different way: For 100 queries we have 107 documents judged to be relevant, but at least 0.576 x 5 x 100 = 288 extra documents which are actually relevant! Here, some examples drawn from the MSMARCO / dev dataset which contain the query, the annotated positive document (from qrels ) and a false negative document due to incomplete markup: Example 1: Example 2: Manually evaluating specific queries like this is a generally useful technique for understanding search quality that complements quantitive measures like nDCG@10. 
If you have a representative set of queries you always run when you make changes to search, it gives you important qualitative information about how performance changes, which is invisible in the statistics. For example, it gives you much more insight into the false results your search returns: it can help you spot obvious howlers in retrieved results, classes of related mistakes, such as misinterpreting domain-specific terminology, and so on. Our result is in agreement with relevant research around MSMARCO evaluation. For example, Arabzadeh et al. follow a similar procedure where they employ crowdsourced workers to make preference judgments: among other things, they show that in many cases the documents returned by the reranking modules are preferred compared to the documents in the MSMARCO qrels file. Another piece of evidence comes from the authors of the RocketQA reranker who report that more than 70% of the reranked documents were found relevant after manual inspection. † \\dag † Update - September 9th: After a careful re-evaluation of the dataset we identified 15 more cases of relevant documents, increasing their total number from 273 to 288 Main takeaways & next steps The pursuit for better ground truth is never-ending as it is very crucial for benchmarking and model comparison. LLMs can assist in some evaluation areas if used with caution and tuned with proper instructions More generally, given that benchmarks will never be perfect, it might be preferable to switch from a pure score comparison to more robust techniques capturing statistically significant differences. The work of Arabzadeh et al. provides a nice of example of this where based on their findings they build 95% confidence intervals indicating significant (or not) differences between the various runs. In the accompanying notebook we provide an implementation of confidence intervals using bootstrapping . From the end-user perspective it’s useful to think about task alignment when reading benchmark results. For example, for an AI engineer who builds a RAG pipeline and knows that the most typical use case involves assembling multiple pieces of information from different sources, then it would be more meaningful to assess the performance of their retrieval model on multi-hop QA datasets like HotpotQA instead of the global average across the whole BEIR benchmark In the next blog post we will dive deeper into the use of Phi-3 as LLM judge and the journey of tuning it to predict relevance. Report an issue Related content Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. 
GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Jump to Understanding the BEIR benchmark in search relevance evaluation Structure of a BEIR dataset Leveraging the BEIR benchmark for search relevance evaluation Main takeaways & next steps Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Evaluating search relevance part 1 - The BEIR benchmark - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/evaluating-search-relevance-part-1", + "meta_description": "Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Improved inference performance with ELSER v2 Learn about the improvements we've made to the inference performance of ELSER v2, achieving a 60% to 120% speed increase over ELSER v1. ML Research TV QH VK By: Thomas Veasey , Quentin Herreros and Valeriy Khakhutskyy On October 17, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It is well known that modern transformer based approaches to information retrieval often come with significantly higher resource costs when compared with traditional statistical approaches, such as BM25. This can make it challenging to apply these techniques in production. At large scale, at least as much attention needs to be paid to the resource usage of any retrieval solution as to its relevance in order to produce something practically useful. In this final two part blog of our series, we discuss some of the work we did for retrieval and inference performance for the release of version 2 of our Elastic Learned Sparse EncodeR model (ELSER), which we introduced in this previous blog post. In 8.11 we are releasing two versions of the model: one portable version which will run on any hardware and one version which is optimized for the x86 family of architectures. We're still making the deployment process easy though, by defaulting to the most appropriate model for your cluster's hardware. In this first part we focus on inference performance. In the second part we discuss the ongoing work we're doing to improve retrieval performance. However, first we briefly review the relevance we achieve for BEIR with ELSER v2. 
Improved relevance for BEIR with ELSER v2 For this release we extended our training data, including around 390k high quality question and answer pairs to our fine tune dataset, and improved the FLOPS regularizer based on insights we discussed in the past . Together these changes gave us a bump in relevance measured with our usual set of BEIR benchmark datasets. We plan to follow up with a full description of our training data set composition and the innovations we have introduced, such as improvements to cross-encoder distillation and the FLOPS regularizer at a later date. Since this blog post mainly focuses on performance considerations, we simply give the new NDCG@10 for ELSER v2 model in the table below. NDCG@10 for BEIR data sets for ELSER v1 and v2 (higher is better). The v2 results use the query pruning method described below Quantization in ELSER v2 Model inference in the Elastic Stack is run on CPUs. There are two principal factors which affect the latency of transformer model inference: the memory bandwidth needed to load the model weights and the number of arithmetic operations it needs to perform. ELSER v2 was trained from a BERT base checkpoint. This has just over 100M parameters, which amounts to about 418 MB of storage for the weights using 32 bit floating point precision. For production workloads for our cloud deployments we run inference on Intel Cascade Lake processors. A typical midsize machine would have L1 data, L2 and L3 cache sizes of around 64 KiB, 2 MiB and 33 MiB, respectively. This is clearly much smaller than model weight storage (although the number of weights which are actually used for any given inference is a function of text length). So for a single inference call we get cache misses all the way up to RAM. Halving the weight memory means we halve the memory bandwidth we need to serve an inference call. Modern processors support wide registers which let one perform the same arithmetic operations in parallel on several pieces of data, so called SIMD instructions. The number of parallel operations one can perform is a function of the size of each piece of data. For example, Intel processors allow one to perform 8 bit integer multiplication in 16 bit wide lanes. This means one gets roughly twice as many operations per cycle for int8 versus float32 multiplication and this is the dominant compute cost in an inference call. It is therefore clear if one were able to perform inference using int8 tensors there are significant performance improvements available. The process of achieving this is called quantization. The basic idea is very simple: clip outliers, scale the resulting numbers into the range 0 to 255 and snap them to the nearest integer. Formally, a floating point number x x x is transformed using ⌊ 255 u − l ( clamp ( x , l , u ) − l ) ⌉ \\left\\lfloor\\frac{255}{u - l}(\\text{clamp}(x, l, u) - l)\\right\\rceil ⌊ u − l 255 ​ ( clamp ( x , l , u ) − l ) ⌉ . One might imagine that the accuracy lost in this process would significantly reduce the model accuracy. In practice, large transformer model accuracy is fairly resilient to the errors this process introduces. There is quite a lot of prior art on model quantization. We do not plan to survey the topic in this blog and will focus instead on the approaches we actually used. For background and insights into quantization we recommend these two papers. For ELSER v2 we decided to use dynamic quantization of the linear layers. By default this uses per tensor symmetric quantization of activations and weights. 
Unpacking this, it rescales values to lie in an interval that is symmetric around zero - which makes the conversion slightly more compute efficient - before snapping. Furthermore, it uses one such interval for each tensor. With dynamic quantization the interval for each activation tensor is computed on-the-fly from their maximum absolute value . Since we want our model to perform well in a zero-shot setting, this has the advantage that we don't suffer from any mismatch in the data used to calibrate the model quantization and the corpus where it is used for retrieval. The maximum absolute weight for each tensor is known in advance, so these can be quantized upfront and stored in int8 format. Furthermore, we note that attention is itself built out of linear layers. Therefore, if the matrix multiplications in linear layers are quantized the majority of the arithmetic operations in the model are performed in int8. Our first attempt at applying dynamic quantization to every linear layer failed: it resulted in up to 20% loss in NDCG@10 for some of our BEIR benchmark data sets. In such cases, it is always worthwhile investigating hybrid quantization schemes. Specifically, one often finds that certain layers introduce disproportionately large errors when converted to int8. Typically, in such cases one performs layer by layer sensitivity analysis and greedily selects the layers to quantize while the model meets accuracy requirements. There are many configurable parameters for quantization which relate to exact details of how intervals are constructed and how they are scoped. We found it was sufficient to choose between three approaches for each linear layer for ELSER v2: Symmetric per tensor quantization, Symmetric per channel quantization and Float32 precision. There are a variety of tools which can allow one to observe tensor characteristics which are likely to create problems for quantization. However, ultimately what one always cares about is the model accuracy on the task it performs. In our case, we wanted to know how well the quantized model preserves the text representation we use for retrieval, specifically, the document scores. To this end, we quantized each layer in isolation and calculated the score MAPE of a diverse collection of query relevant document pairs. Since this had to be done on CPU and separately for every linear layer we limited this set to a few hundred examples. The figure below shows the performance and error characteristics for each layer; each point shows the percentage speed up in inference (x-axis) and the score MAPE (y-axis) as a result of quantizing just one layer. We run two experiments per layer: per tensor and per channel quantization. Relevance scores MAPE for layerwise quantization of ELSER v2 Note that the performance gain is not equal for all layers. The feed forward layers that separate attention blocks use larger intermediate representations so we typically gain more by quantizing their weights. The MLM head computes vocabulary token activations. Its output dimension is the vocabulary size or 30522. This is the outlier on the performance axis; quantizing this layer alone increases throughput by nearly 13%. Regarding accuracy, we see that quantizing the output of the 10 th feed forward module in the attention stack has a dramatic impact and many layers have almost no impact on the scores (< 0.5% MAPE). Interestingly, we also found that the MAPE is larger when quantizing higher feed forward layers. 
This is consistent with the fact that dropping feed forward layers altogether at the bottom of the attention stack has recently been found to be an effective performance accuracy trade off for BERT. In the end, we chose to disable quantization for around 20% of layers and use per channel quantization for around 15% of layers. This gave us a 0.1% reduction in average NDCG@10 across the BEIR suite and a 2.5% reduction in the worst case. So what does this yield in terms of performance improvements in the end? Firstly, the model size shrank by a little less than 40%, from 418 MB to 263MB. Secondly, inference sped up by between 40% and 100% depending on the text length. The figure below shows the inference latency on the left axis for the float32 and hybrid int8 model as a function of the input text length. This was calculated from 1000 different texts ranging for around 200 to 2200 characters (which typically translates to around the maximum sequence length of 512 tokens). For the short texts in this set we achieve a latency of around 50 ms or 20 inferences per second single threaded for an Intel Xeon CPU @ 2.80GH. Referring to the right axis, the speed-up for these short texts is a little over 100%. This is important because 200 characters is a long query so we expect similar improvements in query latency. We achieved a little under 50% throughput improvement for the data set as a whole. Speed up per thread from hybrid int8 dynamic quantisation of ELSER v2 using an Intel Xeon CPU Block layout of linear layers in ELSER v2 Another avenue we explored was using the Intel Extension for PyTorch (IPEX) . Currently, we recommend our users run Elasticsearch inference nodes on Intel hardware and it makes sense to optimize the models we deploy to make best use of it. As part of this project we rebuilt our inference process to use the IPEX backend. A nice side effect of this was that ELSER inference with float32 is 18% faster in 8.11 and we see increased throughput advantage from hyperthreading. However, the primary motivation was the latest Intel cores have hardware support for bfloat16 format, which makes better performance accuracy tradeoffs for inference than float32. We wanted to understand how this performs. We saw around 3 times speedup using bfloat16, but only with the latest hardware support; so until this is well enough supported in the cloud environment the use of bfloat16 models is impractical. We instead turned our attention to other features of IPEX. The IPEX library provides several optimizations which can be applied to float32 layers. This is handy because, as discussed, we retain around 20% of the model in float32 precision. Transformers don't afford simple layer folding opportunities, so the principal optimization is blocking of linear layers. Multi-dimensional arrays are usually stored flat to optimize cache use. Furthermore, to get the most out of SIMD instructions one ideally loads memory from contiguous blocks into the wide registers which implement them. The operations performed on the model weights in inference alter their access patterns. For any given compute graph one can in theory work out the weight layout which maximizes performance. The optimal arrangement also depends on the instruction set available and the memory bandwidth; usually this amounts to reordering weights into blocks for specific tensor dimensions. Fortunately, the IPEX library has implemented the optimal strategy for Intel hardware for a variety of layers, including linear layers. 
The figure below shows the effect of applying optimal block layout for float32 linear layers in ELSER v2. The performance was averaged over 5 runs. The effect is small however we verified it is statistically significant (p-value < 0.05). Also, it is consistently slightly larger for longer sequences, so for our representative collection of 1000 texts it translated to a little under 1% increase in throughput. Speed up per thread from IPEX optimize on ELSER v2 using an Intel Xeon CPU Another interesting observation we made is that the performance improvements are larger when using intra-op parallelism . We consistently achieved 2-5% throughput improvement across a range of text lengths using both our VM's allotted physical cores. In the end, we decided not to enable these optimisations. The performance gains we get from them are small and they significantly increase the model memory: our script file increased from 263MB to 505MB. However, IPEX and particularly hardware support for bfloat16 yield significant improvements for inference performance on CPU. This work got us a step closer to enabling this for Elasticsearch inference in the future. Conclusion In this post, we discussed how we were able to achieve between a 60% and 120% speed-up in inference compared to ELSER v1 by upgrading the libtorch backend in 8.11 and optimizing for x86 architecture. This is all while improving zero-shot relevance. Inference performance is the critical factor in the time to index a corpus. It is also an important part of query latency. At the same time, the index performance is equally important for query latency, particularly at large scale. We discuss this in part 2 . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid Retrieval Part 5: Improved inference performance with ELSER v2 Part 6: Optimizing retrieval with ELSER v2 Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! 
- Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Improved relevance for BEIR with ELSER v2 Quantization in ELSER v2 Block layout of linear layers in ELSER v2 Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack: Improved inference performance with ELSER v2 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-1", + "meta_description": "Learn about the improvements we've made to the inference performance of ELSER v2, achieving a 60% to 120% speed increase over ELSER v1." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Quickly create RAG apps with Vertex AI Gemini models and Elasticsearch playground Quickly create a RAG app with Vertex AI Gemini models and Elasticsearch playground Generative AI Integrations How To JV By: Jeff Vestal On September 27, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, we will connect Elasticsearch to Google’s Gemini 1.5 chat model using Elastic’s Playground and Vertex AI API. The addition of Gemini models to Playground enables Google Cloud developers to quickly ground LLMs, test retrieval, tune chunking, and ship gen AI search apps to prod with Elastic. You will need an Elasticsearch cluster up and running. We will use a Serverless Project on Elastic Cloud. If you don’t have an account, you can sign up for a free trial . You will also need a Google Cloud account with Vertex AI Enabled. If you don’t have a Google Cloud account, you can sign up for a free trial . Steps to create RAG apps with Vertex AI Gemini models & Playground 1. Configuring Vertex AI First, we will configure a Vertex AI service account, which will allow us to make API calls securely from Elasticsearch to the Gemini model. You can follow the detailed instructions on Google Cloud’s doc page here , but we will cover the main points. Go to the Create Service Account section of the Google Cloud console. There, select the project which has Vertex AI enabled. Next, give your service account a name and optionally, a description. Click “Create and Continue”. Set the access controls for your project. 
For this blog, we used the “Vertex AI User” role, but you need to ensure your access controls are appropriate for your project and account. Click Done. The final setup in Google Cloud is to create an API key for the service account and download it in JSON format. Click “KEYS” in your service account then “ADD KEY” and “Create New”. Ensure you select “json” as the key type then click “CREATE”. The key will be created and automatically downloaded to your computer. We will need this key in the next section. 2. Connect to your LLM from Playground With Google Cloud configured, we can continue configuring the Gemini LLM connection in Elastic’s Playground. This blog assumes you already have data in Elasticsearch you want to use with Playground. If not, follow the Search Labs Blog Playground: Experiment with RAG applications with Elasticsearch in minutes to get started. In Kibana, Select Playground from the side navigation menu. In Serverless, this is under the “Build” heading. When that opens for the first time, you can select “Connect to an LLM”. Select “Google Gemini”: Fill out the form to complete the configuration. Open the JSON credentials file created and downloaded from the previous section, copy the complete JSON, and paste it into the “Credentials JSON” section. Then click “Save” 3. It’s Playground Time! Elastic’s Playground allows you to experiment with RAG context settings and system prompts before integrating into full code. By changing settings while chatting with the model, you can see which settings will provide the optimal responses for your application. Additionally, configure which fields in your Elasticsearch data are searched to add context to your chat completion request. Adding context will help ground the model and provide more accurate responses. This step uses Elastic’s ELSER sparse embeddings model , available built-in, for retrieving context via semantic search, that is passed on to the Gemini model. That’s it (for now) Conversational search is an exciting area where powerful large language models, such as those offered by Google Vertex AI are being used by developers to build new experiences. Playground simplifies the the process of prototyping and tuning, enabling you to ship your apps more quickly. Explore more ideas to build with Elasticsearch and Google Vertex AI, and happy searching! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps to create RAG apps with Vertex AI Gemini models & Playground 1. Configuring Vertex AI 2. Connect to your LLM from Playground 3. It’s Playground Time! That’s it (for now) Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Quickly create RAG apps with Vertex AI Gemini models and Elasticsearch playground - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vertex-ai-elasticsearch-playground-fast-rag-apps", + "meta_description": "Learn how to quickly build a RAG app with Vertex AI Gemini models and Elasticsearch playground." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing kNN Query: An expert way to do kNN search Explore how the kNN query in Elasticsearch can be used and how it differs from top-level kNN search, including examples. Vector Database How To MS BT By: Mayya Sharipova and Benjamin Trent On December 7, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. kNN search as a top-level section kNN search in Elasticsearch is organized as a top level section of a search request. We have designed it this way so that: It can always return global k nearest neighbors regardless of a number of shards These global k results are combined with a results from other queries to form a hybrid search The global k results are passed to aggregations to form facets. Here is a simplified diagram how kNN search is executed internally (some phases are omitted) : Figure 1: The steps for the top level kNN search are: A user submits a search request The coordinator node sends a kNN search part of the request to data nodes in the DFS phase Each data node runs kNN search and sends back the local top-k results to the coordinator The coordinator merges all local results to form the global top k nearest neighbors. The coordinator sends back the global k nearest neighbors to the data nodes with any additional queries provided Each data node runs additional queries and sends back the local size results to the coordinator The coordinator merges all local results and sends a response to the user We first run kNN search in the DFS phase to obtain the global top k results. These global k results are then passed to other parts of the search request, such as other queries or aggregations. 
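To make the flow above concrete, here is a minimal sketch of a top-level kNN search issued through the Python client. The index name, vector field, and query vector are placeholders and assume an index with a dense_vector field already exists.

```python
# Hedged sketch: top-level kNN search, which returns the global k nearest
# neighbors merged across shards as described above. Names are illustrative.
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # adjust for your deployment

resp = client.search(
    index="products",
    knn={
        "field": "embedding",             # a dense_vector field
        "query_vector": [0.1, 0.2, 0.3],  # placeholder vector
        "k": 3,                           # global top k to return
        "num_candidates": 100,            # candidates considered per shard
    },
    size=3,
)
print([hit["_id"] for hit in resp["hits"]["hits"]])
```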
Even the execution looks complex, from a user’s perspective this model of running kNN search is simple, as the user can always be sure that kNN search returns the global k results. Introducing kNN query in Elasticsearch With time we realized there is also a need to represent kNN search as a query. Query is a core component of a search request in Elasticsearch, and representing kNN search as a query allows for flexibility to combine it with other queries to address more complex requests. kNN query, unlike the top level kNN search, doesn’t have a k parameter. The number of results (nearest neighbors) returned is defined by the size parameter, as in other queries. Similar to kNN search, the num_candidates parameter defines how many candidates to consider on each shard while executing a kNN search. kNN query is executed differently from the top level kNN search. Here is a simplified diagram that describes how a kNN query is executed internally (some phases are omitted): Figure 2: The steps for query based kNN search are: A user submits a search request The coordinator sends to the data nodes a kNN search query with additional queries provided Each data node runs the query and sends back the local size results to the coordinator node The coordinator node merges all local results and sends a response to the user We run kNN search on a shard to get num_candidates results; these results are passed to other queries and aggregations on a shard to get size results from the shard. As we don’t collect the global k nearest neighbors first, in this model the number of nearest neighbors collected and visible for other queries and aggregations depend on the number of shards. kNN query API examples Let’s look at API examples that demonstrate differences between the top level kNN search and kNN query. We create an index of products and index some documents: kNN query similar to the top level kNN search, has num_candidates and an internal filter parameter that acts as a pre-filter. kNN query can get more diverse results than kNN search for collapsing and aggregations. For the kNN query below, on each shard we execute kNN search to obtain 10 nearest neighbors which are then passed to collapse to get 3 top results. Thus, we will get 3 diverse hits in a response. The top level kNN search first gets the global top 3 results in the DFS phase, and then passes them to collapse in the query phase. We will get only 1 hit in a response, as all the global 3 nearest neighbors happened to be from the same brand. Similarly for aggregations, a kNN query allows us to get 3 distinct buckets, while kNN search only allows 1. Now, let’s look at other examples that show the flexibility of the kNN query. Specifically, how it can be flexibly combined with other queries. kNN can be a part of a boolean query (with a caveat that all external query filters are applied as post-filters for kNN search). We can use a _name parameter for kNN query to enhance results with extra information that tells if the kNN query was a match and its score contribution. kNN can also be a part of complex queries, such as a pinned query. This is useful when we want to display the top nearest results, but also want to promote a selected number of other results. We can even make the kNN query a part of our function_score query. 
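As a rough illustration of that flexibility, the sketch below combines a kNN query with a lexical clause inside a bool query, using the _name parameter mentioned above. The mapping, vectors, and clause names are placeholders rather than the article's original examples.

```python
# Hedged sketch: kNN as a query clause combined with a match clause.
# The knn query has no k parameter; the number of hits comes from size.
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

resp = client.search(
    index="products",
    size=3,
    query={
        "bool": {
            "should": [
                {
                    "knn": {
                        "field": "embedding",
                        "query_vector": [0.1, 0.2, 0.3],
                        "num_candidates": 10,
                        "_name": "semantic_clause",
                    }
                },
                {
                    "match": {
                        "brand": {"query": "my-brand", "_name": "lexical_clause"}
                    }
                },
            ]
        }
    },
)
# matched_queries on each hit reveals which named clauses contributed.
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit.get("matched_queries", []))
```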
This is useful when we need to define custom scores for results returned by kNN query: ​ kNN query being a part of dis_max query is useful when we want to combine results from kNN search and other queries, so that a document’s score comes from the highest ranked clause with a tie breaking increment for any additional clause. ​ kNN search as a query has been introduced with the 8.12 release. Please try it out, and we would appreciate any feedback. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to kNN search as a top-level section Introducing kNN query in Elasticsearch kNN query API examples Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing kNN Query: An expert way to do kNN search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/knn-query-elasticsearch", + "meta_description": "Explore how the kNN query in Elasticsearch can be used and how it differs from top-level kNN search, including examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. ES|QL Python BA By: Baha Azarmi On August 20, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch Query Language (ES|QL) is designed for fast, efficient querying of large datasets. 
It has a straightforward syntax which will allow you to write complex queries easily, with a pipe based language, reducing the learning curve. We're going to use ES|QL to run statistical analysis and compare different odds. If you are reading this, you probably want to know how rich you can get before actually reaching the same odds of being hit by a bus. I can't blame you, I want to know too. Let's work out the odds so that we can make sure we win the lottery rather than get in an accident! What we are going to see in this blog is figuring out the probability of being hit by a bus and the probability of achieving wealth. We'll then compare both and understand until what point your chances of getting rich are higher, and when you should consider getting life insurance. So how are we going to do that? This is going to be a mix of magic numbers pulled from different articles online, some synthetics data and the power of ES|QL, the new Elasticsearch Query Language. Let's get started. Data for the ES|QL analysis The magic number The challenge starts here as the dataset is going to be somewhat challenging to find. We are then going to assume for the sake of the example that ChatGPT is always right. Let’s see what we get for the following question: Cough Cough… That sounds about right, this is going to be our magic number. Generating the wealth data Prerequisites Before running any of the scripts below, make sure to install the following packages: Now, there is one more thing we need, a representative dataset with wealth distribution to compute wealth probability. There is definitely some portion of it here and there, but again, for the example we are going to generate a 500K line dataset with the below python script. I am using python 3.11.5 in this example: It should take some time to run depending on your configuration since we are injecting 500K documents here! FYI, after playing with a couple of versions of the script above and the ESQL query on the synthetic data, it was obvious that the net worth generated across the population was not really representative of the real world. So I decided to use a log-normal distribution (np.random.lognormal) for income to reflect a more realistic spread where most people have lower incomes, and fewer people have very high incomes. Net Worth Calculation: Used a combination of random multipliers (np.random.uniform(0.5, 5)) and additional noise (np.random.normal(0, 10000)) to calculate net worth. Added a check to ensure no negative net worth values by using np.maximum(0, net_worths). Not only have we generated 500K documents, but we also used the Elasticsearch python client to bulk ingest all these documents in our deployment. Please note that you will find the endpoint to pass in as hosts Cloud ID in the code above. For the deployment API key, open Kibana, and generate the key in Stack Management / API Keys: The good news is that if you have a real data set, all you will need to do is to change the above code to read your dataset and write documents with the same data mapping. Ok we're getting there! The next step is pouring our wealth distribution. ES|QL wealth analysis Introducing ES|QL: A powerful tool for data analysis The arrival of Elasticsearch Query Language (ES|QL) is very exciting news for our users. It largely simplifies querying, analyzing, and visualizing data stored in Elasticsearch, making it a powerful tool for all data-driven use cases. 
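Circling back to the data generation step described above, here is a condensed, hedged sketch of what such a script might look like: log-normally distributed incomes, noisy net worth values clipped at zero, and a bulk ingest through the Python client. The index name, field names, and distribution parameters are illustrative and not the article's exact code.

```python
# Hedged sketch of the synthetic wealth dataset: 500K documents with skewed,
# non-negative net worth values, bulk-indexed with the Python client.
import numpy as np
from elasticsearch import Elasticsearch, helpers

client = Elasticsearch("http://localhost:9200")  # or cloud_id= / api_key=

rng = np.random.default_rng(42)
n_docs = 500_000

incomes = rng.lognormal(mean=10.5, sigma=1.0, size=n_docs)  # most incomes low, few very high
multipliers = rng.uniform(0.5, 5, n_docs)                   # random net-worth multiplier
noise = rng.normal(0, 10_000, n_docs)                       # additional noise
net_worths = np.maximum(0, incomes * multipliers + noise)   # no negative net worth
ages = rng.integers(18, 80, n_docs)

def actions():
    for i in range(n_docs):
        yield {
            "_index": "wealth_data",  # placeholder index name
            "_source": {
                "age": int(ages[i]),
                "income": float(incomes[i]),
                "net_worth": float(net_worths[i]),
                "counter": i,  # monotonic counter, handy for paginating later
            },
        }

helpers.bulk(client, actions())
```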
ES|QL comes with a variety of functions and operators, to perform aggregations, statistical analyses, and data transformations. We won’t address them all in this blog post, however our documentation is very detailed and will help you familiarize with the language and the possibilities. To get started with ES|QL today and run the blog post queries, simply start a trial on Elastic Cloud , load the data and run your first ES|QL query. Understanding the wealth distribution with our first query To get familiar with the dataset, head to Discover in Kibana and switch to ES|QL in the dropdown on the left hand side: Let’s fire our first request: As you could expect from our indexing script earlier, we are finding the documents we bulk ingested, notice the simplicity of pulling data from a given dataset with ES|QL where every query starts with the From clause, then your index. In the query above given we have 500K lines, we limited the amount of returned documents to 10. To do this, we are passing the output of the first segment of the query via a pipe to the limit command to only get 10 results. Pretty intuitive, right? Alright, what would be more interesting is to understand the wealth distribution in our dataset, for this we will leverage one of the 30 functions ES|QL provides, namely percentile. This will allow us to understand the relative position of each data point within the distribution of net worth. By calculating the median percentile (50th percentile), we can gauge where an individual’s net worth stands compared to others. Like our first query, we are passing the output of our index to another function, Stats, which combined with the percentile function will output the median net worth: The median is about 54K, which unfortunately is probably optimistic compared to the real world, but we are not going to solve this here. If we go a little further, we can look at the distribution in more granularity by computing more percentiles: With the below output: The data reveals a significant disparity in wealth distribution, with the majority of wealth being concentrated among the richest individuals. Specifically, the top 5% (95th percentile) possess a disproportionately large portion of the total wealth, with a net worth starting at $852,988.26 and increasing dramatically in the higher percentiles. The 99th percentile individuals hold a net worth exceeding $2 million, highlighting the skewed nature of wealth distribution. This indicates that a substantial portion of the population has modest net worth, which is probably what we want for this example. Another way to look at this is to augment the previous query and grouping by age to see if there is, (in our synthetic dataset), a relation between wealth and age: This could be visualized in a Kibana dashboard. Simply: Navigate to Dashboard Add a new ES|QL visualization Copy and paste our query Move the age field to the horizontal axis in the visualization configuration Which will output: The above suggests that the data generator randomized wealth uniformly across the population age, there is no specific trend pattern we can really see. Median Absolute Deviation (MAD) We calculate the median absolute deviation (MAD) to measure the variability of net worth in a robust manner, less influenced by outliers. 
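A hedged sketch of what such a statistics query might look like when sent from Python follows; the index and field names match the placeholder dataset above rather than the article's actual notebook.

```python
# Hedged sketch: percentiles and median absolute deviation with ES|QL.
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

query = """
FROM wealth_data
| STATS median = MEDIAN(net_worth),
        p95 = PERCENTILE(net_worth, 95),
        p99 = PERCENTILE(net_worth, 99),
        mad = MEDIAN_ABSOLUTE_DEVIATION(net_worth)
"""
resp = client.esql.query(query=query)
print(resp["columns"])
print(resp["values"])
```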
With a median net worth of $53,787.22 and a MAD of $44,205.44, we can infer the typical range of net worth: most individuals' net worth falls within a range of $44,205.44 above and below the median. This gives a typical range of approximately $9,581.78 to $97,992.66. The statistical showdown between Net Worth and Bus Collision Alright, this is the moment to understand how rich we can get, based on our dataset, before getting hit by a bus. To do that, we are going to leverage ES|QL to pull our entire dataset in chunks and load it into a pandas dataframe to build a net worth probability distribution. Finally, we will determine where the ends meet between the net worth and bus collision probabilities. The entire Python notebook is available here. I also recommend you read this blog post, which walks you through using ES|QL with pandas dataframes. Helper functions As you can see in the previously referred blog post, we introduced support for ES|QL in version 8.12 of the Elasticsearch Python client. Thus our notebook first defines the below functions: The first function is straightforward and executes an ES|QL query; the second fetches the entire dataset from our index. Notice the trick in there: I am using a counter built into a field in my index to paginate through the data. This is a workaround I am using while our engineering team is working on the support for pagination in ES|QL. Next, knowing that we have 500K documents in our index, we simply call these functions to load the data into a data frame: Fit Pareto distribution Next, we fit our data to a Pareto distribution, which is often used to model wealth distribution because it reflects the reality that a small percentage of the population controls most of the wealth. By fitting our data to this distribution, we can more accurately represent the probabilities of different net worth levels. We can visualize the Pareto distribution with the code below: Breaking point Finally, with the calculated probability, we determine the target net worth corresponding to the bus hit probability and visualize it. Remember, we use the magic number ChatGPT gave us for the probability of getting hit by a bus: Conclusion Based on our synthetic dataset, this chart vividly illustrates that the probability of amassing a net worth of approximately $12.5 million is as rare as the chance of being hit by a bus. For the fun of it, let's ask ChatGPT what the probability is: Okay… $439 million? I think ChatGPT might be hallucinating again. Report an issue Related content Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL's LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18.
CL By: Costin Leau ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Jump to Data for the ES|QL analysis The magic number Generating the wealth data Prerequisites ES|QL wealth analysis Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-query-language-esql-statistical-analysis", + "meta_description": "Learn how to use Elasticsearch Query Language (ES|QL) for statistical analysis through a practical example." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog From ES|QL to native Pandas dataframes in Python Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples. ES|QL Python How To QP By: Quentin Pradet On September 5, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Since Elasticsearch 8.15 or with Elasticsearch Serverless, ES|QL responses support the Apache Arrow streaming format . This blog post will show you how to take advantage of it in Python. In an earlier blog post , I demonstrated how to convert ES|QL queries to Pandas dataframes using CSV as an intermediate representation. Unfortunately, CSV requires explicit type declarations, is slow (especially for larger datasets) and does not handle nested arrays and objects. Apache Arrow lifts all these limitations. ES|QL to Pandas dataframes in Python Importing test data First, let's import some test data. As before, we will be using the employees sample data and mappings . The easiest way to load this dataset is to run these two Elasticsearch API requests in the Kibana Console . Converting dataset to a Pandas DataFrame object OK, with that out of the way, let's convert the full employees dataset to a Pandas DataFrame object using the ES|QL Arrow export: Even though this dataset only contains 100 records, we use a LIMIT command to avoid ES|QL warning us about potentially missing records. This prints the following dataframe: OK, so what actually happened here? 
Given format=\"arrow\" , Elasticsearch returns binary Arrow streaming data The Elasticsearch Python client looks at the Content-Type header and creates a PyArrow object Finally, PyArrow's Pandas integration converts the PyArrow object to a Pandas dataframe. Note that the types_mapper=pd.ArrowDtype parameter asks Pandas to use a PyArrow backend instead of a NumPy backend, since the source data is PyArrow. While this backend is not enabled by default for compatibility reasons, it has many advantages : it handles missing values, is faster, more interopable and supports more types. (This is not a zero copy conversion , however.) For this example to work, the Pandas and PyArrow optional dependencies need to be installed. If you want to use another dataframe library such as Polars instead, you don't need Pandas and can directly use polars.from_arrow to create a Polars DataFrame from the PyArrow table returned by the Elasticsearch client. One limitation is that Elasticsearch does not currently handle multi-valued fields, which is why we had to drop the is_rehired , job_positions and salary_change columns. This limitation will be lifted in a future version of Elasticsearch. Anyway, you now have a Pandas dataframe that you can use to analyze your data further. But you can also continue massaging the data using ES|QL, which is particularly useful when queries return more than 10,000 rows, the current maximum number of rows that ES|QL queries can return. More complex queries In the next example, we're counting how many employees are speaking a given language by using STATS ... BY (not unlike GROUP BY in SQL). And then we sort the result with the languages column using SORT : Unlike with CSV, we did not have to specify any types, as Arrow data already includes types. Here's the result: 21 employees speak 5 languages, wow! And 10 employees did not declare any spoken language. The missing value is denoted by , which is consistently used for missing data with the PyArrow backend. If we had used the NumPy backend instead, this column would have been converted to floats and the missing value would have been a confusing NaN , as NumPy integers don't have any sentinel value for missing data . Queries with parameters Finally, suppose that you want to expand the query from the previous section to only consider employees that speak N or more languages, with N being a variable parameter. For this we can use ES|QL's built-in support for parameters , which eliminates the risk of an injection attack associated with manually assembling queries with variable parts: which prints the following: Conclusion As we saw, ES|QL's native Arrow support makes working with Pandas and other DataFrame libraries even nicer than using CSV and it will continue to improve over time, with the multi-value support coming in a future version of Elasticsearch. Additional resources If you want to learn more about ES|QL, the ES|QL documentation is the best place to start. You can also check out this other Python example using Boston Celtics data . To know more about the Python Elasticsearch client itself, you can refer to the documentation , ask a question on Discuss with the language-clients tag or open a new issue if you found a bug or have a feature request. Thank you! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
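Pulling the Arrow export steps described above into one place, here is a hedged end-to-end sketch. It assumes the Python client returns a PyArrow table when format="arrow" is requested, as outlined in the article, and that the pandas and pyarrow extras are installed; the query mirrors the employees example, with the multi-valued columns dropped as discussed.

```python
# Hedged sketch: ES|QL Arrow export converted to a Pandas dataframe.
import pandas as pd
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

table = client.esql.query(
    query="""
    FROM employees
    | DROP is_rehired, job_positions, salary_change
    | LIMIT 500
    """,
    format="arrow",  # ask for the Apache Arrow streaming format
)
# to_pandas comes from PyArrow; ArrowDtype keeps a PyArrow-backed dataframe.
df = table.to_pandas(types_mapper=pd.ArrowDtype)
print(df.head())
```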
This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL to Pandas dataframes in Python Importing test data Converting dataset to a Pandas DataFrame object More complex queries Queries with parameters Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "From ES|QL to native Pandas dataframes in Python - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-pandas-native-dataframes-python", + "meta_description": "Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch piped query language, ES|QL, now generally available Elasticsearch Query Language (ES|QL) is now GA. Explore ES|QL's capabilities, learn about ES|QL in Kibana and discover future advancements. ES|QL How To CL GK By: Costin Leau and George Kobar On June 5, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Today, we are pleased to announce the general availability of ES|QL (Elasticsearch Query Language), a dynamic language designed from the ground up to transform, enrich, and simplify data investigations. Powered by a new query engine , ES|QL delivers advanced search using simple and familiar query syntax with concurrent processing, enhancing speed and efficiency regardless of the data source and structure. With ES|QL's piped syntax, users can easily chain multiple operations, simplifying complex data investigations and making querying more intuitive and iterative. 
To security and observability users, ES|QL will feel both familiar and innovative for exposing Elasticsearch's advanced search capabilities with an easy-to-use query language. Integrated with Kibana, ES|QL enhances the data visualization and analysis experience enabling users to conduct their entire investigation on one screen, without switching between multiple windows. With continuous development, we aim to establish ES|QL as a versatile language for all Elasticsearch use cases, including retrieval augmented generation (RAG). Integrating RAG with geospatial capabilities and ES|QL will enhance query accuracy from diverse data sources. The combination of ES|QL and the new Search AI Lake architecture provides enhanced scalability, cost efficiency, and simplified management by automatically adjusting resources based on demand. Decoupling compute from storage and index from search improves performance and flexibility, ensuring faster data retrieval and investigations across vast amounts of data. ES|QL will be a differentiator for teams facing increasing observability and security demands. This article will dive into the various benefits and ways you can use ES|QL for your own use cases. Advancements in Elasticsearch For over 14 years, QueryDSL has served as the foundational language in Elasticsearch, delivering search , observability , and security to numerous organizations . As user needs evolved, it became clear that they required more than what QueryDSL alone could provide. They sought a query language that could not only simplify and streamline data investigations but also enhance the querying experience by integrating searching, enrichment, aggregation, and visualization into a singular, efficient interface. They desired advanced search capabilities, including lookups with concurrent processing to handle vast data volumes from varied sources and structures. In response, we developed the Elasticsearch Query Language (ES|QL), drawing inspiration from vectorized query execution and other database technologies. With ES|QL, users can utilize a familiar pipe ('|') syntax to chain operations, allowing for transformative and detailed data analysis. Powered by a robust query engine, ES|QL offers advanced search capabilities with concurrent processing across cores and nodes, enabling users to query across diverse data sources and structures seamlessly. There is no translation or transpilation to Query DSL; each ES|QL query is parsed, analyzed, and validated for semantics and optimized into an execution plan executed in parallel on the relevant nodes holding the data. The target nodes handle the query, making on-the-fly adjustments to the execution plan using the framework provided by ES|QL. The result is lightning-fast queries that you get out of the box. The road to GA Since its introduction in 8.11 , ES|QL has been on a journey of refinement and enhancement. The beta phase allowed our engineering team to gather valuable feedback from the community, enabling us to iterate and address the top needs of our users. Throughout this process, we enhanced ES|QL's capabilities while ensuring stability, performance, and seamless integration into core data exploration and visualization UX and workflows you use daily. Here are some features that brought ES|QL to general availability. Stability and performance We have been busy enhancing the dedicated ES|QL query engine to ensure it maintains robust performance under load, safeguarding the stability of the running node. 
To wit, see below the improvements in grouping in the last 6 months (for more tests and exact details about the underlying change see the dedicated benchmark page). Additionally, we've implemented memory tracking for precise resource management and conducted thorough stress tests, including the rigorous HeapAttack , to ensure that memory usage is carefully monitored during resource-intensive queries. Our circuit breakers are also in place to prevent OutOfMemoryErrors (OOMEs) on large and small heap sizes nodes. Visualize data in Kibana Discover in a whole new way with ES|QL ES|QL together with Elastic AI assistant We are excited about bringing generative AI and ES|QL together by first integrating them into the Observability and Security AI assistant, allowing users to input natural language translated into ES|QL commands for an easy, iterative, and smooth workflow. Visualize and perform ES|QL queries or edit them using the inline editing flyout, and seamlessly embed them into dashboards. This enhancement shortens the workflow by allowing in-line visualization editing when creating charts, making it easier for users to manage and save their visualizations directly within the assistant. Delivering significant improvements in query generation and performance. Users can now use natural language to visualize ES|QL queries, edit them using the inline editing flyout, and seamlessly embed them into dashboards. This enhancement shortens the workflow by allowing in-line visualization editing when creating charts, making it easier for users to manage and save their visualizations directly within the assistant. Create and edit ES|QL charts directly from the Kibana dashboard Streamline your workflow and deliver quick insights into your data by creating and modifying charts built with ES|QL directly from within the Kibana Dashboard . You can also perform inline editing of the ES|QL query while in the chart to adapt to changes in troubleshooting or threat hunting quickly. ES|QL query history It can be frustrating to repeat yourself and equally annoying if you need to rerun a query you executed a few moments ago. Now with ES|QL, you can quickly access recent queries with ES|QL query history. View, re-run your last 20 ES|QL queries directly within Kibana Discover , ES|QL visualizations , Kibana alerts , or Kibana maps for quick and easy access. Hybrid planning and dynamic data reduction For large Elasticsearch deployments, we have been testing ES|QL across hundreds of nodes and up to hundreds of thousands of shards and fields to ensure that query performance consistently remains performant as the cluster grows and more nodes are added. We have extended ES|QL ability to perform hybrid planning to better deal with the dynamic nature of the data (whether it’s new fields added or new segments) and exploit the local data patterns particular to each node: After the coordinating node (that receives the ES|QL query and drives its execution) performs global planning based on the global view of the data, it broadcasts the plan to all data nodes that can execute the plan. However, before executing, each node changes the plan locally based on the actual storage statistics individual to each node. A common scenario is early filter evaluation in sparse mappings due to the schema evolution. 
We are proactively developing a dynamic data reduction technique for scenarios with large shard sizes that minimize I/O traffic between the coordinator and data nodes, as well as reducing the duration that Lucene readers remain open during queries. This approach, which includes sharing intermediate results, shows great promise in enhancing the efficiency and runtime of queries across multiple shards. Stay tuned for more information about query execution and architecture in future blogs. Async querying Async querying empowers users to run long-running ES|QL queries asynchronously. Clients no longer have to wait idly for results; instead, they can monitor progress and retrieve data once it's ready. By utilizing the wait_for_completion_timeout parameter, users can tailor their experience, choosing whether to wait synchronously or switch to asynchronous mode after a specified timeout. This enhancement not only offers greater flexibility but also optimizes resource management, ensuring a smoother and more efficient querying process for our users Long-running ES|QL queries can be executed asynchronously so the client can monitor the progress and retrieve the results when available instead of blocking for them: Through the wait_for_completion_timeout clients can pick a comfortable timeout to wait for the result (and have synchronous behavior) before switching to an asynchronous one. Improved language and ergonomics We've streamlined the STATS command to offer greater flexibility and simplicity in data analysis. Previously, users had to resort to additional EVAL commands for arbitrary computations alongside aggregations and groupings which required a separate EVAL command: This restriction is no longer necessary as aggregations accept expressions (and themselves can also be combined) directly inside the STATS command, eliminating the need for extra EVALs and column pollution due to temporary fields: Date time units ES|QL now boasts improved support for datetime filtering. Recognizing the common need for date-time arithmetic in filtering tasks, ES|QL now supports abbreviated units, making queries more intuitive and efficient. For example, users can now easily specify date ranges using familiar abbreviations like 'year,' 'month,' and 'week.' This update simplifies query construction, enabling users to express datetime conditions more succinctly and accurately. Implicit data type conversion for string literals To minimize the friction of creating dedicated types (such as dates) from string declarations, ES|QL now performs implicit conversions of string constants to their target type by using the built-in conversion functions: Note that Only constants (or literals) are candidates for conversions, columns are ignored - the user has to use conversion functions for those explicitly. Converting string literals to their numeric equivalent is NOT supported, as these can be directly declared as such; that is “1” + 2 will throw an error, simply declare the expression as 1+2 instead. Native ES|QL clients While ES|QL is universally available through the _query REST endpoint, work is underway for offering rich, opinionated APIs for accessing ES|QL natively in various popular languages. While completing all the items above will take several releases, one can use ES|QL already through the regular Elasticsearch clients , for example, to access ES|QL results as Java or PHP objects and manipulate them as dataframes in Python ; Jupyter users should refer to the dedicated getting started guide notebook . 
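Putting a few of these language improvements together, here is a hedged sketch of a query that uses an expression directly inside STATS and compares a date field against a string literal; the sales index and its fields are invented for illustration.

```python
# Hedged sketch: expressions inside STATS and implicit string-to-date
# conversion for literals. Abbreviated units such as NOW() - 1 week also
# work in the WHERE clause.
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")

query = """
FROM sales
| WHERE order_date > "2025-01-01"
| STATS revenue = SUM(price * quantity), orders = COUNT(*) BY category
| SORT revenue DESC
| LIMIT 10
"""
resp = client.esql.query(query=query)
for row in resp["values"]:
    print(row)
```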
Since the initial release as technical preview in 8.11, ES|QL has been making its way through various parts of the Elasticsearch ecosystem. Such as observability where it is used to streamline OTel operations using a specialized AI assistant . And if we had more time, we’d also mention the many other functions introduced, like multi-value scalar fields, geo-spatial analysis (both scalar and aggregate functions) and date time handling. ES|QL in cross-cluster search in technical preview Cross-cluster search in Elasticsearch enables users to query data across multiple Elasticsearch clusters as if it were stored in a single cluster, delivering unified querying, global insights, and many other efficiencies. Now, in technical preview, ES|QL with cross-cluster search capabilities extends its querying power to span across distributed clusters, empowering users to leverage ES|QL for querying and analyzing data regardless of its location all from a single UI. While ES|QL is available as a basic license at no cost, using ES|QL in cross cluster search will require an Enterprise level license. To use ES|QL in cross-cluster search, use the FROM command with the format :, to retrieve data from my-index-000001 on the remote cluster. Looking to the future Search, embeddings and RAG We are thrilled to share an exciting development: leveraging ES|QL for advanced information retrieval, including full-text search and AI/ML-powered exploration. Our team is dedicated to making ES|QL the optimal tool for scoring, hybrid ranking, and integrating with Large Language Models (LLMs) within Elasticsearch. This dedicated command will streamline the retrieval process, enabling users to filter and score results. In the below example, we showcase a comprehensive search scenario, combining range filters, fast queries, and hybrid search techniques. This is a preview of how it might look like, naming TBD (SEARCH or RETRIEVAL): For instance, the query above demonstrates retrieving the top 5 most popular images by rating, featuring the terms 'mountain lake' in their description and resembling a user-defined image vector. Behind the scenes, the engine intelligently manages filters, rearranges queries, and applies reranking strategies, ensuring optimal search performance. This advancement promises to revolutionize information retrieval in Elasticsearch, offering users unparalleled control and efficiency in exploring and discovering relevant content. Timeseries, metrics and O11y Elasticsearch provides a dedicated solution for metrics through the timeseries data streams (TSDS), a powerful concept that can reduce disk storage by up to 70% by using specialized types and routing. We plan on leveraging fully these capabilities in ES|QL - first by introducing a dedicated command: Inline stats - aggregations without data reduction The STATS command in ES|QL is invaluable for summarizing statistics, but it often poses a challenge when users want to aggregate data without losing its original context. For instance, if you wish to display the average category price alongside each individual t-shirt price, traditional aggregation methods can obscure the original data. Enter INLINESTATS: a feature designed to address this issue by performing 'inline' statistics. With INLINESTATS, users can compute statistics within each group and seamlessly integrate the results back into the original dataset, preserving the context of the originating groups. 
This powerful capability enhances the clarity and depth of statistical analysis in ES|QL, empowering users to derive meaningful insights while maintaining the integrity of their data. Get started today The introduction of ES|QL marks a significant stride forward in Elastic's capabilities, offering users a powerful and intuitive tool for data querying and analysis. With its streamlined syntax, robust functionality, and innovative features, ES|QL opens up new avenues for users to unlock insights and derive value from their data. Whether you're a seasoned Elasticsearch user or just getting started, ES|QL invites you to explore, experiment, and experience the power of Elasticsearch Query Language firsthand. Be sure to check out our demo playground full of examples or try on Elastic Cloud . Already have Elasticsearch running? Just upgrade your clusters to 8.14 and give it a try. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Advancements in Elasticsearch The road to GA Stability and performance Visualize data in Kibana Discover in a whole new way with ES|QL ES|QL together with Elastic AI assistant Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch piped query language, ES|QL, now generally available - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-piped-query-language-goes-ga", + "meta_description": "Elasticsearch Query Language (ES|QL) is now GA. 
Explore ES|QL's capabilities, learn about ES|QL in Kibana and discover future advancements." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog ES|QL queries to Java objects Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples. ES|QL Java How To LT By: Laura Trotta On May 2, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. ES|QL overview ES|QL is a new query language introduced by Elasticsearch that combines a simplified syntax with the pipe operator to enable users to intuitively extrapolate and manipulate data. The new version 8.13.0 of the official Java client introduced support for ES|QL queries, with a new API that allows for easy query execution and automatic translation of the results to java objects. How to perform ES|QL queries with the Java client Prerequisites Elasticsearch version >= 8.11.0 Java version >= 17 Ingesting data Before we start querying we need to have some data available: we're going to store this csv file into Elasticsearch by using the BulkIngester utility class available in the Java client. The csv lists books from the Amazon Books Reviews dataset , categorizing them using the following header row: First of all, we have to create the index to map the fields correctly: Then the Java class for the books: We're going to use Jackson's CSV mapper to read the file, so let's configure it: Then we'll read the csv file line by line and optimize the ingestion using the BulkIngester: The indexing will take around 15 seconds, but when it's done we'll have the books index filled with ~80K documents, ready to be queried. ES|QL Now it's time to extract some information from the books data. Let's say we want to find the latest reprints of Asimov's works: Thanks to the ObjectsEsqlAdapter using Book.class as the target, we can ignore what the json result of the ES|QL query would be, and just focus on the more familiar list of books that is automatically returned by the client. For those who are used to SQL queries and the JDBC interface, the client also provides the ResultSetEsqlAdapter , which can be used in the same way and instead returns a java.sql.ResultSet Another example, we now want to find out the top-rated books from Penguin Books: The Java code to retrieve the data stays the same since the result is again a list of books. There are exceptions of course, for example if a query uses the eval command to add a new column, the Java class should be modified to represent the new result. The full code for this article can be found in the official client repository . Feel free to reach out on Discuss for any questions or issues. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL overview How to perform ES|QL queries with the Java client Prerequisites Ingesting data ES|QL Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "ES|QL queries to Java objects - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/esql-queries-to-java-objects", + "meta_description": "Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Vector search introduction and implementation This series dives into the intricacies of vector search, how it is implemented in Elasticsearch, and how to run hybrid search queries in Elasticsearch. Part1 Vector Database February 6, 2025 A quick introduction to vector search This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. VC By: Valentin Crettaz Part2 Vector Database How To February 10, 2025 How to set up vector search in Elasticsearch Learn how to set up vector search and execute k-NN searches in Elasticsearch. VC By: Valentin Crettaz Part3 Vector Database How To February 17, 2025 Elasticsearch hybrid search Learn about hybrid search, the types of hybrid search queries Elasticsearch supports, and how to craft them. VC By: Valentin Crettaz Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Vector search introduction and implementation - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/vector-search-introduction-and-implementation", + "meta_description": "This series dives into the intricacies of vector search, how it is implemented in Elasticsearch, and how to run hybrid search queries in Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Lucene Wrapped 2024 2024 has been another major year for Apache Lucene. In this blog, we’ll explore the key highlights. Lucene CH By: Chris Hegarty On January 3, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Apache Lucene has seen significant activity in 2024, with numerous releases including the first major update in three years, packed with exciting improvements and new features. Let’s explore some of the key highlights. Lucene & the community A project is only as strong as the community that supports it. Despite more than 20 years of development, the Lucene project remains vibrant and thrives thanks to its passionate and active contributors. In 2024, the Lucene project has seen more than 2,000 commits from 98 unique contributors, and almost 800 pull requests. The number of contributors continues to grow, with new committers and PMC members joining the project and helping drive its success. Lucene 10 2024 saw the first major release in almost 3 years - Lucene 10, with more than 2,000 commits from 185 unique contributors. While the development model that Lucene follows allows to deliver many improvements and features in minor releases, a major release affords the opportunity to bring larger features and modernizations. For example, Lucene 10 requires a minimum of Java 21. Bumping the minimum Java version ensures that Lucene can continue to take advantage of improvements that modern Java provides. The primary focus of Lucene 10 is to better utilize the hardware on which it runs. Let's take a quick look at some of the main highlights: More search parallelism - while search execution is already parallelized across segments, we now go further, parallelizing within segments. This decouples on-disk representation from the execution performance, allowing even single segments to benefit from the number of cores on modern systems. Better I/O parallelism - the straightforward synchronous I/O model that Lucene uses has been enhanced with a prefetch stage. This informs the OS that a region of an index file will be needed in the very near future, while not blocking the calling thread. Better CPU and storage efficiency with sparse indexing - Lucene 10 introduces support for sparse indexing, sometimes called primary-key indexing or zone indexing in other data stores. For more information about Lucene 10, check out the dedicated article on Lucene 10. Lucene research and innovation In 2024, Lucene has seen a surge of research and innovation, particularly in the areas of machine learning integration, vector search, and optimization for large-scale datasets, with reference form 10 separate research papers and publications . 
Some of the key research areas and developments include: Vector Search and Embedding Support - Lucene provides a powerful and scalable solution for vector-based search, enabling semantic retrieval at scale. By leveraging Lucene's robust indexing and search infrastructure, users can combine the best of traditional text search with the advanced capabilities of modern vector search, making Lucene a comprehensive solution for a wide range of search and information retrieval tasks. Hybrid Search Models - Research has also delved into hybrid search techniques, where Lucene combines traditional keyword-based search with modern vector-based retrieval. By merging term-based indexes with dense vector representations, Lucene can deliver more accurate and contextually relevant search results, bridging the gap between the precision of traditional search engines and the flexibility of semantic search. The ongoing research efforts in 2024 demonstrate Lucene’s adaptability to the evolving needs of modern search technologies, particularly in the context of AI, semantic search, and big data applications. The project continues to grow as a powerful, flexible, and efficient platform for both traditional and cutting-edge search use cases. 2024 Lucene releases Although not an exact reflection, the sheer volume of releases highlights the ongoing dedication and energy of the community. These updates include major enhancements to vector search performance and efficiency, support for madvise, optimizations for postings list decoding, further speed improvements through SIMD, and much more. Here’s the full list of releases: 10.1.0 (2024-12-20) 9.12.1 (2024-12-13) 10.0.0 (2024-10-14) 9.12.0 (2024-09-28) 8.11.4 (2024-09-24) 9.11.1 (2024-06-27) 9.11.0 (2024-06-06) 9.10.0 (2024-02-20) 8.11.3 (2024-02-08) 9.9.2 (2024-01-29) You can find more information and release notes at the Lucene Core page. Additionally, there are equivalent PyLucene releases. Wrapping up As Lucene matures, it continues to flourish thanks to its dedicated and vibrant community. As we’ve seen, 2024 has been an incredibly productive year, and we now look ahead to the exciting developments that 2025 will bring. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). 
BT By: Benjamin Trent Jump to Lucene & the community Lucene 10 Lucene research and innovation 2024 Lucene releases Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Lucene Wrapped 2024 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/apache-lucene-wrapped-2024", + "meta_description": "Explore the key highlights, improvements and features of Apache Lucene, including an overview of Lucene 10." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial .NET Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Generative AI .NET +3 December 6, 2024 How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development Microsoft Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. With the release of Semantic Kernel Elasticsearch Vector Store Connector, developers using Semantic Kernel for building AI agents can now plugin Elasticsearch as a scalable enterprise-grade vector store while continuing to use Semantic Kernel abstractions. FB SM By: Florian Bernd and Srikanth Manvi Developer Experience .NET October 15, 2024 NEST lifetime extended & Elastic.Clients.Elasticsearch (v8) Roadmap Announcing the extension of the NEST (v7) lifetime and providing a high level overview of the Elastic.Clients.Elasticsearch (v8) roadmap. FB By: Florian Bernd Vector Database .NET +2 October 9, 2024 Building a search app with Blazor and Elasticsearch Learn how to build a search application using Blazor and Elasticsearch, and how to use the Elasticsearch .NET client for hybrid search. GL By: Gustavo Llermaly .NET How To April 16, 2024 Elasticsearch .NET client evolution: From NEST to Elastic.Clients.Elasticsearch Learn about the evolution of the Elasticsearch .NET client and the transition from NEST to Elastic.Clients.Elasticsearch. FB By: Florian Bernd Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": ".NET - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/dot-net-programming", + "meta_description": ".NET articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to set up vector search in Elasticsearch Learn how to set up vector search and execute k-NN searches in Elasticsearch. Vector Database How To VC By: Valentin Crettaz On February 10, 2025 Part of Series Vector search introduction and implementation Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This article is the second in a series of three that dives into the intricacies of vector search also known as semantic search and how it is implemented in Elasticsearch. The first part was focused on providing a general introduction to the basics of embeddings (aka vectors) and how vector search works under the hood. Armed with all the vector search knowledge learned in the first article, this second part will guide you through the meanders of how to set up vector search and execute k-NN searches in Elasticsearch. In the third part , we will leverage what we learned in the first two parts and build upon that knowledge by delving into how to craft powerful hybrid search queries in Elasticsearch. Some background first Even though Elasticsearch did not support vector search up until version 8.0 with the technical preview of the _knn_search API endpoint, it has been possible to store vectors using the dense_vector field type since the 7.0 release. At that point, vectors were simply stored as binary doc values but not indexed using any of the algorithms that we presented in our first article . Those dense vectors constituted the premises of the upcoming vector search features in Elasticsearch. If you’re interested in diving more into the discussions that led to the current implementation of vector search in Elasticsearch, you can refer to this issue , which details all the hurdles that Elastic had to jump over in order to bring this feature to market. Very briefly, since Elasticsearch already made heavy use of Lucene as their underlying search engine, we also decided to utilize the same technology as our vector engine, and we explained the rationale behind that decision in a very transparent way. With history matters out of the way, let’s now get to work. How to set up k-NN Vector search is available natively in Elasticsearch, and there’s nothing specific to install. We only need to create an index that defines at least one field of type dense_vector , which is where your vector data will be stored and/or indexed. The mapping below shows a dense_vector field called title_vector of dimension 3. Dense vectors stored and indexed in this field will use the dot_product similarity function that we introduced in the first article in this series. 
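For illustration, here is a minimal sketch of such a mapping created through the Python client. The index name, connection details and the extra title and price fields are assumptions used for the examples that follow, not the exact snippet from this article.

    from elasticsearch import Elasticsearch

    es = Elasticsearch('http://localhost:9200')  # assumed local cluster

    # A 3-dimensional dense_vector field, indexed with dot_product similarity
    es.indices.create(
        index='my-vector-index',  # hypothetical index name
        mappings={
            'properties': {
                'title': {'type': 'text'},
                'price': {'type': 'integer'},
                'title_vector': {
                    'type': 'dense_vector',
                    'dims': 3,
                    'index': True,
                    'similarity': 'dot_product',
                },
            }
        },
    )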
It is worth noting that up to 8.11 hnsw (i.e., Hierarchical Navigable Small Worlds ) was the only algorithm supported by Apache Lucene for indexing dense vectors. Since then, other algorithms have been added and in the future, Elasticsearch might provide additional methods for indexing and searching dense vectors, but since it fully relies on Apache Lucene that will depend on what unfolds on that front . The table below summarizes all available configuration parameters for the dense_vector field type provided by Elasticsearch: Table 1: The different configuration parameters for dense vectors Parameter Required Description dims Yes (<8.11) No (8.11+) The number of vector dimensions, which can’t exceed 1024 until 8.9.2, 2048 since 8.10.0 and 4096 since 8.11.0. Also, as of 8.11, this parameter is not required anymore and will default to the dimension of the first indexed vector. element_type No The data type of the vector element values. If unspecified, the default type is `float` (4 bytes), `byte` (1 byte) and `bit` are also available. index No Indicates whether to index vectors (if `true`) in a dedicated and optimized data structure or simply store them as binary doc values (if `false`). Until 8.10, the default value was `false` if not specified. As of 8.11, the default value is `true` if not specified. similarity Yes (<8.11) No (8.11+) Until 8.10, this parameter is required if `index` is `true` and defines the vector similarity metric to use for k-NN search. The available metrics are: a) `l2_norm`: L2 distance b) `dot_product`: dot product similarity c) `cosine`: cosine similarity d) `max_inner_product`: maximum inner product similarity. Also note that `dot_product` should be used only if your vectors are already normalized (i.e., they are unit vectors with magnitude 1), otherwise use `cosine` or `max_inner_product`. As of 8.11, if not specified, this parameter defaults to `l2_norm` if element_type is `bit` and to `cosine` otherwise. index_options No Here are the possible values for the `type` parameter depending on the version: a) Up until 8.11, only `hnsw` was supported. b) In 8.12, scalar quantization enabled `int8_hnsw` c) In 8.13, `flat` was added along with its scalar-quantized `int8_flat` sibling d) In 8.15, `int4_hnsw` and `int4_flat` were added e) In 8.18, binary quantization enabled `bbq_hnsw` and `bbq_flat`. You can check the official documentation to learn about their detailed description (https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params) and how each algorithm can be configured. As we can see in the above table, since the version 8.11 the definition of vector fields has been drastically simplified: Regarding the support for scalar quantization added in 8.12, remember we talked about this compression technique in the first part of this series. We won’t dig deeper in this article, but you can learn more about how this was implemented in Lucene in another Elastic Search Labs article . Similarly, we won’t dive into better binary quantization (BBQ) added in 8.18, and we invite you to learn more about that new groundbreaking algorithm in this article . That’s all there is to it! By simply defining and configuring a dense_vector field, we can now index vector data in order to run vector search queries in Elasticsearch using either the knn search option or the knn DSL query (introduced in 8.12). 
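Continuing the hypothetical sketch above, indexing vector data is just a regular index request in which the vector values come from whatever embedding model you use. The values below are made up, and they are normalized in code because dot_product expects unit-length vectors.

    import math

    def normalize(vector):
        # dot_product similarity requires unit-length vectors
        norm = math.sqrt(sum(v * v for v in vector))
        return [v / norm for v in vector]

    docs = [
        {'title': 'Vector search', 'price': 23, 'title_vector': normalize([0.12, 0.56, 0.82])},
        {'title': 'Lexical search', 'price': 124, 'title_vector': normalize([0.33, 0.41, 0.85])},
    ]
    for i, doc in enumerate(docs):
        es.index(index='my-vector-index', id=str(i + 1), document=doc)
    es.indices.refresh(index='my-vector-index')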
Elasticsearch supports two different vector search modes: 1) exact search using the script_score query and 2) approximate nearest neighbor search using the knn search option or the knn query (8.12+). We’re going to describe both modes next. Exact search If you recall from the first article of this series, where we reviewed the vector search landscape, an exact vector search simply boils down to performing a linear search, or brute-force search, across the full vector space. Basically, the query vector will be measured against each stored vector in order to find the closest neighbors. In this mode, the vectors do not need to be indexed in an HNSW graph but simply stored as binary doc values, and the similarity computation is run by a custom Painless script. First, we need to define the vector field mapping in a way that the vectors are not indexed, and this can be done by specifying index: false and no similarity metric in the mapping: The advantage of this approach is that vectors do not need to be indexed, which drastically lowers the ingestion time since there’s no need to build the underlying HNSW graph. However, depending on the size of the data set and your hardware, search queries can slow down pretty quickly as your data volume grows, since the more vectors you add, the more time is needed to visit each one of them (i.e., linear search has an O(n) complexity). With the index being created and the data being loaded, we can now run an exact search using the following script_score query: As you can see, the script_score query is composed of two main elements, namely the query and the script . In the above example, the query part specifies a filter (i.e., price >= 100 ), which restrains the document set against which the script will be executed. If no query was specified, it would be equivalent to using a match_all query, in which case the script would be executed against all vectors stored in the index. Depending on the number of vectors, the search latency can increase substantially. Since vectors are not indexed, there’s no built-in algorithm that will measure the similarity of the query vector with the stored ones, this has to be done through a script, and luckily for us, Painless provides most of the similarity functions that we’ve learned so far, such as: l1norm(vector, field) : L1 distance (Manhattan distance) l2norm(vector, field) : L2 distance (Euclidean distance) hamming(vector, field) : Hamming distance cosineSimilarity(vector, field) : Cosine similarity dotProduct(vector, field): Dot product similarity Since we’re writing a script, it is also possible to build our own similarity algorithm . Painless makes this possible by providing access to doc[].vectorValue , which allows iterating over the vector array, and doc[].magnitude , which returns the length of the vector. To sum up, even though exact search doesn’t scale well, it might still be suitable for certain very small use cases, but if you know your data volume will grow over time, you need to consider resorting to k-NN search instead. That’s what we’re going to present next. Approximate k-NN search Most of the time, this is the mode you’re going to pick if you have a substantial amount of data and need to implement vector search using Elasticsearch. Indexing latency is a bit higher since Lucene needs to build the underlying HNSW graph to store and index all vectors. 
It is also a bit more demanding in terms of memory requirements at search time, and the reason it’s called “approximate” is because the accuracy can never be 100% like with exact search. Despite all this, approximate k-NN offers a much lower search latency and allows us to scale to millions, or even billions of vectors, provided your cluster is sized appropriately. Let’s see how this works. First, let’s create a sample index with adequate vector field mapping to index the vector data (i.e., index: true + specific similarity ) and load it with some data: Simple k-NN search After running these two commands, our vector data is now properly indexed in a scalar-quantized HNSW graph and ready to be searched. Up until 8.11, the only way to run a simple k-NN search was by using the knn search option, located at the same level as the query section you’re used to, as shown in the query below: In the above search payload, we can see that there is no query section like for lexical searches, but a knn section instead. We are searching for the two ( k: 2 ) nearest neighboring vectors to the specified query vector. From 8.12 onwards, a new knn search query has been introduced to allow for more advanced hybrid search use cases, which is a topic we are going to handle in our next article. Unless you have the required expertise to combine k-NN queries with other queries, Elastic recommends sticking to the knn search option, which is easier to use. The first thing to note is that the new knn search query doesn’t have a k parameter, instead it uses the size parameter like any other queries. The second notable thing is that the new knn query enables to post-filter k-NN search results by leveraging a bool query that combines one or more filters with the knn search query, as shown in the code below: The above query first retrieves the top 3 documents having the nearest neighboring vectors and then filters out the ones whose price is smaller than 100. It is worth noting that with this kind of post-filtering, you might end up with no results at all if your filters are too aggressive. Also note that this behavior is different from the usual boolean full-text queries, where the filter part is executed first in order to reduce the document set that needs to be scored. If you are interested to learn more about the differences between the knn top-level search option and the new knn search query, you can head over to another great Search Labs article for more details. Let’s now move on and learn more about the num_candidates parameter. The role of num_candidates is to increase or decrease the likelihood of finding the true nearest neighbor candidates. The higher that number, the slower the search but also the more likely the real nearest neighbors will be found. As many as num_candidates vectors will be considered on each shard, and the top k ones will be returned to the coordinator node, which will merge all shard-local results and return the top k vectors from the global results as illustrated in Figure 1, below: Figure 1: Nearest neighbor search accuracy using num_candidates Vectors id4 and id2 are the k local nearest neighbors on the first shard, respectively id5 and id7 on the second shard. After merging and reranking them, the coordinator node returns id4 and id5 as the two global nearest neighbors for the search query. 
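Putting the pieces together, here is a minimal sketch of the knn search option with the Python client, reusing the hypothetical index and field from the earlier sketches. The query vector is a made-up unit vector, k asks for the 2 nearest neighbors and num_candidates controls how many candidates are considered on each shard.

    resp = es.search(
        index='my-vector-index',
        knn={
            'field': 'title_vector',
            'query_vector': [0.6, 0.8, 0.0],  # placeholder unit-length query vector
            'k': 2,
            'num_candidates': 10,
        },
        source=False,
        fields=['title', 'price'],
    )
    for hit in resp['hits']['hits']:
        print(hit['_score'], hit['fields']['title'])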
Several k-NN searches If you have several vector fields in your index, it is possible to send several k-NN searches as the knn section also accepts an array of queries, as shown below: As we can see, each query can take a different value of k as well as a different boost factor. The boost factor is equivalent to a weight, and the total score will be a weighted average of both scores. Filtered k-NN search Similarly to what we saw with the script_score query earlier, the knn section also accepts the specification of a filter in order to reduce the vector space on which the approximate search should run. For instance, in the k-NN search below, we’re restricting the search to only documents whose price is greater than or equal to 100. Now, you might wonder whether the data set is first filtered by price and then the k-NN search is run on the filtered data set (pre-filtering) or the other way around, i.e., the nearest neighbors are first retrieved and then filtered by price (post-filtering). It’s a bit of both actually. If the filter is too aggressive, the problem with pre-filtering is that k-NN search would have to run on a very small and potentially sparse vector space and would not return very accurate results. Whereas post-filtering would probably weed out a lot of high-quality nearest neighbors. So, even though the knn section filter is considered as a pre-filter, it works during the k-NN search in such a way as to make sure that at least k neighbors can be returned. If you’re interested in the details of how this works, you can check out the following Lucene issue dealing with this matter. Filtered k-NN search with expected similarity In the previous section, we learned that when specifying a filter, we can reduce the search latency, but we also run the risk of drastically reducing the vector space to vectors that are partly or mostly dissimilar to the query vector. In order to alleviate this problem, k-NN search makes it possible to also specify a minimum similarity value that all returned vectors are expected to have. Reusing the previous query, it would look like this: Basically, the way it works is that the vector space will be explored by skipping any vector that either doesn’t match the provided filter or that has a lower similarity than the specified one up until the k nearest neighbors are found. If the algorithm can’t honor at least k results (either because of a too-restrictive filter or an expected similarity that is too low), a brute-force search is attempted instead so that at least k nearest neighbors can be returned. A quick word concerning how to determine that minimum expected similarity. It depends on which similarity metric you’ve chosen in your vector field mapping. If you have picked l2_norm , which is a distance function (i.e., similarity decreases as distance grows), you will want to set the maximum expected distance in your k-NN query, that is, the maximum distance that you consider acceptable. In other words, a vector having a distance between 0 and that maximum expected distance with the query vector will be considered “close” enough to be similar. If you have picked dot_product or cosine instead, which are similarity functions (i.e., similarity decreases as the vector angle gets wider), you will want to set a minimum expected similarity. A vector having a similarity between that minimum expected similarity and 1 with the query vector will be considered “close” enough to be similar. 
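A sketch of what such a filtered k-NN search with a minimum expected similarity could look like, using the price filter and the 0.975 threshold from this discussion. It assumes the field was mapped with the cosine similarity, and the index, field and query vector are the hypothetical ones used above.

    resp = es.search(
        index='my-vector-index',
        knn={
            'field': 'title_vector',
            'query_vector': [0.6, 0.8, 0.0],
            'k': 2,
            'num_candidates': 10,
            'filter': {'range': {'price': {'gte': 100}}},
            'similarity': 0.975,  # skip vectors less similar than this, even if they match the filter
        },
    )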
Applied to the sample filtered query above and the sample data set that we have indexed earlier, Table 1, below, summarizes the cosine similarities between the query vector and each indexed vector. As we can see, vectors 3 and 4 are selected by the filter (price >= 100), but only vector 3 has the minimum expected similarity (i.e., 0.975) to be selected. Table 2: Sample filtered search with expected similarity Vector Cosine similarity Price 1 0.8473 23 2 0.5193 9 3 0.9844 124 4 0.9683 1457 Limitations of k-NN Now that we have reviewed all the capabilities of k-NN searches in Elasticsearch, let’s see the few limitations that you need to be aware of: Up until 8.11, k-NN searches cannot be run on vector fields located inside nested documents. From 8.12 onwards, this limitation has been lifted. However, such nested knn queries do not support the specification of a filter. The search_type is always set to dfs_query_then_fetch , and it is not possible to change it dynamically. The ccs_minimize_roundtrips option is not supported when searching across different clusters with cross-cluster search. This has been mentioned a few times already, but due to the nature of the HNSW algorithm used by Lucene (as well as any other approximate nearest neighbors search algorithms for that matter), “approximate” really means that the k nearest neighbors being returned are not always the true ones. Tuning k-NN As you can imagine, there are quite a few options that you can use in order to optimize the indexing and search performance of k-NN searches. We are not going to review them in this article, but we really urge you to check them out in the official documentation if you are serious about implementing k-NN searches in your Elasticsearch cluster. Beyond k-NN Everything we have seen so far leverages dense vector models (hence the dense_vector field type), in which vectors usually contain essentially non-zero values. Elasticsearch also provides an alternative way of performing semantic search using sparse vector models. Elastic has created a sparse NLP vector model called Elastic Learned Sparse EncodeR , or ELSER for short, which is an out-of-domain (i.e., not trained on a specific domain) sparse vector model that does not require any fine-tuning. It was pre-trained on a vocabulary of approximately 30000 terms, and being a sparse model, it means that vectors have the same number of values, most of which are zero. The way it works is pretty simple. At indexing time, the sparse vectors (term / weight pairs) are generated using the inference ingest processor and stored in fields of type sparse_vector , which is the sparse counterpart to the dense_vector field type. At query time, a specific DSL query also called sparse_vector will replace the original query terms with terms available in the ELSER model vocabulary which are known to be the most similar to them given their weights. We won’t dive deeper into ELSER in this article, but if you’re eager to discover how this works, you can check out this seminal article as well as the official documentation, which explains the topic in great detail. A quick glimpse into some upcoming related topics Elasticsearch also supports combining lexical search and vector search, and that will be the subject of the next and final article of this series. So far, we’ve had to generate the embeddings vectors outside of Elasticsearch and pass them explicitly in all our queries. Would it be possible to just provide the query text and a model would generate the embeddings on the fly? 
Well, the good news is that this is possible with Elasticsearch either by leveraging a construct called query_vector_builder (for dense vectors) or using the new semantic_text field type and semantic DSL query (for sparse vectors), and you can learn more about these techniques in this article . Let’s conclude In this article, we delved deeply into Elasticsearch vector search support. We first shared some background on Elastic’s quest to provide accurate vector search and why we decided to use Apache Lucene as our vector indexing and search engine. We then introduced the two main ways to perform vector search in Elasticsearch, namely either by leveraging the script_score query in order to run an exact brute-force search or by resorting to using approximate nearest neighbor search via the knn search option or the knn search query introduced in 8.12. We showed how to run a simple k-NN search and, following up on that, we reviewed all the possible ways of configuring the knn search option and query using filters and expected similarity and how to run multiple k-NN searches at the same time. To wrap up, we listed some of the current limitations of k-NN searches and what to be aware of. We also invited you to check out all the possible options that can be used to optimize your k-NN searches. If you like what you’re reading, make sure to check out the other parts of this series: Part 1: A Quick Introduction to Vector Search Part 3: Hybrid Search Using Elasticsearch Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Some background first How to set up k-NN Exact search Approximate k-NN search Simple k-NN search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to set up vector search in Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vector-search-set-up-elasticsearch", + "meta_description": "Learn how to set up vector search and execute k-NN searches in Elasticsearch.\n" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Speeding Up Multi-graph Vector Search Explore multi-graph vector search in Lucene and discover how sharing information between segment searches enhances search speed. Lucene MS TV By: Mayya Sharipova and Thomas Veasey On March 12, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog explores multi-graph vector search in Lucene and how sharing information among segment searches in multi-graph vector search allowed us to achieve significant search speedups. The previous state of multi-graph vector search in Lucene As we have described before Lucene's and hence Elasticsearch's approximate kNN search is based on searching an HNSW graph for each index segment and combining results from all segments to find the global k nearest neighbors. When it was first introduced a multi-graph search was done sequentially in a single thread, searching one segment after another. This comes with some performance penalty because searching a single graph is sublinear in its size. In Elasticsearch 8.10 we parallelized vector search , allocating up to a thread per segment in kNN vector searches, if there are sufficient available threads in the threadpool. Thanks to this change, we saw query latency drop to half its previous value in our nightly benchmark. Even though we were searching segment graphs in parallel, they were still independent searches, each collecting its own top k results unaware of progress made by other segment searches. We knew from our experience with the lexical search that we could achieve significant search speedups by exchanging information about best results collected so far among segment searches and we thought we could apply the same sort of idea for vector search. Speeding up multi-graph vector search by sharing information between segment searches When graph based indexes such as HNSW search for nearest neighbors to a query vector one can think of their strategy as a combination of exploration and exploitation. In the case of HNSW this is managed by gathering a larger top-n match set than the top-k which it will eventually return. The search traverses every edge whose end vector is competitive with the worst match found so far in the expanded set. This means it explores parts of the graph which it already knows are not competitive and will never be returned. However, it also allows the search to escape local minima and ultimately achieve better recall. By contrast a pure exploitation approach simply seeks to decrease the distance to the kth best match at every iteration and will only traverse edges whose end vectors will be added to the current top-k set. So the size of the expanded match set is a hyperparameter which allows one to trade run time for recall by increasing or decreasing exploration in the proximity graph. 
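To make that trade-off concrete, here is a toy Python sketch of this style of best-first graph search. It is not Lucene's actual implementation; the expanded match-set size, called ef below, plays the hyperparameter role just described: a larger value explores more of the graph and improves recall at the cost of more distance computations.

    import heapq

    def graph_search(query, entry_point, neighbors, dist, ef):
        # Toy best-first search over a proximity graph (illustrative only).
        # Nodes are assumed to be orderable (e.g. ints) so heap tie-breaks work.
        d0 = dist(query, entry_point)
        visited = {entry_point}
        frontier = [(d0, entry_point)]   # min-heap of nodes still to expand
        best = [(-d0, entry_point)]      # max-heap (negated) of the ef best matches so far
        while frontier:
            d, node = heapq.heappop(frontier)
            if len(best) >= ef and d > -best[0][0]:
                break                    # nothing left on the frontier is competitive
            for nb in neighbors(node):
                if nb in visited:
                    continue
                visited.add(nb)
                d_nb = dist(query, nb)
                # follow the edge only if its end vector beats the worst of the ef best matches
                if len(best) < ef or d_nb < -best[0][0]:
                    heapq.heappush(frontier, (d_nb, nb))
                    heapq.heappush(best, (-d_nb, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
        return sorted((-neg_d, n) for neg_d, n in best)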
As we discussed already, Lucene builds multiple graphs for different partitions of the data. Furthermore, at large scale, data must be partitioned and separate graphs built if one wants to scale retrieval horizontally over several machines. Therefore, a generally interesting question is \"how should one adapt this strategy in the case that several graphs are being searched simultaneously for nearest neighbors?\" Recall is significantly higher when one searches graphs independently and combines the top-k sets from each. This makes sense through the lens of exploration vs exploitation: the multi-graph search is exploring many more vectors and so is much less likely to get trapped in local minima for the similarity function. However, it pays a cost to do this in increased run time. Ideally, we would like recall to be more independent of the sharding strategy and search to be faster. There are two factors which impact the efficiency of search on multiple graphs vs a single graph: the edges which are present in the single graph and having multiple independent top-n sets. In general, unless the vectors are partitioned into disjoint regions the neighbors of a vector in each partition graph will only comprise a subset of the true nearest neighbors in the single graph. This means one pays a cost in exploring non-competitive neighbors when searching multiple graphs. Since graphs are built independently, one necessarily has to pay a “structural” cost from having several graphs. However, as we shall see we can mitigate the second cost by intelligently sharing state between searches. Given a shared global top-n set it is natural to ask how we should search portions of graphs that are uncompetitive, specifically, edges whose end vertices are further than the nth worst global match. If we were searching a single graph these edges would not be traversed. However, we have to bear in mind that the different searches have different entry points and progress at different rates, so if we apply the same condition to multi-graph search it is possible that the search will stop altogether before we visit its closest neighbors to the query. We illustrate this in the figure below. Figure 1 Two graph fragments showing a snapshot of a simultaneous search gathering the top-2 set. In this case if we were to prune edges whose unvisited end vertices are not globally competitive we would never traverse the red dashed edge and fail to find the best matches which are all in Graph 2. To avoid this issue we devised a simple approach that effectively switches between different parameterizations of each local search based on whether it is globally competitive or not. To achieve this, as well as the global queue which is synchronized periodically, we maintain two local queues of the distances to the closest vectors to the query found for the local graph. One has size n and the other has size ⌊g × n⌋. Here, g controls the greediness of non-competitive search and is some number less than 1. In effect, g is a free parameter we can use to control recall vs the speed up. As the search progresses we check two conditions when deciding whether to traverse an edge: i) would we traverse the edge if we were searching the graph alone, ii) is the end vertex globally competitive or is it locally competitive with the \"greedy\" best match set. 
Formally, if we denote the query vector q, the end vector of the candidate edge v_e, the nth local best match v_n, the ⌊g × n⌋th local best match v_g and the nth global best match v_gb, then this amounts to adding v_e to the search set if d(v_e, q) < d(v_n, q) AND (d(v_e, q) < d(v_g, q) OR d(v_e, q) < d(v_gb, q)). Here, d(·, ·) denotes the index distance metric. Note that this strategy ensures we always continue searching each graph to any local minimum and depending on the choice of g we still escape some local minima. Modulo some details around synchronization, initialization and so on, this describes the change to the search. As we show this simple approach yields very significant improvements in search latency together with recall which is closer to, but still better than, single graph search. Impact on performance Our nightly benchmarks showed up to 60% faster vector search queries that run concurrently with indexing (average query latencies dropped from 54 ms to 32 ms). Figure 2 Query latencies that run concurrently with indexing dropped significantly after upgrading to Lucene 9.10, which contains the new changes. On queries that run outside of indexing we observed modest speedups, mostly because the dataset is not that big, containing 2 million vectors of 96 dims across 2 shards (Figure 3). But still for those benchmarks, we could see a significant decrease in the number of visited vertices in the graph and hence the number of vector operations (Figure 4). Figure 3 Whilst we see small drops in the latencies after the change for queries that run without concurrent indexing, particularly for retrieving the top-100 matches, the number of vector operations (Figure 4) is dramatically reduced. Figure 4 We see very significant decreases in the number of vector operations used to retrieve the top-10 and top-100 matches. The speedups should be clearer for larger indexes with higher dimension vectors: in testing we typically saw between 2× and 3×, which is also consistent with the reduction in the number of vector comparisons we see above. For example, we show below the speedup in vector search operations on the Lucene nightly benchmarks. These use vectors of 768 dimensions. It is worth noting that in the Lucene benchmarks the vector search runs in a single thread sequentially processing one graph after another, but the change positively affects this case as well. This happens because the global top-n set collected after the first graph searches sets up the threshold for subsequent graph searches and allows them to finish earlier if they don't contain competitive candidates. Figure 5 The graph shows that with the change committed on Feb 7th, the number of queries per second increased from 104 queries/sec to 219 queries/sec. Impact on recall The multi-graph search speedups come at the expense of slightly reduced recall. This happens because we may stop exploration of a graph that may still have good matches based on the global matches from other graphs. 
Two notes on the reduced recall: i) From our experimental results we saw that the recall is still higher than the recall of a single graph search, as if all segments were merged together into a single graph (Figure 6). ii) Our new approach achieves better performance for the same recall: it Pareto dominates our old multi-graph search strategy (Figure 7). Figure 6 We can see the recall of kNN search on multiple segments slightly dropped for both top-10 and top-100 matches, but in both cases it is still higher than the recall of kNN search on a single merged segment. Figure 7 The Queries Per Second is better in the candidate (with the current changes) than the baseline (old multi-graph search strategy) for the 10 million documents of the Cohere/wikipedia-22-12-en-embeddings dataset for each equivalent recall. Conclusion In this blog we showed how we achieved significant improvements in Lucene vector search performance while still achieving excellent recall by intelligently sharing information between the different graph searches. The improvement is a part of the Lucene 9.10 release and is a part of the Elasticsearch 8.13 release. We're not done yet with improvements to our handling of multiple graphs in Lucene. As well as further improvements to search, we believe we've found a path to achieve dramatically faster merge times. So stay tuned! Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to The previous state of multi-graph vector search in Lucene Speeding up multi-graph vector search by sharing information between segment searches Impact on performance Impact on recall Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Speeding Up Multi-graph Vector Search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/multi-graph-vector-search", + "meta_description": "Explore multi-graph vector search in Lucene and discover how sharing information between segment searches enhances search speed." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog GitHub Assistant: Interact with your GitHub repository using RAG and Elasticsearch This blog introduces a GitHub Assistant using RAG with Elasticsearch to enable semantic code queries, providing insights into GitHub repositories, which can be extended to PRs feedback, issues handling, and production readiness reviews. Generative AI Python FS By: Fram Souza On October 23, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This project allows you to interact directly with a GitHub repository and leverage semantic search to understand the codebase. You'll learn how to ask specific questions about the repository's code and receive meaningful, context-aware responses. You can follow the GitHub code here . Key considerations Quality of data : The output is only as good as the input—ensure your data is clean and well-structured. Chunk size : Proper chunking of data is crucial for optimal performance. Performance evaluation : Regularly assess the performance of your RAG-based application. Components Elasticsearch : Serves as the vector database for efficient storage and retrieval of embeddings. LlamaIndex : A framework for building applications powered by LLM. OpenAI : Used for both the LLM and generating embeddings. Architecture Ingestion The process starts by cloning a GitHub repository locally to the /tmp directory. The SimpleDirectoryReader is then used to load the cloned repository for indexing, the documents are split into chunks based on file type, utilizing CodeSplitter for code files, along with JSON , Markdown , and SentenceSplitters for other formats, see: If you want to add more support language into this code, you can just add a new parser and extension to the parsers_and_extensions list. After parsing the nodes, embeddings are generated using the text-embedding-3-large model and stored in Elasticsearch. The embedding model is declared using the Setting bundle, which a global variable: This is then utilized in the main function as part of the Ingest Pipeline. Since it's a global variable, there's no need to call it again during the ingestion process: The code block above starts by parsing the documents into smaller chunks (nodes) and then initializes a connection to Elasticsearch. The IngestionPipeline is created with the specified Elasticsearch vector store, and the pipeline is executed to process the nodes and store their embeddings in Elasticsearch, while displaying the progress during the process. At this point we should have your data indexed in Elasticsearch with the embeddings generated and stored. 
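For orientation, a condensed sketch of what this ingestion pipeline might look like with LlamaIndex and the Elasticsearch vector store. Paths, index name and credentials are placeholders, and a single SentenceSplitter stands in for the per-file-type parsers (such as CodeSplitter) that the project selects dynamically.

    from llama_index.core import SimpleDirectoryReader, Settings
    from llama_index.core.ingestion import IngestionPipeline
    from llama_index.core.node_parser import SentenceSplitter
    from llama_index.embeddings.openai import OpenAIEmbedding
    from llama_index.vector_stores.elasticsearch import ElasticsearchStore

    # Global embedding model, as described above
    Settings.embed_model = OpenAIEmbedding(model='text-embedding-3-large')

    # Load the locally cloned repository
    documents = SimpleDirectoryReader('/tmp/your-cloned-repo', recursive=True).load_data()

    es_store = ElasticsearchStore(
        index_name='github-assistant',   # placeholder index name
        es_cloud_id='YOUR_CLOUD_ID',     # placeholder credentials
        es_user='elastic',
        es_password='YOUR_PASSWORD',
    )

    # Chunk the documents, embed the chunks and store them in Elasticsearch
    pipeline = IngestionPipeline(
        transformations=[SentenceSplitter(chunk_size=512), Settings.embed_model],
        vector_store=es_store,
    )
    pipeline.run(documents=documents, show_progress=True)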
Below is one example of how the document looks like in ESS: Query Once the data is indexed, you can query the Elasticsearch index to ask questions about the codebase. The query.py script allows you to interact with the indexed data and ask questions about the codebase. It retrieves a query input from the user, creates an embedding using the same OpenAIEmbedding model used in the index.py , and sets up a query engine with the VectorStoreIndex loaded from the Elasticsearch vector store. The query engine uses similarity search, retrieving the top 3 most relevant documents based on the query's similarity to the stored embeddings. The results are summarized in a tree-like format using response_mode=\"tree_summarize\" , you can see the code snippet below: Installation 1. Clone the repository : 2. Install required libraries : 3. Set up environment variables : Update the .env file with your Elasticsearch credentials and the target GitHub repository details (eg, GITHUB_TOKEN , GITHUB_OWNER , GITHUB_REPO , GITHUB_BRANCH , ELASTIC_CLOUD_ID , ELASTIC_USER , ELASTIC_PASSWORD , ELASTIC_INDEX ). Here's one example of the .env file: Usage 1. Index your data and create the embeddings by running : 2. Ask questions about your codebase by running : Example: Questions you might want to ask: Give me a detailed description of what are the main functionalities implemented in the code? How does the code handle errors and exceptions? Could you evaluate the test coverage of this codebase and also provide detailed insights into potential enhancements to improve test coverage significantly? Evaluation The evaluation.py code processes documents, generates evaluation questions based on the content, and then evaluates the responses for relevancy ( Whether the response is relevant to the question ) and faithfulness ( Whether the response is faithful to the source content ) using a LLM. Here’s a step-by-step guide on how to use the code: You can run the code without any parameters, but the example above demonstrates how to use the parameters. Here's a breakdown of what each parameter does: Document processing: --num_documents 5 : The script will process a total of 5 documents. --skip_documents 2 : The first 2 documents will be skipped, and the script will start processing from the 3rd document onward. So, it will process documents 3, 4, 5, 6, and 7. Question generation: After loading the documents, the script will generate a list of questions based on the content of these documents. --num_questions 3 : Out of the generated questions, only 3 will be processed. --skip_questions 1 : The script will skip the first question in the list and process questions starting from the 2nd question. --process_last_questions : Instead of processing the first 3 questions after skipping the first one, the script will take the last 3 questions in the list. Now what? Here are a few ways you can utilize this code: Gain insights into a specific GitHub repository by asking questions about the code, such as locating functions or understanding how parts of the code work. Build a multi-agent RAG system that ingests GitHub PRs and issues, enabling automatic responses to issues and feedback on PRs. Combine your logs and metrics with the GitHub code in Elasticsearch to create a Production Readiness Review using RAG, helping assess the maturity of your services. Happy RAG! 
Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Jump to Key considerations Components Architecture Ingestion Query Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "GitHub Assistant: Interact with your GitHub repository using RAG and Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/github-rag-elasticsearch", + "meta_description": "Explore the GitHub Assistant, which uses RAG & Elasticsearch to enable semantic code queries, PR feedback and insights into GitHub repositories." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing Retrievers - Search All the Things! Learn about Elasticsearch retrievers, including Standard, kNN, text_expansion, and RRF. Discover how to use retrievers with examples. Vector Database Python JV JC By: Jeff Vestal and Jack Conradson On May 28, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. In 8.14, Elastic introduced a new search capability called “Retrievers” in Elasticsearch. Keep reading to discover their simplicity and efficiency and how they can empower you in your search operations. Retrievers are a new abstraction layer added to the Search API in Elasticsearch. They offer the convenience of configuring multi-stage retrieval pipelines within a single _search API call. This architecture simplifies search logic in your application by eliminating the need for multiple Elasticsearch API calls for complex search queries. It also reduces the need for client-side logic, often required to combine results from multiple queries. 
Initial types of Retrievers There are three types of retrievers included in the initial release. Each of these retrievers is designed for a specific purpose, and when combined, they allow for complex search operations. The available types are: Standard - Return the top documents from a traditional query. These allow backward compatibility by supporting existing Query DSL request syntax, allowing you to migrate to the retriever framework at your own pace. kNN - Return the top documents from a kNN search. RRF - Combine and rank multiple first-stage retrievers into a single result set with no or minimal user tuning using the reciprocal rank fusion algorithm. An RRF retriever is a compound retriever whose filter element is propagated to its sub-retrievers. How are Retrievers different, and why are they useful? With traditional queries, the query is part of the overall search API call. Retrievers differ by being designed as standalone entities that can be used in isolation or easily combined. This modular approach allows for more flexibility when designing search strategies. Retrievers are designed to be part of a “retriever tree,” a hierarchical structure that defines search operations by clarifying their sequence and logic. This structure makes complex searches more manageable and easier for developers to understand and allows new functionality to be easily added in the future. Retrievers enable composability, allowing you to build pipelines and integrate different retrieval strategies. This allows for easy testing of varying retrieval combinations. They also provide more control over how documents are scored and filtered. You can, for example, specify a minimum score threshold, apply a complex filter without affecting scoring, and use parameters like terminate_after for performance optimizations. Backward compatibility is maintained with traditional query elements, automatically translating them to the appropriate retriever. Retrievers usage examples Let’s look at some examples of using retrievers. We are using an IMDB sample dataset. You can run the accompying jupyter notebook to ingest IMDB data into a Serverless Search project and run the below examples yourself! The high-level setup is: overview - a short summary of the movie names the movie's name overview_dense - a dense_vector generated from an e5-small model overview_sparse - a sparse vector using Elastic’s ELSER model. Only returning the text version of names and overview using fields and setting _source:false Standard - Search All the Text! kNN - Search all the Dense Vectors! text_expansion - Search all the Sparse Vectors! rrf - Combine All the Things! Current restrictions with Retrievers Retrievers come with certain restrictions users should be aware of. For example, only the query element is allowed when using compound retrievers. This enforces a cleaner separation of concerns and prevents the complexity from overly nested or independent configurations. Additionally, sub-retrievers may not use elements restricted from having a compound retriever as part of the retriever tree. These restrictions enforce performance and composability even with complex retrieval strategies. Retrievers are initially released as a technical preview, so their API may change. Conclusion Retrievers represent a significant step forward with Elasticsearch's retrieval capabilities and user-friendliness. They can be chained in a pipeline fashion where each retriever applies its logic and passes the results to the next item in the chain. 
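Before wrapping up, here is a compact, hedged sketch of what the retriever usage examples above look like with the Python client (8.14+), using the IMDB-style fields described earlier (overview, overview_dense). The connection details and the E5 deployment id are assumptions; the accompanying notebook contains the full requests.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection details

# Standard retriever: a traditional lexical query wrapped in the retriever framework.
es.search(
    index="movies",
    retriever={
        "standard": {"query": {"match": {"overview": "clash between good and evil"}}}
    },
)

# RRF retriever: combine a lexical and a kNN sub-retriever into a single ranked result set.
es.search(
    index="movies",
    retriever={
        "rrf": {
            "retrievers": [
                {"standard": {"query": {"match": {"overview": "clash between good and evil"}}}},
                {
                    "knn": {
                        "field": "overview_dense",
                        "k": 10,
                        "num_candidates": 50,
                        "query_vector_builder": {
                            "text_embedding": {
                                "model_id": ".multilingual-e5-small",  # assumed deployment id for the e5-small model
                                "model_text": "clash between good and evil",
                            }
                        },
                    }
                },
            ]
        }
    },
    fields=["names", "overview"],
    source=False,
)
```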
By allowing for more structured, flexible, and efficient search operations, retrievers can significantly enhance the search experience. The following resources provide additional details on Retrievers. Semantic Reranking in Elasticsearch with Retrievers Retrievers API documentation Retrievers - Search Your Data documentation Try the above code yourself! You can run the accompying jupyter notebook to ingest IMDB data into an Elastic Serverless Search project! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Initial types of Retrievers How are Retrievers different, and why are they useful? Retrievers usage examples Standard - Search All the Text! kNN - Search all the Dense Vectors! Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing Retrievers - Search All the Things! - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-retrievers", + "meta_description": "Learn about Elasticsearch retrievers, including Standard, kNN, text_expansion, and RRF. Discover how to use retrievers with examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series GenAI for customer support This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! Part1 Inside Elastic July 22, 2024 GenAI for Customer Support — Part 2: Building a Knowledge Library This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! 
CM By: Cory Mangini Part2 Inside Elastic August 9, 2024 GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans This series gives you an inside look at how we’re using generative AI in customer support. Join us as we share our journey in designing a GenAI chatbot interface. IM By: Ian Moersen Part3 Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Part4 Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "GenAI for customer support - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/genai-for-customer-support", + "meta_description": "This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time!" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Smokin' fast BBQ with hardware accelerated SIMD instructions How we optimized vector comparisons in BBQ with hardware accelerated SIMD (Single Instruction Multiple Data) instructions. Vector Database Lucene CH By: Chris Hegarty On December 4, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. As we continue on our mission to make Elasticsearch and Apache Lucene the best place to store and search your vector data, we introduced a new approach in BBQ (Better Binary Quantization). BBQ is a huge step forward, bringing significant (32x) efficiencies by compressing the stored vectors, while maintaining high ranking quality and at the same time offering Smokin' Fast performance. You can read about how BBQ quantizes float32 to single bit vectors for storage, outperforms traditional approaches like Product Quantization in indexing speed (20-30x less quantization time) and query speed (2-5x faster queries), see the awesome accuracy and recall BBQ achieves for various datasets, in the BBQ blog. Since high-level performance is already covered in the BBQ blog, here we'll go a bit deeper to get a better understanding of how BBQ achieves such great performance. 
In particular, we'll look at how vector comparisons are optimized by hardware accelerated SIMD (Single Instruction Multiple Data) instructions. These SIMD instructions perform data-level parallelism so that one instruction operates on multiple vector components at a time. We'll see how Elasticsearch and Lucene target specific low-level SIMD instructions, like the AVX's VPOPCNTQ on x64 and NEON's vector instructions on ARM, to speed up vector comparisons. Why do we care so much about comparing vectors? Vector comparisons dominate the execution time within a vector database and are commonly the single most costly factor. We see this in profile traces, whether it be with float32 , int8 , int4 , or other levels of quantization. This is not surprising, since a vector database has to compare vectors a lot! Whether it be during indexing, e.g. building the HNSW graph, or during search as the graph or partition is navigated to find the nearest neighbours. Low-level performance improvements in vector comparisons can be really impactful on overall high-level performance. Elasticsearch and Lucene support a number of vector similarity metrics, like dot product, cosine, and Euclidean, however we'll focus on just dot product, since the others can be derived from that. Even though we have the ability in Elasticsearch to write custom native vector comparators, our preference is to stay in Java-land whenever possible so that Lucene can more easily get the benefits too. Comparing query and stored vectors For our distance comparison function to be fast, we need to simplify what it's doing as much as possible so it translates to a set of SIMD instructions that execute efficiently on the CPU. Since we've quantized the vectors to integer values we can load more of their components into a single register and also avoid the more costly floating-point arithmetic - this is a good start, but we need more. BBQ does asymmetric quantization; the query vector is quantized to int4 , while the stored vectors are even more highly compressed to just single bit values. Since a dot product is the sum of the product of the component values, we can immediately see that the only query component values that can contribute positively to the result are the values where the stored vector is 1 . A further observation is that if we translate the query vector in a way where the respective positional bits (1, 2, 3, 4) from each component value are grouped together, then our dot product reduces to a set of basic bitwise operations; AND + bit count for each component, followed by a shift representing the respective position of the query part, and lastly a sum to get the final result. See the BBQ blog for a more detailed explaination, but the following image visualizes an example query translation. Logically, the dot product reduces to the following, where d is the stored vector, and q1 , q2 , q3 , q4 are the respective positional parts of the translated query vector: For execution purposes, we layout the translated query parts, by increasing respective bit position, contiguously in memory. So we now have our translated query vector, q[] , which is four times the size of that of the stored vector d[] . A scalar java implementation of this dot product looks like this: While semantically correct, this implementation is not overly efficient. Let's move on and see what a more optimal implementation looks like, after which we can compare the runtime performance of each. Where does the performance come from? 
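Before answering that question, here is a stand-in for the scalar snippet referenced just above, since the original Java code is not reproduced in this extract. It is only meant to make the AND + popcount + shift structure concrete: q holds the four positional bit-planes of the translated int4 query laid out contiguously, and d holds the packed single-bit stored vector.

```python
def bbq_dot_product_scalar(q: bytes, d: bytes) -> int:
    # q is 4x the length of d: one bit-plane of the int4 query per stored-vector length.
    assert len(q) == 4 * len(d)
    part = len(d)
    result = 0
    for i in range(4):                                    # one pass per query bit-plane
        plane = q[i * part:(i + 1) * part]
        # AND each byte with the stored bits and count the surviving ones...
        subset = sum(bin(qb & db).count("1") for qb, db in zip(plane, d))
        # ...then weight the count by the bit position this plane represents.
        result += subset << i
    return result
```

The vectorized version discussed next does exactly this, but operates on 256 (or 128) bits of both arrays per instruction instead of one byte at a time.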
The implementation we've seen so far is a straightforward naive scalar implementation. To speed things up we rewrite the dot product with the Panama Vector API in order to explicitly target specific SIMD instructions. Here's a simplified snippet of what the code looks like for just one of the query parts - remember we need to do this four times, one for each of the translated int4 query parts. Here we're explicitly targeting AVX, operating on 256 bits per loop iteration. First performing a logical AND between the vq and vd , then a bit count on the result of that, before finally adding it to the sum accumulator. While we're interested in the bit count, we do however interpret the bytes in the vectors as longs, since that simplifies the addition and ensures that we don't risk overflowing the accumulator. A final step is then needed to horizontally reduce the lanes of the accumulator vector to a scalar result, before shifting by the representative query part number. Disassembling this on my Intel Skylake, we can clearly see the body of the loop. The rsi and rdx registers hold the address of the vectors to be compared, from which we load the next 256 bits into ymm4 and ymm5 , respectively. With our values loaded we now perform a bitwise logical AND, vpand , storing the result in ymm4 . Next you can see the population count instruction, vpopcntq , which counts the number of bits set to one. Finally we add 0x20 (32 x 8bits = 256bits) to the loop counter and continue. We're not showing it here for simplicity, but we actually unroll the 4 query parts and perform them all per loop iteration, as this reduces the load of the data vector. We also use independent accumulators for each of the parts, before finally reducing. There are 128 bit variants whenever the vector dimensions do not warrant striding 256 bits at a time, or on ARM where it compiles to sequences of Neon vector instructions AND , CNT , and UADDLP . And, of course, we deal with the tails in a scalar fashion, which thankfully doesn't occur all that often in practice given the dimension sizes that most popular models use. We continue our experiments with AVX 512, but so far it's not proven worthwhile to stride 512 bits at a time over this data layout, given common dimension sizes. How much does SIMD improve things? When we compare the scalar and vectorized dot product implementations, we see from 8x to 30x throughput improvement for a range of popular dimensions from 384 to 1536, respectively. With the optimized dot product we have greatly improved the overall performance such that the vector comparison is no longer the single most dominant factor when searching and indexing with BBQ. For those interested, here are some links to the benchmarks and code . Wrapping up BBQ is a new technique that brings both incredible efficiencies and awesome performance. In this blog we looked at how vector distance comparison is optimized in BBQ by hardware accelerated SIMD instructions. You can read more about the index and search performance, along with accuracy and recall in the BBQ blog. BBQ is released as tech-preview in Elasticsearch 8.16, and in Serverless right now! Along with new techniques like BBQ, we're continuously improving the low-level performance of our vector database. You can read more about what we've already done in these other blogs; FMA , FFM , and SIMD . 
Also, expect to see more of these low-level performance focused blogs like this one in the future, as we keep improving the performance of Elasticsearch so that it's the best place to store and search your vector data. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Why do we care so much about comparing vectors? Comparing query and stored vectors Where does the performance come from? How much does SIMD improve things? Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Smokin' fast BBQ with hardware accelerated SIMD instructions - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/bbq-vector-comparison-simd-instructions", + "meta_description": "Explore how Elastic BBQ achieves its performance. Mainly how vector distance comparisons are optimized in BBQ by hardware accelerated SIMD instructions." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). Lucene Vector Database BT By: Benjamin Trent On January 6, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Our Better Binary Quantization (BBQ) indices are now even better(er). 
Recall improvements across the board (in extreme cases up to 20%) and unlocking the future of quantizing vectors to any bit size. As of Elasticsearch 8.18, BBQ indices are now backed by our state of the art optimized scalar quantization algorithm. Scalar quantization history Introduced in Elasticsearch 8.12, scalar quantization was initially a simple min/max quantization scheme. Per lucene segment, we would find the global quantile values for a given confidence interval. These quantiles are then used as the minimum and maximum to quantize all the vectors. While this naive quantization is powerful, it only really works for whole byte quantization. Static confidence intervals mean static quantiles. This is calculated once for all vectors in a given segment and works well for higher bit values. In Elasticsearch 8.15, we added half-byte, or int4, quantization . To achieve this with high recall, we added an optimization step, allowing for the best quantiles to be calculated dynamically. Meaning, no more static confidence intervals. Lucene will calculate the best global upper and lower quantiles for each segment. Achieving 8x reduction in memory utilization over float32 vectors. Dynamically searching for the best quantiles to reduce the vector similarity error. This was done once, globally, over a sample set of the vectors and applied to all. Finally, now in 8.18, we have added locally optimized scalar quantization. It optimizes quantiles per individual vector. Allowing for exceptional recall at any bit size, even single bit quantization. What is Optimized Scalar Quantization? For an in-depth explanation of the math and intuition behind optimized scalar quantization, check out our blog post on Optimized Scalar Quantization . There are three main takeaways from this work: Each vector, is centered on the Apache Lucene segment's centroid. This allows us to make better use of the possible quantized vectors to represent the dataset as a whole. Every vector is individually quantized with a unique set of optimized quantiles. Asymmetric quantization is used allowing for higher recall with the same memory footprint. In short, when quantizing each vector: We center the vector on the centroid Compute a limited number of iterations to find the optimal quantiles. Stopping early if the quantiles are unchanged or the error (loss) increases Pack the resulting quantized vectors Store the packed vector, its quantiles, the sum of its components, and an extra error correction term Here is a step by step view of optimizing 2 bit vectors. After the fourth iteration, we would normally stop the optimization process as the error (loss) increased. The first cell is each individual components error. The second is the distribution of 2 bit quantized vectors. Third is how the overall error is changing. Fourth is current step's quantiles overlayed of the raw vector being quantized. Storage and retrieval of optimized scalar quantization The storage and retrieval of optimized scalar quantization vectors are similar to BBQ. The main difference is the particular values we store. Stored for every binary quantized vector: dims/8 bytes, upper and lower quantiles, an additional correction term, the sum of the quantized components. One piece of nuance is the correction term. For Euclidean distance , we store the squared norm of the centered vector. For dot product we store the dot product between the centroid and the uncentered vector. Performance Enough talk. Here are the results from four datasets. 
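Before looking at those results, here is an intentionally naive Python sketch of the per-vector quantization loop described above. It is not Lucene's implementation: instead of the iterative quantile update with early stopping, it brute-forces a small grid of candidate quantile pairs and keeps the pair with the lowest squared reconstruction error, then packs the bits and computes the Euclidean correction term (the squared norm of the centered vector).

```python
import numpy as np

def quantize_1bit(v: np.ndarray, centroid: np.ndarray):
    """Illustrative per-vector optimized scalar quantization to a single bit."""
    centered = v - centroid                               # center on the segment centroid
    best = None
    for lo_q in np.linspace(0.0, 0.3, 7):                 # candidate lower quantiles
        for hi_q in np.linspace(0.7, 1.0, 7):             # candidate upper quantiles
            lo, hi = np.quantile(centered, [lo_q, hi_q])
            if hi <= lo:
                continue
            bits = (centered > (lo + hi) / 2).astype(np.uint8)  # 1-bit codes
            recon = np.where(bits == 1, hi, lo)                 # dequantized approximation
            err = float(np.sum((centered - recon) ** 2))        # reconstruction loss for this pair
            if best is None or err < best[0]:
                best = (err, lo, hi, bits)
    _, lo, hi, bits = best
    packed = np.packbits(bits)                            # dims/8 bytes of storage
    correction = float(np.dot(centered, centered))        # correction term for Euclidean distance
    return packed, (lo, hi), int(bits.sum()), correction
```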
Cohere's 768 dimensioned multi-lingual embeddings. This is a well distributed inner-product dataset. Cohere's 1024 dimensioned multi-lingual embeddings. This embedding model is well optimized for quantization. E5-Small-v2 quantized over the quora dataset. This model typically does poorly with binary quantization. GIST-1M dataset. This scientific dataset opens some interesting edge cases for inner-product and quantization. Here are the results for Recall@10|50 Dataset BBQ BBQ with OSQ Improvement Cohere 768 0.933 0.938 0.5% Cohere 1024 0.932 0.945 1.3% E5-Small-v2 0.972 0.975 0.3% GIST-1M 0.740 0.989 24.9% Across the board, we see that BBQ backed by our new optimized scalar quantization improves recall, and dramatically so for the GIST-1M dataset. But, what about indexing times? Surely all this per vector optimizations must add up. The answer is no. Here are the indexing times for the same datasets. Dataset BBQ BBQ with OSQ Difference Cohere 768 368.62s 372.95s +1% Cohere 1024 307.09s 314.08s +2% E5-Small-v2 227.37s 229.83s < +1% GIST-1M 1300.03s* 297.13s -300% Since the quantization methodology works so poorly over GIST-1M when using inner-product, it takes an exceptionally long time to build the HNSW graph as the vector distances are not well distinguished. Conclusion Not only does this new, state of the art quantization methodology improve recall for our BBQ indices, it unlocks future optimizations. We can now quantize vectors to any bit size and we want to explore how to provide 2 bit quantization, striking a balance between memory utilization and recall with no reranking. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Scalar quantization history What is Optimized Scalar Quantization? Storage and retrieval of optimized scalar quantization Performance Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/optimized-scalar-quantization-elasticsearch", + "meta_description": "Learn about optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ)." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Java Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Java +1 October 8, 2024 LangChain4j with Elasticsearch as the embedding store LangChain4j (LangChain for Java) has Elasticsearch as an embedding store. Discover how to use it to build your RAG application in plain Java. DP By: David Pilato Java How To October 3, 2024 Testing your Java code with mocks and real Elasticsearch Learn how to write your automated tests for Elasticsearch, using mocks and Testcontainers PP By: Piotr Przybyl Integrations Java +1 September 23, 2024 Introducing LangChain4j to simplify LLM integration into Java applications LangChain4j (LangChain for Java) is a powerful toolset to build your RAG application in plain Java. DP By: David Pilato ES|QL Java +1 May 2, 2024 ES|QL queries to Java objects Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples. LT By: Laura Trotta Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Java - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/category/java-programming", + "meta_description": "Java articles from Elasticsearch Labs" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Playground: Experiment with RAG applications with Elasticsearch in minutes Learn about Elastic's Playground and how to use it to experiment with RAG applications using Elasticsearch. Vector Database Generative AI Integrations Developer Experience How To JM SC By: Joe McElroy and Serena Chou On June 28, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. 
Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, you'll learn about Playground and how to use it to experiment with Retrieval-Augmented Generation (RAG) applications using Elasticsearch. Update: Try the new Playground app in the Elastic demo gallery. What is Playground? Elastic's playground experience is a low-code interface for developers to explore grounding LLMs of their choice with their own private data in minutes. While prototyping conversational search, the ability to rapidly iterate on and experiment with key components of a RAG workflow (for example: hybrid search, or adding reranking ) are important— to get accurate and hallucination-free responses from LLMs. Elasticsearch vector database and the Search AI platform provides developers with a wide range of capabilities such as comprehensive hybrid search, and to use innovation from a growing list of LLM providers. Our approach in our playground experience allows you to use the power of those features, without added complexity. A/B test LLMs and choose different inference providers Playground’s intuitive interface allows you to A/B test different LLMs from model providers (like OpenAI and Anthropic) and refine your retrieval mechanism, to ground answers with your own data indexed into one or more Elasticsearch indices. The playground experience can leverage transformer models directly in Elasticsearch, but is also amplified with the Elasticsearch Open Inference API which integrates with a growing list of inference providers including Cohere and Azure AI Studio . The best context window with retrievers and hybrid search As Elasticsearch developers already know, the best context window is built with hybrid search . Your strategy for architecting towards this outcome requires access to many shapes of vectorized and plain text data, that can be chunked and spread across multiple indices. We’re helping you simplify query construction with newly introduced query retrievers to Search All the Things! With three key retrievers (available now in 8.14 and Elastic Cloud Serverless) hybrid search with scores normalized with RRF is one unified query away. Using retrievers, the playground understands the shape of the selected data and will automatically generate a unified query on your behalf. Store vectorized data and explore a kNN retriever, or add metadata and context to generate a hybrid search query by selecting your data. Coming soon, semantic reranking can easily be incorporated into your generated query for even higher-quality recall. Once you’ve tuned and configured your semantic search to production standards, you’re ready to export the code and either finalize the experience in your application with your Python Elasticsearch language client or LangChain Python integration. Playground is accessible today on Elastic Cloud Serverless and available today in 8.14 on Elastic Cloud . Using the Playground Playground is accessible from within Kibana (the Elasticsearch UI) by navigating to “Playground” from within the side navigation. Connect to your LLM Playground supports chat completion models such as GPT-4o from OpenAI, Azure OpenAI, or Anthropic through Amazon Bedrock. To start, you need to connect to either one of these model providers to bring your LLM of choice. Chat with your data Any data can be used, even BM25-based indices. 
Your data fields can optionally be transformed using text embedding models (like our zero-shot semantic search model ELSER), but this is not a requirement.Getting Started is extremely simple - just select the indices you want to use to ground your answers and start asking questions.In this example, we are going to use a PDF and start with using BM25, with each document representing a page of the PDF. Indexing a PDF document with BM25 with Python First, we install the dependencies. We use the pypdf library to read PDFs and request to retrieve them. Then we read the file, creating an array of pages containing the text. And we then import this into elasticsearch, under the my_pdf_index_bm25 index. Chatting with your data with Playground Once we have connected our LLM with a connector and chosen the index, we can start asking questions about the PDF. The LLM will now easily provide answers to your data. What happens behind the scenes? When we choose an index, we automatically determine the best retrieval method. In this case, BM25 keyword search is only available, so we generate a multi-match type query to perform retrieval. As we only have one field, we defaulted to searching for this. If you have more than one field, you can choose the fields you want to search to improve the retrieval of relevant documents. Asking a question When you ask a question, Playground will perform a retrieval using the query to find relevant documents matching your question. It will then use this as context and provide it with the prompt, grounding the answer that’s returned from your chosen LLM model. We use a particular field from the document for the context. In this example, Playground has chosen the field named “text,” but this can be changed within the “edit context” action. By default, we retrieve up to 3 documents for the context, but you can adjust the number from within the edit context flyout as well. Asking a follow up question Typically, the follow-up question is tied to a previous conversation. With that in mind, we ask the LLM to rewrite the follow-up question using the conversation into a standalone question, which is then used for retrieval. This allows us to retrieve better documents to use as context to help answer the question. Context When documents are found based on your question, we provide these documents to the LLM as context to ground the LLM’s knowledge when answering. We automatically choose a single index field we believe is best, but you can change this field by going to the edit context flyout. Improving retrieval with Semantic Search and Chunking Since our query is in the form of a question, it is important for retrieval to be able to match based on semantic meaning. With BM25 we can only match documents that lexically match our question, so we’ll need to add semantic search too. Sparse Vector Semantic search with ELSER One simple way to start with semantic search is to use Elastic’s ELSER sparse embedding model with our data. Like many models of this size and architecture, ELSER has a typical 512-token limit and requires a design choice of an appropriate chunking strategy to accommodate it. In upcoming versions of Elasticsearch, we’ll chunk by default as part of the vectorization process, but in this version, we’ll follow a strategy to chunk by paragraphs as a starting point. The shape of your data may benefit from other chunking strategies, and we encourage experimentation to improve retrieval. 
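The pypdf/requests snippets described earlier in this section are not reproduced in this extract; before moving on to chunking and ELSER below, here is a minimal sketch of that BM25 ingestion path. The PDF URL and connection details are placeholders; the index name my_pdf_index_bm25 comes from the text above.

```python
from io import BytesIO

import requests
from elasticsearch import Elasticsearch, helpers
from pypdf import PdfReader

PDF_URL = "https://example.com/sample.pdf"                        # placeholder document
es = Elasticsearch(cloud_id="<CLOUD_ID>", api_key="<API_KEY>")    # placeholder credentials

# Fetch the PDF and extract one text blob per page.
reader = PdfReader(BytesIO(requests.get(PDF_URL).content))
pages = [
    {"page_number": i, "text": page.extract_text()}
    for i, page in enumerate(reader.pages)
]

# Index each page as its own document into the BM25-only index.
helpers.bulk(es, ({"_index": "my_pdf_index_bm25", "_source": p} for p in pages))
```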
Chunking and ingesting the PDF with pyPDF and LangChain To simplify the example, we will use LangChain tooling to load and split the pages into passages. LangChain is a popular tool for RAG development that can be integrated and used with the Elasticsearch vector database and semantic reranking capabilities with our updated integration. Creating an ELSER inference endpoint The following REST API calls can be executed to download, deploy, and check the model's running status. You can execute these using Dev Tools within Kibana. Ingesting into Elasticsearch Next we will set up an index and attach a pipeline that will handle the inference for us. Splitting pages into passages and ingesting into Elasticsearch Now that the ELSER model has been deployed, we can start splitting the PDF pages into passages and ingesting them into Elasticsearch. That’s it! We should have passages ingested into Elasticsearch that have been embedded with ELSER. See it in action on Playground Now when selecting the index, we generate an ELSER-based query using the deployment_id for embedding the query string. When asking a question, we now have a semantic search query that is used to retrieve documents that match the semantic meaning of the question. Hybrid Search made simple Enabling the text field can also enable hybrid search. When we retrieve documents, we now search for both keyword matches and semantic meaning and rank the two result sets with the RRF algorithm. Improve the LLM’s answers With Playground, you can adjust your prompt, tweak your retrieval, and create multiple indices (chunking strategy and embedding models) to improve and compare your responses. In the future, we will provide hints on how to get the most out of your index, suggesting methods to optimize your retrieval strategy. System prompt By default, we provide a simple system prompt which you can change within model settings. This is used in conjunction with a wider system prompt. You can change the simple system prompt by just editing it. Optimizing context Good responses rely on great context. Using methods like chunking your content and optimizing your chunking strategy for your data is important. Along with chunking your data, you can improve retrieval by trying out different text embedding models to see what gives you the best results. In the above example, we have used Elastic’s own ELSER model, but the inference service supports a wide number of embedding models that may suit your needs better. Other benefits of optimizing your context include better cost efficiency and speed: cost is calculated based on tokens (input and output). The more relevant documents we can provide, aided by chunking and Elasticsearch's powerful retrieval capabilities, the lower the cost and faster latency will be for your users. If you notice, the input tokens we used in the BM25 example are larger than those in the ELSER example. This is because we effectively chunked our documents and only provided the LLM with the most relevant passages on the page. Final Step! Integrate RAG into your application Once you’re happy with the responses, you can integrate this experience into your application. View code offers example application code for how to do this within your own API. For now, we provide examples with OpenAI or LangChain, but the Elasticsearch query, the system prompt, and the general interaction between the model and Elasticsearch are relatively simple to adapt for your own use. 
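The REST calls and ingest snippets referenced in the ELSER section above are not included in this extract, so here is a hedged Python sketch of that setup: an ingest pipeline with an inference processor pointing at an already-deployed ELSER model, an index whose default pipeline runs it, and LangChain's splitter to turn pages into passages. The pipeline id, index name, and chunk sizes are assumptions, and .elser_model_2 must already be downloaded and deployed in your cluster.

```python
from elasticsearch import Elasticsearch, helpers
from langchain_text_splitters import RecursiveCharacterTextSplitter

es = Elasticsearch(cloud_id="<CLOUD_ID>", api_key="<API_KEY>")  # placeholder credentials

# Ingest pipeline: the inference processor writes ELSER's sparse expansion
# of the "text" field into "text_embedding" at index time.
es.ingest.put_pipeline(
    id="elser-ingest-pipeline",
    processors=[
        {
            "inference": {
                "model_id": ".elser_model_2",  # assumes ELSER is already deployed
                "input_output": [{"input_field": "text", "output_field": "text_embedding"}],
            }
        }
    ],
)

# Index with the pipeline attached and a sparse_vector field for the expansion.
es.indices.create(
    index="my_pdf_index_elser",
    settings={"default_pipeline": "elser-ingest-pipeline"},
    mappings={
        "properties": {
            "text": {"type": "text"},
            "text_embedding": {"type": "sparse_vector"},
        }
    },
)

# Split each page into passages and bulk-index them; `pages` is the per-page list built earlier.
splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=64)
passages = [
    {"_index": "my_pdf_index_elser", "_source": {"page_number": p["page_number"], "text": chunk}}
    for p in pages
    for chunk in splitter.split_text(p["text"])
]
helpers.bulk(es, passages)
```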
Conclusion Conversational search experiences can be built with many approaches in mind, and the choices can be paralyzing, especially with the pace of innovation in new reranking and retrieval techniques, both of which apply to RAG applications. With our playground, those choices are simplified and intuitive, even with the vast array of capabilities available to the developer. Our approach is unique in enabling hybrid search as a predominant pillar of the construction immediately, with an intuitive understanding of the shape of the selected and chunked data and amplified access across multiple external providers of LLMs. Build, test, fun with playground Try the Playground demo or head over to Playground docs to get started today! Explore Search Labs on GitHub for new cookbooks and integrations for providers such as Cohere, Anthropic, Azure OpenAI, and more. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to What is Playground? A/B test LLMs and choose different inference providers The best context window with retrievers and hybrid search Using the Playground Connect to your LLM Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Playground: Experiment with RAG applications with Elasticsearch in minutes - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/rag-playground-introduction", + "meta_description": "Learn about Elastic's Playground and how to use it to experiment with RAG applications using Elasticsearch." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Semantic search, leveled up: now with native match, knn and sparse_vector support Semantic text search becomes even more powerful, with native support for match, knn and sparse_vector queries. This allows us to keep the simplicity of the semantic query while offering the flexibility of the Elasticsearch query DSL. Search Relevance Vector Database KD By: Kathleen DeRusso On March 6, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch’s semantic query is incredibly powerful, allowing users to perform semantic search over data configured in semantic_text fields. Much of this power lies in simplicity: just set up a semantic_text field with the inference endpoint you want to use, and then ingest content as if indexing content into a regular text field. The inference happens automatically and transparently, making it simple to set up and use a search index with semantic functionality. This ease of use does come with some tradeoffs: we simplified semantic search with semantic_text by making judgments on default behavior that fit the majority of use cases. Unfortunately, this means that some customizations available for traditional vector search queries aren’t present in the semantic query . We didn’t want to add all of these options directly to the semantic query, as that would undermine the simplicity that we strive for. Instead, we expanded the queries that support the semantic_text field, leaving it up to you to choose the best query that meets your needs. Let’s walk through these changes, starting with creating a simple index with a semantic_text field: We made match happen! First and most importantly, the match query will now work with semantic_text fields! This means that you can change your old semantic query: Into a simple match query: We can see the benefits of semantic search here because we’re searching for “song lyrics about love”, none of which appears in the indexed document. This is because of ELSER’s text expansion. But wait, it gets better! If you have multiple indices, and the same field name is semantic_text in one field and perhaps text in the other field, you can still run match queries against these fields. Let’s create another index, with the same field names, but different types ( text instead of semantic_text ). Here’s a simple example to illustrate: Here, searching for “crazy” brings up both the lexical match that has “crazy” in the title, and the semantic lyric “lose my mind.” There are some caveats to keep in mind when using the match functionality with semantic_text : The underlying semantic_text field has a limitation where you can’t use multiple inference IDs on the same field. This limitation extends to match — meaning that if you have two semantic_text fields with the same name, they need to have the same inference ID or you’ll get an error. You can work around this by creating different names and querying them in a boolean query or a compound retriever . Depending on what model you use, the scores between lexical (text) matches and semantic matches will likely be very different. In order to get the best ranking of results, we recommend using second stage rerankers such as semantic reranking or RRF . Semantic search using the match query is also available in ES|QL ! 
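The index mapping and query bodies walked through above are not included in this extract, so here is a hedged Python sketch of the before/after: a semantic_text field backed by an assumed ELSER inference endpoint, the older semantic query, and the equivalent match query that now works on the same field. The index name, field names, and inference id are illustrative.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection details

# semantic_text field wired to a pre-created inference endpoint (assumed id).
es.indices.create(
    index="test-index",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "inference_field": {"type": "semantic_text", "inference_id": "my-elser-endpoint"},
        }
    },
)

es.index(
    index="test-index",
    document={"title": "Song", "inference_field": "No, I can't help falling in love with you"},
    refresh=True,
)

# The original semantic query...
es.search(
    index="test-index",
    query={"semantic": {"field": "inference_field", "query": "song lyrics about love"}},
)

# ...and the same search expressed as a plain match query, now supported on semantic_text.
es.search(
    index="test-index",
    query={"match": {"inference_field": {"query": "song lyrics about love"}}},
)
```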
Here’s the same example as above, but using ES|QL: Expert-level semantic search with knn and sparse_vector Match is great, but sometimes you want to specify more vector search options than the semantic query supports. Remember, the tradeoff of making the semantic query as simple as it is involved making some decisions on default behavior. This means that if you want to take advantage of some of the more advanced vector search features, perhaps num_candidates or filter from the knn query or token pruning in the sparse_vector query , you won’t be able to do so using the semantic query. In the past, we provided some workarounds to this, but they were convoluted and required knowing the inner workings and architecture of the semantic_text field and constructing a corresponding nested query. If you’re doing that workaround now, it will still work—however, we now support query DSL using knn or sparse_vector queries on semantic_text fields. All about that dense (vector), no trouble Here’s an example script that populates a text_embedding model and queries a semantic_text field using the knn query: The knn query can be modified with extra options to enable more advanced queries against the semantic_text field. Here, we perform the same query but add a pre-filter against the semantic_text field: Keepin’ it sparse (vector), keepin’ it real Similarly, sparse embedding models can be queried more specifically using semantic_text fields as well. Here’s an example script that adds a few more documents and uses the sparse_vector query: The sparse_vector query can be modified with extra options, to enable more advanced queries against the semantic_text field. Here, we perform the same query but add token pruning against a semantic_text field: This example significantly decreases the token frequency ratio required to pruning, which helps us show differences with such a small dataset, though they’re probably more aggressive than you’d want to see in production (remember, token pruning is about pruning irrelevant tokens to improve performance, not drastically change recall or relevance). You can see in this example that the Avril Lavigne song is no longer returned, and the scores have changed due to the pruned tokens. (Note that this is an illustrative example, and we still recommend a rescore adding pruned tokens back into scoring for most use cases). You’ll note that with all of these queries if you’re only querying a semantic_text field, you no longer need to specify the inference ID in knn ’s query_vector_builder or in the sparse_vector query. This is because it will be inferred from the semantic_text field. You can specify it if you want to override with a different (compatible!) inference ID for some reason or if you’re searching combined indices that have both semantic_text and sparse_vector or dense_vector fields though. Try it out yourself We’re keeping the original semantic query simple, but expanding our semantic search capabilities to power more use cases and seamlessly integrate semantic search with existing workflows. These power-ups are native to Elasticsearch and are already available in Serverless. They’ll be available in stack-hosted Elasticsearch starting with version 8.18. Try it out today! Report an issue Related content Vector Database June 24, 2024 Elasticsearch new semantic_text mapping: Simplifying semantic search Learn how to use the new semantic_text field type and semantic query for simplifying semantic search in Elasticsearch. 
CD MP By: Carlos Delgado and Mike Pellegrini Vector Database July 23, 2024 Introducing the sparse vector query: Searching sparse vectors with inference or precomputed query vectors Learn about the Elasticsearch sparse vector query, how it works, and how to effectively use it. KD By: Kathleen DeRusso Vector Database How To December 7, 2023 Introducing kNN Query: An expert way to do kNN search Explore how the kNN query in Elasticsearch can be used and how it differs from top-level kNN search, including examples. MS BT By: Mayya Sharipova and Benjamin Trent Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Jump to We made match happen! Expert-level semantic search with knn and sparse_vector All about that dense (vector), no trouble Keepin’ it sparse (vector), keepin’ it real Try it out yourself Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Semantic search, leveled up: now with native match, knn and sparse_vector support - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/semantic-search-match-knn-sparse-vector", + "meta_description": "Explore the changes that leveled up semantic search and learn how to use those features, including native match, knn, and sparse_vector support." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Stateless — your new state of find with Elasticsearch Learn about Elasticsearch stateless and explore the stateless architecture, which brings performance improvements and reduces costs. ML Research LL TB QH By: Leaf Lin , Tim Brooks and Quin Hoxie On October 6, 2022 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. With stateless Elasticsearch, we are investing in building a new fully cloud native architecture to push the boundaries of scale and speed. In this blog, we explore where we started, the future of Elasticsearch with the introduction of a stateless architecture and the details of this architecture. Where we started The first version of Elasticsearch was released in 2010 as a distributed scalable search engine allowing users to quickly search for and surface critical insights. 
Twelve years and over 65,000 commits later, Elasticsearch continues to provide users with battle-tested solutions to a wide variety of search problems. Thanks to the efforts of over 1,500 contributors, including hundreds of full-time Elastic employees, Elasticsearch has constantly evolved to meet the new challenges that arise in the field of search. Early in Elasticsearch's life when data loss concerns were raised, the Elastic team underwent a multiyear effort to rewrite the cluster coordination system to guarantee that acknowledged data is stored safely. When it became clear that managing indices in large clusters was a hassle, the team worked on implementing an extensive ILM solution to automate this work by allowing users to predefine index patterns and lifecycle actions. As users found a need to store significant amounts of metric and time series data, various features such as better compression were added to reduce data size. As the storage cost of searching extensive amounts of cold data grew we invested in creating Searchable Snapshots as a way to search user data directly on low cost object stores. These investments lay the groundwork for the next evolution of Elasticsearch. With the growth of cloud-native services and new orchestration systems, we have decided it is time to evolve Elasticsearch to improve the experience when working with cloud-native systems. We believe that these changes present opportunities for operational, performance, and cost improvements while running Elasticsearch on Elastic Cloud . Where we are going — Adopting a stateless architecture One of the primary challenges when operating or orchestrating Elasticsearch is that it depends on numerous pieces of persistent state, it is therefore a stateful system. The three primary pieces are the translog, index store, and cluster metadata. This state means that storage must be persistent and cannot be lost during a node restart or replacement. The existing Elasticsearch architecture on Elastic Cloud must duplicate indexing across multiple availability zones to provide redundancy in the case of outages. We intend to shift the persistence of this data from local disks to an object store, like AWS S3. By relying on external services for storing this data, we will remove the need for indexing replication, significantly reducing the hardware associated with ingestion. This architecture also provides very high durability guarantees because of the way cloud object stores such as AWS S3, GCP Cloud Storage, and Azure Blob Storage replicate data across availability zones. Offloading index storage into an external service will also allow us to re-architect Elasticsearch by separating indexing and search responsibilities. Instead of having primary and replica instances handling both workloads, we intend to have an indexing tier and a search tier. Separating these workloads will allow them to be scaled independently and hardware selection to be more targeted for the respective use cases. It also helps solve a longstanding challenge where search and indexing load can impact one another. After undertaking a multi-month proof-of-concept and experimental phase, we are convinced that these object store services meet the requirements we envision for index storage and cluster metadata. Our testing and benchmarks indicate that these storage services can meet the high indexing needs of the largest clusters we have seen in Elastic Cloud. 
Additionally, backing the data in the object store reduces indexing costs and allows for simple tuning of the performance of search. In order to search data, Elasticsearch will use the battled-tested Searchable Snapshots model where data is permanently persisted in the cloud-native object store and local disks are used as caches for frequently accessed data. To help differentiate, we describe our existing model as \"node-to-node\" replication. In the hot tier for this model, the primary and replica shards both do the same heavy lifting to handle ingest and serve search requests. These nodes are \"stateful\" in that they rely on their local disks to safely persist the data for the shards they host. Additionally, primary and replica shards are constantly communicating to stay in sync. They do this by replicating the operations performed on the primary shard to the replica shard, which means that the cost of those operations (CPU, mainly) is incurred for each replica specified. The same shards and nodes doing this work for ingest are also serving search requests, so provisioning and scaling must be done with both workloads in mind. Beyond search and ingest, shards in the node-to-node replication model handle other intensive responsibilities, such as merging Lucene segments. While this design has its merits, we saw a lot of opportunity based on what we've learned with customers over the years and the evolution of the broader cloud ecosystem. The new architecture enables many immediate and future improvements, including: You can significantly increase ingest throughput on the same hardware, or to look at it another way, significantly improve efficiency for the same ingest workload. This increase comes from removing the duplication of indexing operations for every replica. The CPU-intensive indexing operations only need to happen once on the indexing tier, which then ships the resulting segments to an object store. From there, the data is ready to be consumed as-is by the search tier. You can separate compute from storage to simplify your cluster topology. Today, Elasticsearch has multiple data tiers (content, hot, warm, cold, and frozen) to match data with hardware profile. Hot tier is for near real-time search and frozen is for less frequently searched data. While these tiers provide value, they also increase complexity. In the new architecture, data tiers will no longer be necessary, simplifying the configuration and operation of Elasticsearch. We are also separating indexing from search, which further reduces complexity and allows us to scale both workloads independently. You can experience improved storage costs on the indexing tier by reducing the amount of data that must be stored on a local disk. Currently, Elasticsearch must store a full shard copy on hot nodes (both primary and replica) for indexing purposes. With the stateless approach of indexing directly to the object store, only a portion of that local data is required. For append only use cases, only certain metadata will need to be stored for indexing. This will significantly reduce the local storage required for indexing. You can lower storage costs associated with search queries. By making the Searchable Snapshots model the native mode of searching data, the storage cost associated with search queries will significantly decrease. Depending on the search latency needs for users, Elasticsearch will allow adjustments to increase local caching on frequently requested data. 
Benchmarking — 75% indexing throughput improvement In order to validate this approach we implemented an extensive proof of concept where data was only indexed on a single node and replication was achieved through cloud object stores. We found that we could achieve a 75% indexing throughput improvement by removing the need to dedicate hardware to indexing replication. Additionally, the CPU cost associated with simply pulling data from the object store was much lower than indexing the data and writing it locally, as is necessary for the hot tier today. This means that search nodes will be able to fully dedicate their CPU to search. These performance tests were performed on a two node cluster against all three major public cloud providers (AWS, GCP, and Azure). We intend to continue to build out larger benchmarks as we pursue a production stateless implementation. Indexing Throughput CPU Usage Stateless for us, savings for you The stateless architecture on Elastic Cloud will allow you to reduce indexing overhead, independently scale ingest and search, simplify data tier management, and accelerate operations, such as scaling or upgrading. This is the first milestone towards a substantial modernization of the Elastic Cloud platform. Become part of our Elasticsearch stateless vision Interested in trying out this solution before everyone else? You can reach out to us on discuss or on our community slack channel . We would love your feedback to help shape the direction of our new architecture. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Where we started Where we are going — Adopting a stateless architecture Benchmarking — 75% indexing throughput improvement Stateless for us, savings for you Become part of our Elasticsearch stateless vision Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Stateless — your new state of find with Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/stateless-your-new-state-of-find-with-elasticsearch", + "meta_description": "Learn about Elasticsearch stateless and explore the stateless architecture, which brings performance improvements and reduces costs." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. Vector Database Lucene ML Research BT By: Benjamin Trent On November 18, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We have been progressively making vector search with Elasticsearch and Lucene faster and more affordable. Our main focuses have been not only improving the search speeds through SIMD, but also by reducting the cost through scalar quantization. First by 4x and then by 8x. However, this is still not enough. Through techniques like Product Quantization (referred to as PQ), 32x reductions can be achieved without significant costs in recall. We need to achieve higher levels of quantization to provide adequate tradeoffs for speed and cost. One way to achieve this is by focusing on PQ. Another is simply improving on binary quantization. Spoilers: BBQ is 10-50x faster at quantizing vectors than PQ BBQ is 2-4x faster at querying than PQ BBQ achieves the same or better recall than PQ So, what exactly did we test and how did it turn out? What exactly are we going to test? Both PQ and Better Binary Quantization have various pros vs. cons on paper. But we needed a static set of criteria from which to test both. Having an independent \"pros & cons\" list is too qualitative a measurement. Of course things have different benefits, but we want a quantitative set of criteria to aid our decision making. This is following a pattern similar to the decision making matrix explained by Rich Hickey . Our criteria were: Search speed Indexing speed flat Indexing speed with HNSW Merge speed Memory reduction possible Is the algorithm well known and battle tested in production environments? Is coarse grained clustering absolutely necessary? Or, how does this algorithm fair with just one centroid Brute-force oversampling required to achieve 95% recall HNSW indexing still works and can acheive +90% recall with similar reranking to brute-force Obviously, almost all the criteria were measurable, we did have a single qualitative criteria that we thought important to include. For future supportability, being a well known algorithm is important and if all other measures were tied, this could be the tipping point in the decision. How did we test it? Lucene and Elasticsearch are both written in Java, consequently we wrote two proof of concepts in Java directly. 
This way we get an apples-to-apples comparison on performance. Additionally, when doing Product Quantization, we only tested up to 32x reduction in space. While PQ does support further reduction in space by reducing the number of code books, we found that for many models recall quickly became unacceptable. Thus requiring much higher levels of oversampling. Additionally, we did not use Optimized PQ due to the compute constraints required for such a technique. We tested over different datasets and similarity metrics. In particular: e5Small , which only has 384 dimensions and whose vector space is fairly narrow compared to other models. You can see how poorly e5small with naive binary quantization performs in our bit vectors blog . Consequently, we wanted to ensure an evolution of binary quantization could handle such a model. Cohere's v3 model , which has 1024 dimensions and loves being quantized. If a quantization method doesn't work with this one, it probably won't work with any model. Cohere's v2 model , which has 768 dimensions and its impressive performance relies on the non-euclidean vector space of max-inner product. We wanted to ensure that it could handle non-euclidean spaces just as well as Product Quantization. We did testing locally on ARM based macbooks & remotely on larger x86 machines to make sure any performance differences we discovered were repeatable no matter the CPU architecture. Well, what were the results? e5small quora This was a smaller dataset, 522k vectors built using e5small. Its few dimensions and narrow embedding space make it prohibitive to use with naive binary quantization. Since BBQ is an evolution of binary quantization, verifying that it worked with such an adverse model in comparison with PQ was important. Testing on an M1 Max ARM laptop: Algorithm quantization build time (ms) brute-force latency (ms) brute-force recall @ 10:50 hnsw build time (ms) hnsw recall @ 10:100 hnsw latency (ms) BBQ 1041 11 99% 104817 96% 0.25 Product Quantization 59397 20 99% 239660 96% 0.45 CohereV3 This model excels at quantization. We wanted to do a larger number of vectors (30M) in a single coarse grained centroid to ensure our smaller scale results actually translate to higher number of vectors. This testing was on a larger x86 machine in google cloud: Algorithm quantization build time (ms) brute-force latency (ms) brute-force recall @ 10:50 hnsw build time (ms) hnsw recall @ 10:100 hnsw latency (ms) BBQ 998363 1776 98% 40043229 90% 0.6 Product Quantization 13116553 5790 98% N/A N/A N/A When it comes to index and search speed at similar recall, BBQ is a clear winner. Inner-product search and BBQ We have noticed in other experiments that non-euclidean search can be tricky to get correct when quantizing. Additionally, naive binary quantization doesn't care about vector magnitude, vital for inner-product. Well, footnote in hand, we spent a couple of days on the algebra as we needed to adjust the corrective measures applied at the end of the query estimation. Success! Algorithm recall 10:10 recall 10:20 recall 10:30 recall 10:40 recall 10:50 recall 10:100 BBQ 71% 87% 93% 95% 96% 99% Product Quantization 65% 84% 90% 93% 95% 98% That wraps it up The complete decision matrix for BBQ vs Product Quantization. We are pretty excited about Better Binary Quantization (BBQ). We have been hard at work kicking the tires and are continually surprised at the quality of results we get with just a single bit of information retained per vector dimension. 
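BBQ itself applies corrective factors that are not shown here, but a toy sketch of naive 1-bit quantization helps make the "single bit per dimension" point concrete: 384 float32 dimensions (1,536 bytes) collapse to 384 bits (48 bytes), a 32x reduction, and the cheap Hamming-distance pass is then followed by rescoring the best candidates with the full-precision vectors (the oversampling discussed above). This is only an illustration, not the BBQ algorithm:

```python
import numpy as np

rng = np.random.default_rng(42)
docs = rng.normal(size=(1000, 384)).astype(np.float32)   # toy corpus vectors
query = rng.normal(size=384).astype(np.float32)

# Naive binary quantization: keep only the sign of each component.
doc_bits = np.packbits(docs > 0, axis=1)    # shape (1000, 48) -> 32x smaller
query_bits = np.packbits(query > 0)

# Hamming distance as a cheap first-pass similarity proxy.
hamming = np.unpackbits(doc_bits ^ query_bits, axis=1).sum(axis=1)

# Oversample candidates, then rerank them with the full-precision vectors.
candidates = np.argsort(hamming)[:50]
exact_scores = docs[candidates] @ query
top10 = candidates[np.argsort(-exact_scores)[:10]]
print(top10)
```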
Look for it coming in an Elasticsearch release near you! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to What exactly are we going to test? How did we test it? Well, what were the results? e5small quora CohereV3 Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Better Binary Quantization (BBQ) vs. Product Quantization - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/bit-vectors-elasticsearch-bbq-vs-pq", + "meta_description": "Explore why Elastic chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open Inference API support for AlibabaCloud AI Search Discover how to use Elasticsearch vector database with AlibabaCloud AI Search, which offers inference, reranking, and embedding capabilities. Integrations Vector Database How To DK W By: Dave Kyle and Weizijun On September 18, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Our latest addition to the Elasticsearch Open Inference API is the integration of AlibabaCloud AI Search. This work enables Elastic users to connect directly with the AlibabaCloud AI platform. 
Developers building RAG applications using the Elasticsearch vector database can store and use dense and sparse embeddings generated from models hosted on AlibabaCloud AI Search platform with semantic_text. In addition, Elastic users now have integrated access to reranking models for enhanced semantic reranking and the Qwen LLM family. In this blog, we explore how to integrate AlibabaCloud's AI services with Elasticsearch. You'll learn how to set up and use Alibaba's completion, rerank, sparse embedding, and text embedding services within Elasticsearch. The broad set of supported models integrated into inference task types will enhance the relevance of many use cases including RAG. We’re grateful to the Alibaba team for contributing support for these task types to Elasticsearch open inference API! Let’s walk through examples of how to configure and use these services within an Elasticsearch environment. Note Alibaba uses the term service_id instead of model_id . Using a base model in AlibabaCloud AI Search platform This walkthrough assumes you already have an AlibabaCloud Account with access to AlibabaCloud AI Search platform. Next, you’ll need to create a workspace and API key for Inference creation. Creating an inference API endpoint in Elasticsearch In Elasticsearch, create your endpoint by providing the service as “alibabacloud-ai-search”, and the service settings including your workspace, the host, the service id and your api keys to access AlibabaCloud AI Search platform. In our example, we're creating a text embedding endpoint using \"ops-text-embedding-001\" as the service id. You will receive a response from Elasticsearch with the endpoint that was created successfully: Note that there are no additional settings for model creation. Elasticsearch will automatically connect to the AlibabaCloud AI Search platform to test your credentials and the service id, and fill in the number of dimensions and similarity measures for you. Next, let’s test our endpoint to ensure everything is set up correctly. To do this, we’ll call the perform inference API: The API call will return the generated embeddings for the provided input, which will look something like this: You are now ready to start exploring. After you have tried these examples, have a look at some new exciting innovations in Elasticsearch for semantic search use cases: The new semantic_text field simplifies storage and chunking of embeddings - just pick your model and Elastic does the rest! Introduced in 8.14, retrievers allow you to setup multi-stage retrieval pipelines But first, let’s dive into our examples! I. Completion To start, Alibaba Cloud provides several models for chat completion, with service IDs listed in their API documentation . Step 1: Configure the Completion Service First, set up the inference service for text completion: Response Step 2: Issue a Completion Request Using the configured endpoint, send a POST request to generate a completion: Returns Uniquely, for this Elastic Inference API integration with Alibaba, chat history can be included in the inputs, in this example, we’ve included the previous response and added: “What fun things are there?” The response clearly includes the history In future updates, we plan to allow users to explicitly include chat history, improving the ease of usage. II. Rerank Moving on to our next task type, rerank . Reranking helps re-order search results for improved relevance, using Alibaba's powerful models. 
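The request and response bodies for the endpoint creation and the perform inference call described above are not included in this extract. A rough sketch of those two calls, using plain HTTP from Python — everything in angle brackets is a placeholder, the inference endpoint name is made up, and TLS/auth handling is deliberately simplified:

```python
import requests

ES_URL = "https://localhost:9200"   # placeholder; point at your deployment
AUTH = ("elastic", "changeme")      # or use an API key header instead

# Create a text embedding endpoint backed by AlibabaCloud AI Search.
requests.put(
    f"{ES_URL}/_inference/text_embedding/alibabacloud_ai_search_embeddings",
    auth=AUTH,
    json={
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": "<alibaba-cloud-api-key>",
            "host": "<workspace-host>",
            "workspace": "<workspace-name>",
            "service_id": "ops-text-embedding-001",
        },
    },
).raise_for_status()

# Test the endpoint with the perform inference API.
resp = requests.post(
    f"{ES_URL}/_inference/text_embedding/alibabacloud_ai_search_embeddings",
    auth=AUTH,
    json={"input": "What is Elasticsearch?"},
)
print(resp.json())  # returns the generated embeddings for the input
```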
If you want to read more about this concept, have a look at this blog on Elastic Search Labs . Step 1: Configure the Rerank Service Configure the reranking inference service: Step 2: Issue a Rerank Request Send a POST request to rerank your search query results: The rerank interface does not require a lot of configuration (task_settings), it returns the relevance scores ordered by the most relevant first and the index of the document in the input array. III. Sparse Embedding Alibaba provides a model specifically for sparse embeddings, we will use ops-text-sparse-embedding-001 for our example. Step 1: Configure the Sparse Embedding Service Step 2: Issue a Sparse Embedding query Sparse has task_settings for: input_type - either ingest or search return_token - if true include the token text in the response, else it is a number With return_token==false IV. Text Embedding Alibaba also offers text embedding models for different tasks. Step 1: Configure the Text Embedding Service Embeddings has one task_setting: input_type - either ingest or search Step 2: Issue a Text Embedding Request Send a POST request to generate a text embedding: AI search with Elastic and AlibabaCloud Whether you're using Elasticsearch for implementing hybrid search, semantic reranking, or enhancing RAG use cases with summarization, the connection to AlibabaCloud's AI Services opens up a new world of possibilities for Elasticsearch developers. Thanks again, Alibaba team, for the contribution! To dive deep, try this Jupyter notebook with an end-to-end example of using Inference API with the Alibaba Cloud AI Search. Read Alibaba Cloud's announcement about AI-powered search innovations with Elasticsearch. Users can start using this with Elasticsearch Serverless environments today and in an upcoming version of Elasticsearch. Happy searching! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Using a base model in AlibabaCloud AI Search platform Creating an inference API endpoint in Elasticsearch I. Completion Step 1: Configure the Completion Service Step 2: Issue a Completion Request Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
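The rerank walkthrough above (Step 1 and Step 2) follows the same pattern. A hedged sketch — the rerank service_id is a placeholder you would look up in AlibabaCloud's documentation, and the endpoint name is made up:

```python
import requests

ES_URL = "https://localhost:9200"   # placeholder
AUTH = ("elastic", "changeme")

# Step 1: configure the rerank endpoint.
requests.put(
    f"{ES_URL}/_inference/rerank/alibabacloud_ai_search_rerank",
    auth=AUTH,
    json={
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": "<alibaba-cloud-api-key>",
            "host": "<workspace-host>",
            "workspace": "<workspace-name>",
            "service_id": "<rerank-service-id>",
        },
    },
)

# Step 2: rerank candidate documents against a query. Per the post, the response
# contains relevance scores ordered most relevant first, together with the index
# of each document in the original input array.
resp = requests.post(
    f"{ES_URL}/_inference/rerank/alibabacloud_ai_search_rerank",
    auth=AUTH,
    json={
        "query": "What is Elasticsearch?",
        "input": [
            "Elasticsearch is a distributed search and analytics engine.",
            "Hangzhou is a city in eastern China.",
        ],
    },
)
print(resp.json())
```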
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open Inference API support for AlibabaCloud AI Search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-alibaba-cloud-inference-api", + "meta_description": "Integrate AlibabaCloud's AI services with Elasticsearch. Learn how to use Elasticsearch vector database with AlibabaCloud AI Search, which offers inference, reranking, and embedding capabilities." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Indexing OneLake data into Elasticsearch Learn how to connect to OneLake and index documents into Elasticsearch. Then, take the configuration one step further by developing your own OneLake connector. Part1 Integrations Ingestion +1 January 23, 2025 Indexing OneLake data into Elasticsearch - Part 1 Learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. GL By: Gustavo Llermaly Part2 Integrations Ingestion +1 January 24, 2025 Indexing OneLake data into Elasticsearch - Part II Second part of a two-part article to index and search OneLake data into Elastic using a Custom connector. GL JR By: Gustavo Llermaly and Jeffrey Rengifo Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Indexing OneLake data into Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/indexing-onelake-data-into-elasticsearch", + "meta_description": "Learn how to connect to OneLake and index documents into Elasticsearch. Then, take the configuration one step further by developing your own OneLake connector." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Testing your Java code with mocks and real Elasticsearch Learn how to write your automated tests for Elasticsearch, using mocks and Testcontainers Java How To PP By: Piotr Przybyl On October 3, 2024 Part of Series Integration tests using Elasticsearch Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! 
Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this post, we will introduce and explain two ways of testing software using Elasticsearch as an external system dependency. We'll cover tests using mocks as well as integration tests, show some practical differences between them, and give some hints on where to go for each style. Good tests for system confidence A good test is a test that increases the confidence of every person involved in the process of creating and maintaining an IT system. Tests aren't meant to be cool, fast, or to artificially increase code coverage. Tests play a vital role in ensuring that: What we want to deliver is going to work in production. The system satisfies the requirements and the contracts. There won't be regressions in the future. Developers (and other involved team members) are confident that what they have created will work. Of course, this doesn't mean that tests can't be cool, fast, or increase code coverage. The faster we can run our test suite, the better. It's just that in the pursuit of reducing the overall duration of the testing suite, we should not sacrifice the reliability, maintainability and confidence the automated tests give us. Good automated tests make the various team members more confident: Developers: they get to confirm that what they're doing works (even before the code they work on leaves their machine). Quality assurance team: they have less to test manually. System operators and SREs: are more relaxed, because the systems are easier to deploy and maintain. Last but not least: the architecture of a system. We love when systems are organized, easy to maintain, and the architecture is clean and serves its purpose. However, sometimes we might see an architecture which sacrifices too much for the excuse known as \"it's more testable this way\". There's nothing wrong with being very testable – only when the system is written primarily to be testable instead of serving the needs justifying its existence, we see a situation when the tail wags the dog. Two kinds of tests: Mocks & dependencies There are many ways the tests can be seen, and thus classified. In this post I'll focus on only one aspect of dividing the tests: using mocks (or stubs, or fakes, or ...) vs. using real dependencies. In our case the dependency is Elasticsearch. Tests using mocks are very fast because they don't need to start any external dependencies and everything happens only in memory. Mocking in automated testing is when fake objects are used instead of real ones to test parts of a program without using the actual dependencies. This is the reason they're needed and why they shine in any fast-detection-net tests, e.g. validation of input. There's no need to start a database and make a call to it only to verify that negative numbers in a request aren't allowed, for example. However, introducing mocks has several implications: Not everything and every time can be mocked easily, hence mocks have impact on the architecture of the system (which sometimes is great, sometimes not so much). Tests running on mocks might be fast, but developing such tests can take quite some time because the mocks deeply reflecting the systems they mimic usually aren't given for free. 
Someone who knows how the system works needs to write the mocks the proper way, and this knowledge can come from practical experience, studying documentation, and so on. Mocks need to be maintained. When your system depends on an external dependency, and you need to upgrade this dependency, someone has to ensure that the mocks mimicking the dependency also get updated with all the changes: breaking, documented and undocumented (which can also have an impact on our system). This becomes especially painful when you want to upgrade a dependency but your (using only mocks) test suite can't give you any confidence that all the tested cases are guaranteed to work. It takes discipline to ensure that the effort goes towards developing and testing the system, not the mocks. For these reasons many people advocate going exactly the opposite direction: never use mocks (or stubs, etc.), but rely solely on real dependencies. This approach works very nicely in demos or when the system is tiny and has only a few test cases generating huge coverage. Such tests can be integration tests (roughly speaking: checking a part of a system against some real dependencies) or end-to-end tests (using all real dependencies at the same time and checking the behavior of the system at all ends, while playing user workflows which define the system as usable and successful). A clear benefit of using this approach is that we also (often unintentionally) verify our assumptions about the dependencies, and how we integrate them against the system we're working on. However, when tests are using only real dependencies, we need to consider the following aspects: Some test scenarios don't need the actual dependency (e.g. to verify the static invariants of a request). Such tests usually aren't run in whole suites at developers' machines, because waiting for feedback would take too much time. They require more resources at CI machines, and it might take more time to tune things to not waste time & resources. It might not be trivial to initialize dependencies with test data. Tests with real dependencies are great for cordoning code before major refactoring, migration or dependency upgrade. They are more likely to be opaque tests, i.e. not being detailed about the internals of the system under test, but taking care of their results. The sweet spot: use both tests Instead of testing your system with just one kind of test, you can rely on both kinds where it makes sense and try to improve your usage of both of them. Run mock-based tests first because they are much faster, and only when all are successful, run slower dependency tests only after. Choose mocks for scenarios where external dependencies aren't really needed: when mocking would take too much time the code should be massively altered just for that; rely on external dependencies. There is nothing wrong in testing a piece of code using both approaches, as long as it makes sense. Example of SystemUnderTest For the next sections we're going to use an example which can be found here . It's a tiny demo application written in Java 21, using Maven as the build tool, relying on Elasticsearch client and using Elasticsearch's latest addition, using ES|QL (Elastic's new procedural query language). If Java is not your programming language, you should still be fine to understand the concepts we're going to discuss below and translate them to your stack. It's just using a real code example makes certain things easier to explain. 
The BookSearcher helps us handle search and analyze data, being books in our case (as demonstrated in one of the previous posts ). It requires Elasticsearch exactly in version 8.15.x as its only dependency (see isCompatibleWithBackend() ), e.g. because we're not sure if our code is forward-compatible, and we're sure it's not backwards compatible. Before upgrading Elasticsearch in production to a newer version, we shall first bump it in the tests to ensure that the behaviour of the System Under Test remains the same. We can use it to search for the number of books published in a given year (see numberOfBooksPublishedInYear ). We might also use it when we need to analyze our data set and find out the 20 most published authors between two given years (see mostPublishedAuthorsInYears ). Test with mocks to start with For creating the mocks used in our tests we're going to use Mockito , a very popular mocking library in Java ecosystem. We might begin with the following, to have mocks reset before each test: As we said before, not everything can be easily tested using mocks. But some things we can (and probably even should). Let's try verifying that the only supported version of Elasticsearch is 8.15.x for now (in the future we might extend the range once we confirm our system is compatible with future versions): We can verify similarly (simply by returning a different minor version), that our BookSearcher is not going to work with 8.16.x yet, because we're not sure if it's going to be compatible with it: Now let's see how we can achieve something similar when testing against a real Elasticsearch. For this we're going to use Testcontainers' Elasticsearch Module , which has only one requirement: it needs access to Docker, because it runs Docker containers for you. From a certain angle Testcontainers is simply a way of operating Docker containers, but instead of doing that in your Docker Desktop (or similar), in your CLI, or scripts, you can express your needs in the programming language you know. This makes fetching images, starting containers, garbage-collecting them after tests, copying files back and forth, executing commands, examining logs, etc. possible directly from your test code. The stub might look like this: In this example we rely on Testcontainers' JUnit integration with @Testcontainers and @Container , meaning we don't need to worry about starting Elasticsearch before our tests and stopping it after. The only thing we need to do is to create the client before each test and close it after each test (to avoid resource leaks, which could impact bigger test suites). Annotating a non-static field with @Container means, that a new container will be started for each test, hence we don't have to worry about stale data or resetting the container's state. However, with many tests, this approach might not perform well, so we're going to compare it with alternatives in one of the next posts. Note: By relying on docker.elastic.co (Elastic's official Docker image repository), you avoid exhausting your limits on Docker hub. It is also recommended to use the same version of your dependency in your tests and production environment, to ensure maximum compatibility. We also recommend being precise with selecting the version, for this reason, there is no latest tag for Elasticsearch images. 
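The post's examples are written in Java with Mockito and Testcontainers, and it explicitly invites readers on other stacks to translate the ideas. As a rough, hypothetical Python analogue of the mock-based version check (not the demo's actual BookSearcher code, which reads the version from an ES|QL result set), using only the standard library:

```python
import unittest
from unittest.mock import MagicMock


class BookSearcher:
    """Hypothetical port of the demo's version gate: only 8.15.x is supported."""

    def __init__(self, client):
        self.client = client

    def is_compatible_with_backend(self) -> bool:
        version = self.client.info()["version"]["number"]
        return version.startswith("8.15.")


class BookSearcherMockTest(unittest.TestCase):
    def test_supported_version(self):
        client = MagicMock()
        client.info.return_value = {"version": {"number": "8.15.3"}}
        self.assertTrue(BookSearcher(client).is_compatible_with_backend())

    def test_unsupported_minor_version(self):
        # 8.16.x is rejected until compatibility has been confirmed.
        client = MagicMock()
        client.info.return_value = {"version": {"number": "8.16.0"}}
        self.assertFalse(BookSearcher(client).is_compatible_with_backend())


if __name__ == "__main__":
    unittest.main()
```

The same class could then be exercised in an integration test against a containerized Elasticsearch, exactly as the post goes on to describe for the Java version.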
Connecting to Elasticsearch in tests Elasticsearch Java client is capable of connecting to Elasticsearch running in a test container even with security and SSL/TLS enabled (which are default for versions 8.x, that's why we didn't have to specify anything related to security in the container declaration.) Assuming the Elasticsearch you're using in production has also TLS and some security enabled, it is recommended to go for the integration test setup as close to production scenario as possible, and therefore not disabling them in tests. How to obtain data necessary for connection, assuming the container is assigned to field or variable elasticsearch : elasticsearch.getHost() will give you the host on which the container is running (which most of the time will be probably \"localhost\" , but please don't hardcode this as sometimes, depending on your setup, it might be another name, therefore the host should always be obtained dynamically). elasticsearch.getMappedPort(9200) will give the host port you have to use to connect to Elasticsearch running inside the container (because every time you start the container, the outside port is different, so this has to be a dynamic call as well). Unless they were overwritten, the default username and password are \"elastic\" and \"changeme\" respectively. If there was no SSL/TLS certificate specified during the container setup, and the secured connectivity is not disabled (which is the default behaviour from versions 8.x), a self-signed certificate is generated. To trust it (e.g. like cURL can do ) the certificate can be obtained using elasticsearch.caCertAsBytes() (which returns Optional ), or another convenient way is to get SSLContext using createSslContextFromCa() . The overall result might look like this: Another example of creating an instance of ElasticsearchClient can be found in the demo project . Note : For creating client in production environments please refer to the documentation . First integration test Our very first test, verifying that we can create BookSearcher using Elasticsearch version 8.15.x, might look like this: As you can see, we don't need to set up anything else. We don't need to mock the version returned by Elasticsearch, the only thing we need to do is to provide BookSearcher with a client connected to a real instance of Elasticsearch, which has been started for us by Testcontainers. Integration tests care less about the internals Let's do a little experiment: let's assume that we have to stop extracting data from the result set using column indices, but have to rely on column names. So in the method isCompatibleWithBackend instead of we are going to have: When we re-run both tests we'll notice, that the integration test with real Elasticsearch still passes without any issues. However, the tests using mocks stopped working, because we mocked calls like rs.getInt(int) , not rs.getInt(String) . To have them passing, we now have to either mock them instead, or mock them both, depending on other use cases we have in our test suite. Integration tests can be a cannon to kill a fly Integration tests are capable of verifying the behaviour of the system, even if external dependencies aren't needed. However, using them this way is usually a waste of execution time and resources. Let's look at the method mostPublishedAuthorsInYears(int minYear, int maxYear) . The first two lines are as follows: The first statement is checking a condition, which doesn't depend on Elasticsearch (or any other external dependency) in any way. 
Therefore, we don't need to start any containers to merely verify, that if the minYear is greater than maxYear , an exception is thrown. A simple mocking test, which is also fast and not resource-heavy is more than enough to ensure that. After setting up the mocks, we can simply go for: Starting a dependency, instead of mocking, would be wasteful in this test case because there's no chance of making a meaningful call for this dependency. However, to verify the behaviour starting with String query = ... , that the query is written correctly, gives results as expected: the client library is capable of sending proper requests and responses, there are no syntax changes and so it's way easier to use an integration test, e.g.: This way, we can rest assured that when we feed our data to Elasticsearch (in this or any future version we choose to migrate to), our query is going to give us exactly what we expected: the data format didn't change, the query is still valid, and all the middleware (clients, drivers, security, etc.) will to continue to work. We don't have to worry about keeping the mocks up to date, the only change needed to ensure compatibility with e.g. 8.15 would be changing this: The same happens if you decide to e.g. use good old QueryDSL instead of ES|QL: the results you receive from the query (regardless of the language) should still be the same. Use both approaches when needed The case of the method mostPublishedAuthorsInYears illustrates that a single method can be tested using both methods. And perhaps even should be. Using only mocks means we have to maintain the mock and have zero confidence when upgrading our system. Using only integration tests would mean that we're wasting quite a lot of resources, without needing them at all. Let's recap Using both mocking and integration tests with Elasticsearch is possible. Use mocking tests as fast-detection-net and only if they pass successfully, start tests with dependencies (e.g. using ./mvnw test '-Dtest=!TestInt*' && ./mvnw test '-Dtest=TestInt*' or Failsafe and Surefire plugins). Use mocks when testing the behaviour of your system (\"lines of code\") where integration with external dependencies doesn't really matter (or could even be skipped). Use integration tests to verify your assumptions about and integration with external systems. Don't be afraid to test using both approaches – if it makes sense – according to the points above. One could make an observation, that being so strict about the version (in our case 8.15.x ) is too much. Using just the version tag alone could be, but please be aware that in this post it serves as the representation of all other features that might change between the versions. In the next installment in the series , we'll look at ways of initialising Elasticsearch running in a test container, with test data sets. Let us know if you built anything based on this blog or if you have questions on our Discuss forums and the community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Good tests for system confidence Two kinds of tests: Mocks & dependencies The sweet spot: use both tests Example of SystemUnderTest Test with mocks to start with Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Testing your Java code with mocks and real Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/tests-with-mocks-and-real-elasticsearch", + "meta_description": "Explore how to test java using Elasticsearch as an external system dependency. This blog covers integration and mocks testing as well as testcontainers." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Navigating an Elastic vector database An overview of operating a modern Elastic vector database with practical code samples. Vector Database How To JC By: Justin Castilla On September 25, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector databases are quickly becoming the de facto data store for semantic search , a type of search that considers context and meaning of content over traditional keyword search. Elastic has consistently provided modern tooling to perform semantic search, and it is important to identify and understand the larger mechanisms required to query a vector database. This article will cover the necessary components to operate a vector database within Elasticsearch. We will explore the underlying technologies to create an efficient system to ensure the best performance and speed balance. 
Topics covered include: Current Elastic Full-Text Search ( BM25 & TF/IDF ) Vectors, Embedding, and necessary Considerations Choosing an Embedding Model Indexing a Vector Vector Query Algorithms Running the code to create your own vector database To create your own vector database as shown within this article, you’ll need an Elasticsearch instance optimized for machine learning (8.13 or later) Docker Desktop or a similar Docker container manager Python 3.8 (or later) Elasticsearch Python Client An associated repository is available here to illustrate the various moving parts necessary for querying a vector database. We will observe code snippets throughout the article from this source. The example repository will create an index of approximately 10,000 objects supplied in a JSON file representing science fiction books reviewed on goodreads.com . Each object will have a vector embedding of text that describes the book. The goal of this article and repository is to demonstrate searching for books based on vector similarity between a provided query string and the embedded book description vectors. Here is a sample book object that we will be using within our code samples. Current Elastic full-text search ( BM25 & TF/IDF ) To understand the scope and benefit of vector databases over traditional full-text search databases, it is worth taking a moment to review the current underlying technology that powers Elasticsearch, BM25, and its predecessor, TF/IDF. TF/IDF TF/IDF is the statistical measure of the frequency and importance of a query term based on how often it appears in an individual document and its rate of occurrence within an entire index of documents. Term Frequency (TF): This is a measure of how often a term occurs within a document. A higher occurrence of the term within a document, the higher the likelihood that the document will be relevant to the original query. This is measured as a raw count or normalized count based on the occurrence of the term divided by the total number of terms in the document Inverse Document Frequency (IDF): This is a measure of how important a query term is based on the overall frequency of use over all documents in an index. A term that occurs across more documents is considered less important and informative as is thus given less weight in the scoring. This is calculated as the logarithm of the total documents divided by the number of documents that contain the query term. I D F = l o g ( n t ​ N ​ ) IDF=log(\\frac{nₜ}{​N}​)\\newline I D F = l o g ( ​ N n t ​ ​ ​ ) nₜ = number of documents containing the term N = total number of documents Using our example of Science Fiction books, lets break down a few sentences with a focus on the words \"space\" and \"whales\". \"The interstellar whales graceully swam through the cosmos onto their next adventure.\" \"The space pirates encountered a human space station on their trip to Venus, ripe for plunder\" \"In space no one can year you scream.\" \" Space is the place to be.\" \"Purrgil were a semi-sentient species of massive whales that lived in deep space , traveling from star system to star system.\" Space The term \"space\" appears in 4 documents (Sentences 2, 3, 4, and 5). Total number of documents (N) = 5. IDF formula for “space”: I D F = log ⁡ ( 5 4 ) ≈ 0.10 IDF=\\log \\left( \\frac{5}{4} \\right)≈0.10\\ I D F = lo g ( 4 5 ​ ) ≈ 0.10 \"space\" is considered less important because it appears frequently across 4 of the 5 sentences. Whales The term \"whales\" appears in 2 documents (Sentences 1 and 5). 
Total number of documents (N) = 5. IDF formula for “whales”: I D F = log ⁡ ( 5 2 ) ≈ 0.40 IDF=\\log \\left( \\frac{5}{2} \\right)≈0.40\\ I D F = lo g ( 2 5 ​ ) ≈ 0.40 \"Whales\" is considered relatively important because it appears in only 2 of 5 sentences. \"Whales\" is assigned a higher IDF value because it appears in fewer documents, making it more distinctive and relevant to the context of those documents. Conversely, \"space\" is more common across the documents, leading to a lower IDF score, thus indicating that it is less useful for distinguishing between the documents. In the context of our codebase's 10,000 book ojbjects, the term \"space\" occurs 1,319 times, while the term \"whale\" appears 6 times total. It is understandable that a search for \"space whales\" would first prioritize the occurence of \"whales\" as more important than \"space.\" While still considered a powerful search algorithm in its own right, TF/IDF fails to prevent search bias on longer documents that may have a proportionately larger amount of term occurrences compared to a smaller document. BM25 BM25 (Best Match 25) is an enhanced ranking function using IDF components to create a score outlining the relevance of a document over others within an index. Term Frequency Saturation: It has been noted that at a certain point, having a high occurrence of a query term in a document does not significantly increase its relevance compared to other documents with a similarly high count. BM25 introduces a saturation parameter for term frequency, which reduces the impact of the term frequency logarithmically as the count increases. This prevents very large documents with high term occurrences from disproportionately affecting the relevance score, ensuring that all scores level off after a certain point. Enhanced Inverse Document Frequency (IDF): In cases where a query term appears in all documents or has a very low occurrence, the IDF score might become zero. To avoid this, BM25 adds a value of 0.5 to the calculation, ensuring that the IDF scores do not reach zero or become too low. Average Document Length : This is a consideration of the actual length of the document. It's understood that longer documents may naturally have more occurrences of a term compared to shorter documents, which doesn't necessarily mean they are more relevant. This adjustment compensates for document length to avoid a bias towards longer documents simply due to their higher term frequencies. BM25 is an excellent tool for efficient search and retrieval of exact-matches with text. Since 2016 it has been the default search algorithm for Lucene (versions 6.0 and higher), the underlying low-level search engine used by Elasticsearch. It should be noted that BM25 cannot provide semantic search as with vectors, which provides an understanding of the context of the query, rather than pure keyword search. It should also be noted that vocabulary mismatch may cause issues with receiving proper results. As an example, a user-submitted query using the term \"cosmos\" may not retrieve the intended results if the words don't exactly match, such as documents containing the term \"space.\" It is understood that \"cosmos\" is another term for \"space\", but this isn't explicity known or checked for with the default BM25f algorithm. Knowing when to choose traditional keyword search over semantic search is crucial to ensure efficient use of computational resources. 
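A quick back-of-the-envelope check of the IDF numbers in the worked example above (using base-10 logarithms, which reproduces the ≈0.10 and ≈0.40 figures):

```python
import math

sentences = [
    "The interstellar whales gracefully swam through the cosmos onto their next adventure.",
    "The space pirates encountered a human space station on their trip to Venus, ripe for plunder.",
    "In space no one can hear you scream.",
    "Space is the place to be.",
    "Purrgil were a semi-sentient species of massive whales that lived in deep space, "
    "traveling from star system to star system.",
]


def idf(term, docs):
    """IDF = log10(N / n_t), as defined above."""
    n_t = sum(term in doc.lower() for doc in docs)
    return math.log10(len(docs) / n_t)


print(f"space:  {idf('space', sentences):.2f}")   # ~0.10, appears in 4 of 5 documents
print(f"whales: {idf('whales', sentences):.2f}")  # ~0.40, appears in 2 of 5 documents
```

The rarer term "whales" gets the higher IDF, which is exactly why a query for "space whales" weighs it more heavily than "space".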
Running the code: A full-text search (BM25) Further Reading: The BM25 Algorithm and its variables BM25: A Guide to Modern Information Retrieval BM25 The Next Generation of Lucene Relevance Vectors & vector databases Vector databases essentially store and index mathematical representations (vectors) of documents for similarity search. Vectorization of data allows complex and nuanced documents (text, images, audio, video, etc.) to be normalized into a format that computers can compare against other vectors with consistent similarity results. It is important to understand the many mechanisms in place to provide a production-ready solution. What is a vector? A vector is a representation of data projected into the mathematical realm as an array of numbers. With numbers instead of words, comparisons are very efficient for computers and thus offer a considerable performance boost. Nearly every conceivable data type (text, images, audio, video, etc.) used in computing may be converted into vector representations. Images are broken down to the pixel, and visual patterns such as textures, gradients, corners, and transparencies are captured as numeric representations. Words, phrases, and entire sentences are likewise analyzed, assigned various sentiment, contextual, and synonym values, and converted to arrays of floating-point numbers. It is within these multidimensional representations that systems are able to discern numeric similarities in certain portions of a vector to find similarly colored inventory on commerce sites, surface answers to coding questions on Elastic.co, or recognize the voice of a Nobel Prize winner. Each data type benefits from a dedicated vector embedding model, which can best identify and store the various characteristics of that particular type. A text embedding model excels at understanding common phrases and nuanced alliteration, while completely failing to recognize the emotions displayed on a posing figure in an image. Above we can see the embedding model receiving three different strings as input and producing three distinct vectors (arrays of floats) as output. Embedding models When converting data (in this case, text) into a vector, a model is used. An embedding model is a pre-trained machine-learning model that converts text (words, phrases, and sentences) into numerical representations. These representations become multidimensional arrays of floats, with each dimension representing a different characteristic of the original text, such as sentiment, context, and syntax. These representations allow for comparison with other vectors to find similar documents and text fragments. Different embedding models have been developed that provide various benefits; some are extremely hardware efficient and can be run with less computational power. Some have a greater “understanding” of the context and content of the index they store into, and can answer questions, perform text summarization, and lead a threaded conversation. Some focus on striking an acceptable balance between performance, speed, and efficiency. Choosing an embedding model We will briefly describe three models for text embeddings and their various properties. Word2Vec : Efficient and simple to train, this model provides good quality embeddings for words based on their context as well as semantic meanings. Word2Vec is best used in applications with semantic similarity needs, sentiment analysis, or when computational resources are limited. 
GloVe : Similar to Word2Vec in many respects, GloVe builds a global awareness across an entire index. By analyzing word co-occurrences throughout the entire text, GloVe captures both the frequency and context of words, resulting in embeddings that reflect the overall relationships and meanings of words across the totality of the stored vectors. BERT : Unlike Word2Vec and GloVe, BERT (Bidirectional Encoder Representations from Transformers) looks at the text before and after each word to establish local context and semantic meaning. Pre-trained on a large body of text, this model excels at question-and-answer tasks as well as sentiment analysis. Running the code: Creating an ingest pipeline to embed vectors For the coding example, a smaller, simpler version of BERT was chosen, called sentence-transformers__msmarco-minilm-l-12-v3 . It is considered a MiniLM, which is more efficient than normal-sized models yet still retains the performance needed for vector similarity. This is a good model choice for a non-production tutorial to get the code running quickly, with no fine-tuning necessary. More information about the model is available here . Below we are creating an ingest pipeline for our index books . This means that all book objects that are created and stored in the Elasticsearch index will automatically have their book_description field converted to a vector embedding named description_embedding . This reduces the codebase necessary to create new book objects on the client side. If there is a failure, the documents will be stored in the failure-books index and an error message will be included in the documents. This allows you to view any errors that may have caused the failed embedding, and to re-index the failure index with updated code, ensuring no documents are lost or left behind. Note: Since the workload of embedding the vectors is passed on to Elastic via this inference ingest pipeline, a larger tier of available CPU and RAM in this Elasticsearch cloud instance may be desirable to allow for quick embedding and indexing of new and updated documents. If the workload of embedding is left to a local application codebase, consideration should also be given to the necessary compute hardware during times of high throughput. The provided codebase for this article includes an option to embed the book_description locally to allow for a comparison of the compute pressure. This snippet creates an ingest pipeline named text-embedding containing an inference processor. The processor uses the sentence-transformers__msmarco-minilm-l-12-v3 model to copy and convert the book_description text to a vector embedding and stores it under the description_embedding property (see the sketch below). Indexing vectors The method by which vectors are indexed has a significant impact on the performance and accuracy of search results. Indexing vectors entails storing them in specialized data structures designed to ensure efficient similarity search, speedy vector distance calculations, and ultimately vector retrieval as results. How you decide to store your vectors should be based on your unique data needs. It should also be noted that Elasticsearch uses the term index as a verb as well, meaning to add a document to an index. Care should be taken not to confuse the two. There are two general methods of indexing documents: KNN and ANN. It is important to make the distinction between the two and the tradeoffs when selecting one or the other. 
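As referenced above, here is a minimal sketch of what the text-embedding ingest pipeline from the previous section might look like when created with the Elasticsearch Python client. The exact processor options shown (the field_map target of text_field, the target_field, and the failure-books fallback index) are illustrative assumptions; the companion repository's definition may differ.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch('http://localhost:9200')  # placeholder connection

client.ingest.put_pipeline(
    id='text-embedding',
    description='Embed book descriptions with the MiniLM model at ingest time',
    processors=[
        {
            'inference': {
                # Trained model deployed to the Elasticsearch ML nodes.
                'model_id': 'sentence-transformers__msmarco-minilm-l-12-v3',
                # Map our document field to the input field the model expects (assumed name).
                'field_map': {'book_description': 'text_field'},
                # Where the resulting embedding is written on the document.
                'target_field': 'description_embedding',
            }
        }
    ],
    on_failure=[
        # Route documents that fail embedding to a separate index for later review.
        {'set': {'field': '_index', 'value': 'failure-books'}},
        {'set': {'field': 'ingest_failure', 'value': '{{ _ingest.on_failure_message }}'}},
    ],
)
```

Any document indexed with pipeline='text-embedding' then picks up its embedding server side, with no extra client code.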
KNN Given k=4, four of the nearest vectors to the query vector (yellow) are selected and returned. KNN (K-Nearest Neighbors) will provide an exact result of the K closest neighbor vectors based on a provided distance metric. As with most things that return exact results, the tradeoff is speed, which must be sacrificed for accuracy. The KNN method computes the distance from the query point to every other existing point in the vector space. The distances are then sorted and the closest K are returned. In the diagram above, a k of 4 is requested and the four nearest vectors are returned. ANN ANN (Approximate Nearest Neighbors) will provide an approximation of the nearest neighbors based on a precomputed index structure built over the vectors for easier and faster processing. The tradeoff is a sped-up seeking phase where traversal is aided by an index, which could be thought of as a predefined map of all the vectors. ANN is preferred over KNN in semantic search when speed, scalability, and resource efficiency are considered a higher priority than exact precision. This makes ANN a practical choice for large-scale and real-time applications where fast, approximate matches are acceptable. Much like a probabilistic data structure, a slight hit to accuracy is accepted for the sake of speed and size. As an example, think of vector points as items stored on shelves in a grocery store, where you are given a map when you enter the facility. A premade map allows you to find where you want to go (ANN), rather than traversing every single aisle from entrance to exit until you finally reach your intended point (KNN). HNSW HNSW (Hierarchical Navigable Small World) is the default ANN method in the Elastic vector database. It utilizes graph-based relationships (nodes and edges) to efficiently traverse an index to find the nearest neighbors. In this case, the nodes are data points and the edges are connections to nearby neighbors. HNSW consists of many graph layers, with the distances between connected nodes becoming smaller and smaller in the lower layers. With HNSW, each subsequent traversal to a lower layer exposes closer nodes, or vectors, until the desired vector(s) are found. This is similar to a traditional skip list data structure, where the higher layers have elements that are farther apart from each other, or “coarse” granularity, whereas the lower layers have elements that are closer to each other, or “finer” granularity. The traversal from a higher to a lower layer of nodes and edges means that not all vectors need to be searched and compared, only the nearest neighbors. This ensures that high-dimensional vectors can be indexed and searched quickly and efficiently. HNSW is ideally used for real-time semantic search and recommendation systems that have high-dimensional vectors yet can accept approximate results over exact nearest neighbors. HNSW can be considered extraneous with smaller data sets, lower-dimensional vectors, or when memory size constraints are a factor. Due to the complexity of the graph structure, HNSW may not be a good fit for a highly dynamic data store where inserts, updates, and deletions occur at high frequency. This is due to the overhead of maintaining the graph connections throughout the many layers required. Like all facets of implementing a vector database, a balance must be struck between available resources, latency tolerance, and desired performance. 
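The repository in this article leaves the description_embedding mapping to be created by the ingest pipeline, but if you want explicit control over the HNSW behavior described above, the dense_vector field type lets you spell it out. The sketch below is illustrative only: the m and ef_construction values are assumptions rather than tuned recommendations, dims matches the 384-dimension MiniLM model used here, and book_title is a hypothetical field name.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch('http://localhost:9200')  # placeholder connection

client.indices.create(
    index='books',
    mappings={
        'properties': {
            'book_title': {'type': 'text'},        # hypothetical metadata field
            'book_description': {'type': 'text'},  # source text for the embedding
            'description_embedding': {
                'type': 'dense_vector',
                'dims': 384,                # output size of the chosen embedding model
                'index': True,
                'similarity': 'cosine',     # Elasticsearch's default vector distance metric
                'index_options': {
                    'type': 'hnsw',         # graph-based ANN structure described above
                    'm': 16,                # connections per node in each layer (assumed)
                    'ef_construction': 100  # candidate list size while building the graph (assumed)
                },
            },
        }
    },
)
```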
Running the code: Creating an index Elasticsearch provides an indices.create() method documented here which creates an index based on a given index name and a mapping of the expected data types within the documents indexed. This allows for faster, efficient indexing and retrieval of documents based on numerical ranges, full text, and keyword searches as well as semantic search. Note that the description_embedding property is not included - that will be created automatically by the ingest pipeline defined above when inserting the book objects. Now that the index books has been created, we can populate Elasticsearch with book objects to use for semantic search. Let’s start by storing a single book object in Elasticsearch. Here we are running the bulk insertion method which greatly reduces the time necessary to index our starting library of 10,000+ books. The bulk method is recommended when indexing numerous objects. More information can be found here . Note that in both the single indexing and bulk indexing methods we are including the pipeline=’text-embedding’ argument to have Elasticsearch trigger our inference processor defined above every time a new book object is added. Below is the post-index document of a sample book that has been indexed into the books vector database. There are now two new fields: model_id : the embedding model that was used to create our vectors. description_embedding : the vector embedding created from our book_description field. It has been truncated in this article for space, as there are 384 total float values in the array (the number of dimensions of our chosen model). Vector search algorithms With a vector database full of embedded vectors, semantic search and vector querying may now take place. Each indexing method and embedding model excels when paired with particular algorithms to return optimized results. We will cover the most commonly used methods and discuss their specific applications. Cosine Similarity : Cosine Similarity, the default algorithm for Elasticsearch's vector search, measures the cosine of the angle between two vectors in a space. This distance metric is ideal for semantic search and recommendation systems. ANN indexes such as HNSW are optimized to be used with cosine similarity specifically. The smaller the angle, the more similar the vectors are, and thus the nearer the neighbor. The larger the angle, the less related they are, with 90°, or perpendicularity, being unrelated altogether. Any value over 90° is considered treading into the “opposite” of what a given vector contains. Above are three pairs of vectors that point in different directions. These directions illustrate how the pairs can be similar, dissimilar, or completely opposite from each other. One caveat of Cosine Similarity is the curse of dimensionality. This happens when the distances between vector pairs begin to converge toward an average value as the space the vectors occupy becomes vaster and vaster. 
The distances between vectors grow the more features, or data points, exist. This occurs with very high-dimensional vectors - care should be taken to evaluate different distance metrics to meet your needs. Dot Product: Dot Product takes two vectors as input and sums the products of their corresponding components. Dot\\ Product = A_1 \\cdot B_1 + A_2 \\cdot B_2 + \\dots + A_n \\cdot B_n As an example, if vector A contains the components [1,3,5] and vector B contains the components [4,9,1], the resulting computation would be as follows: A = [1,3,5], B = [4,9,1]: (1 \\cdot 4) + (3 \\cdot 9) + (5 \\cdot 1) = 36 A higher sum represents a closer similarity between the given vectors. If the value is zero or near zero, the vectors are perpendicular to each other and are considered unrelated. A negative value means that the vectors are opposite each other. Euclidean (L2) : Euclidean Distance can be imagined as an n-dimensional extension of the Pythagorean Theorem (a² + b² = c²) , which is used to find the hypotenuse of a right triangle. For each component in a vector A, we determine the distance from its corresponding component in vector B. This can be achieved by taking the difference between one component's value and the other. We then square the difference and add it to the next squared component difference, until the distance between every pair of corresponding components in the two vectors has been determined, squared, and summed into one final value. We then take the square root of that value to reach our Euclidean Distance between the two vectors A and B. Euclidean\\ Distance = \\sqrt{(A_1 - B_1)^2 + (A_2 - B_2)^2 + \\dots + (A_n - B_n)^2} As an example, given two vectors A = [3, 4, 5] and B = [6, 7, 8], our computation would be as follows: \\sqrt{(3 - 6)^2 + (4 - 7)^2 + (5 - 8)^2} = \\sqrt{9 + 9 + 9} = \\sqrt{27} \\approx 5.196 Between the two vectors A and B, the distance is approximately 5.196. Euclidean, much like Cosine, suffers from the curse of dimensionality, with most vector distances becoming homogeneously similar at higher dimensionalities. For this reason Euclidean is recommended for lower-dimensionality vectors. Manhattan (L1): Manhattan distance, similar to Euclidean distance, sums the distances between the corresponding components of two vectors. Instead of finding the exact direct distance between two points as a straight line, Manhattan can be thought of as using a grid system, much like the block layout in the city of Manhattan, New York. As an example, if a person walks 3 blocks north, then 4 blocks east to reach their destination, they will have traveled a total distance of 7 blocks. 
This can be generalized as: Manhattan\\ Distance = |A_1 - B_1| + |A_2 - B_2| + \\dots + |A_n - B_n| In our walking example, we can establish our origin as [0,0] and our destination as [3,4]. Therefore this computation would apply: |0 - 3| + |0 - 4| = 3 + 4 = 7 Unlike Euclidean and Cosine, Manhattan scales well to higher-dimensional vectors and is a great candidate for feature-rich vectors. Changing similarity algorithms To set a specific similarity algorithm for a vector field type in Elasticsearch, use the similarity field in the index mappings object. Elasticsearch allows you to define the similarity as dot_product or l2_norm (Euclidean). With no similarity field definition, Elasticsearch defaults to cosine . Here we are choosing l2_norm as our similarity metric for our description_embedding field: Putting it all together: Utilizing a vector database Now that we have a fundamental understanding of how to create a vector and the methods to compare vector similarity, we need to understand the sequence of events required to successfully utilize our vector database. We shall assume that we have a database full of vector embeddings representing data. All of the vectors have been created using the same model. We receive raw query data. This query text is embedded using the same model we used previously. This gives us a resulting query vector that has the same dimensions and features as the vectors existing in our database. We run a similarity algorithm between our query vector and the index of vectors to find the vectors with the highest degree of similarity based on our chosen distance metric and indexing method. We receive our results based on their similarity score. Each vector returned should also carry the original unembedded data as well as any information pertinent to the dataset's subject matter. In the code sample below, we execute the search command with a knn argument that specifies which field to compare ( description_embedding ), the original query string, and which model to use to embed the query. The search method converts the query to a vector and runs the similarity algorithm. As a response, we receive a payload back from the Elastic cloud containing an array of book objects that have been sorted by similarity score, with 0 being the least relevant and 1 being a perfect match. Here is a truncated version of the response: Conclusion Operating a vector database within Elasticsearch opens up new possibilities for efficiently managing and querying complex datasets, far beyond what traditional full-text search methods like BM25 or TF/IDF can offer. By selecting and testing vector embedding models and similarity algorithms against your specific use case, you can enable sophisticated semantic search functionality that understands the nuances of your data, be it text, images, or other multimedia. This is critical in applications that require precise and context-aware search results, such as recommendation systems, natural language processing, and image recognition. In the process of building a vector database around the collection of book objects in our repository, hopefully you can see the utility of searching through the individual book descriptions with natural human language. 
In effect, this gives us the equivalent of a librarian or book store clerk for our data. By bringing contextual understanding to your query input and matching it against documents that have already been processed and vectorized, semantic search shows its power in exactly this kind of use case. RAG (Retrieval Augmented Generation) is the process of using a transformer model, such as ChatGPT, that has been granted access to your curated documents to generate natural language answers to natural language queries. This provides an enhanced user experience and can handle complex queries. Conversely, thought should also be given before and after implementing a vector database as to whether or not semantic search is necessary for your specific use case. Well-crafted queries in a traditional full-text query ecosystem may return the same or better results with lower computational overhead. Care must be given to evaluate the complexity and anticipated scale of your data before opting for a vector database, as smaller or simpler datasets might not benefit markedly from the addition of vector embeddings. Oftentimes fine-tuning indexing strategies and implementing ranking models within a traditional search framework can provide more efficient performance without the need for machine learning enhancements. As vector databases and the supporting technologies continue to evolve, staying informed about the latest developments, such as generative AI integrations and further tuning techniques like tokenization and quantization, will be crucial. These advancements will not only enhance the performance and scalability of your vector database but also ensure that it remains adaptable to the growing demands of modern applications. With the right tools and knowledge, you can fully harness the power of Elasticsearch's vector capabilities to deliver cutting-edge solutions to your daily tasks. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Running the code to create your own vector database Current Elastic full-text search ( BM25 & TF/IDF ) TF/IDF Space Whales Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Navigating an Elastic vector database - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-vector-database-practical-example", + "meta_description": "A tutorial on building, managing and operating a Elastic vector database with practical code samples. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search This series explains how to use the Elasticsearch Go client for traditional keyword search, vector search and hybrid search. Part1 Go How To October 31, 2023 Perform text queries with the Elasticsearch Go client Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example. CR LS By: Carly Richmond and Laurent Saint-Félix Part2 Vector Database How To November 1, 2023 Perform vector search in Elasticsearch with the Elasticsearch Go client Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example. CR LS By: Carly Richmond and Laurent Saint-Félix Part3 Vector Database How To November 2, 2023 Using hybrid search for gopher hunting with Elasticsearch and Go Learn how to achieve hybrid search by combining keyword and vector search using Elasticsearch and the Elasticsearch Go client. CR LS By: Carly Richmond and Laurent Saint-Félix Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using the Elasticsearch Go client for keyword search, vector search & hybrid search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/elasticsearch-go-client-for-keyword-vector-and-hybrid-search", + "meta_description": "This series explains how to use the Elasticsearch Go client for traditional keyword search, vector search and hybrid search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using Cohere embeddings with Elastic-built search experiences Elasticsearch now supports Cohere embeddings! This blog explains how to use Cohere embeddings with Elastic-built search experiences. 
Integrations Vector Database How To SC JB DK By: Serena Chou , Jonathan Buttner and Dave Kyle On April 11, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch open inference API adds support for Cohere Embeddings We're pleased to announce that Elasticsearch now supports Cohere embeddings! Releasing this capability has been a great journey of collaboration with the Cohere team, with more to come. Cohere is an exciting innovator in the generative AI space and we're proud to enable developers to use Cohere's incredible text embeddings with Elasticsearch as the vector database, to build semantic search use cases. This blog goes over Elastic's approach to shipping and explains how to use Cohere embeddings with Elastic-built search experiences. Elastic's approach to shipping: frequent, production ready iterations Before we dive in, if you're new to Elastic (welcome!), we've always believed in investing in our technology of choice (Apache Lucene) and ensuring contributions can be used as production grade capabilities, in the fastest release mode we can provide. Let's dig into what we've built so far, and what we will be able to deliver soon: In August 2023 we discussed our contribution to Lucene to enable maximum inner product and enable Cohere embeddings to be first class citizens of the Elastic Stack. This was contributed first into Lucene and released in the Elasticsearch 8.11 version . In that same release we also introduced the tech preview of our /_inference API endpoint which supported embeddings from models managed in Elasticsearch, but quickly in the following release, we established a pattern of integration with third party model providers such as Hugging Face and OpenAI . Cohere embeddings support is already available to customers participating in the preview of our stateless offering on Elastic Cloud and soon will be available in an upcoming Elasticsearch release for all. You'll need a Cohere account, and some working knowledge of the Cohere Embed endpoint . You have a choice of available models, but if you're just trying this out for the first time we recommend using the model embed-english-v3.0 or if you're looking for a multilingual variant try embed-multilingual-v3.0 with dimension size 1024. In Kibana , you'll have access to a console for you to input these next steps in Elasticsearch even without an IDE set up. When you choose to run this command in the console you should see a corresponding 200 for the creation of your named Cohere inference service. In this configuration we've specified that the embedding_type is byte which will be the equivalent to asking Cohere to return signed int8 embeddings. This is only a valid configuration if you're choosing to use a v3 model. You'll want to set up the mappings in the index to prepare for the storage of your embeddings that you will soon retrieve from Cohere. Elasticsearch vector database for Cohere embeddings In the definition of the mapping you will find an excellent example of another contribution made by the Elastic team to Lucene, the ability to use Scalar Quantization . Just for fun, we've posted the command you would see in our Getting Started experience that ingests a simple book catalog. 
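The blog itself issues these requests from the Kibana console; as a rough Python-client equivalent (an assumption, not the article's own code), the mapping for the byte-sized Cohere embeddings described above could look like the following sketch. The index and field names are hypothetical, dims follows the 1024-dimension embed-english-v3.0 model, element_type matches the int8 (byte) embedding_type requested from Cohere, and the dot_product similarity is an assumed choice.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch('http://localhost:9200')  # placeholder connection

client.indices.create(
    index='cohere-embeddings',           # hypothetical index name
    mappings={
        'properties': {
            'name': {'type': 'text'},    # illustrative content field
            'name_embedding': {
                'type': 'dense_vector',
                'dims': 1024,             # embed-english-v3.0 output size
                'element_type': 'byte',   # stores the int8 embeddings returned by Cohere
                'index': True,
                'similarity': 'dot_product',  # assumed; choose to match your embeddings
            },
        }
    },
)
```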
At this point you have your books content in an Elasticsearch index and now you need to enable Cohere to generate embeddings on the documents! To accomplish this step, you'll be setting up an ingest pipeline which utilizes our inference processor to make the call to the inference service you defined in the first PUT request. If you weren't ingesting something as simple as this books catalog, you might be wondering how you'd handle token limits for the selected model. If you needed to, you could quickly amend your created ingest pipeline to chunk large documents , or use additional transformation tools to handle your chunking prior to first ingest. If you're looking for additional tools to help figure out your chunking strategy, look no further than these notebooks in Search Labs . Fun fact, in the near future, this step will be made completely optional for Elasticsearch developers. As was mentioned at the beginning of this blog, this integration we're showing you today is a firm foundation for many more changes to come. One of which will be a drastic simplification of this step, where you won't have to worry about chunking at all, nor the construction and design of an ingest pipeline. Elastic will handle those steps for you with great defaults! You're set up with your destination index, and the ingest pipeline, now it's time to reindex to force the documents through the step. Elastic kNN search for Cohere vector embeddings Now you're ready to issue your first vector search with Cohere embeddings. It's as easy as that. If you have already achieved a good level of understanding of vector search, we highly recommend you read this blog on running kNN as a query - which unlocks expert mode! This integration with Cohere is offered in Serverless and in Elasticsearch 8.13 . Happy Searching, and big thanks again to the Cohere team for their collaboration on this project! Looking to use more of Cohere's capabilities?: Read about our support for Cohere's Rerank 3 model Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Jump to Elasticsearch open inference API adds support for Cohere Embeddings Elastic's approach to shipping: frequent, production ready iterations Elasticsearch vector database for Cohere embeddings Elastic kNN search for Cohere vector embeddings Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using Cohere embeddings with Elastic-built search experiences - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-cohere-embeddings-support", + "meta_description": "Elasticsearch now supports Cohere embeddings! This blog explains how to use Cohere embeddings with Elastic-built search experiences." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog / Series Jira connector tutorials Learn how to integrate Elasticsearch with Jira using Elastic’s Jira native connector and explore optimization techniques. Part1 Integrations Ingestion +1 January 15, 2025 Elastic Jira connector tutorial part I Reviewing a use case for the Elastic Jira connector. We'll be indexing our Jira content into Elasticsearch to create a unified data source and do search with Document Level Security. GL By: Gustavo Llermaly Part2 Integrations Ingestion +1 January 16, 2025 Elastic Jira connector tutorial part II: Optimization tips After connecting Jira to Elasticsearch, we'll now review best practices to escalate this deployment. GL By: Gustavo Llermaly Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Jira connector tutorials - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/series/jira-connector-tutorials", + "meta_description": "Learn how to integrate Elasticsearch with Jira using Elastic’s Jira native connector and explore optimization techniques." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models. Generative AI GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey On July 13, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In a previous blog post , we discussed common approaches to information retrieval and introduced the concepts of models and training stages. Here, we will examine benchmark solutions to compare various methods in a fair manner. Note that the task of benchmarking is not straightforward and can lead to misperceptions about how models perform in real-world scenarios. Historically, comparisons between BM25 and learned retrieval models have been based on limited data sets, or even only on the training data set of these dense models: MSMARCO, which may not provide an accurate representation of the models' performance on your data. Despite this approach being useful for demonstrating how well a dense model performs against BM25 in a specific domain, it does not capture one of BM25's key strengths: its ability to perform well in many domains without the need for supervised fine-tuning. Therefore, it may be considered unfair to compare these two methods using such a specific data set. The BEIR paper (\" BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models ,\" by Takhur et al. 2021) offers to address the issue of evaluating information retrieval methods in a generic setting. The paper proposes a framework using 18 publicly available data sets from a diverse range of topics to benchmark state-of-the-art retrieval systems. In this post, we use a subcollection of those data sets to benchmark BM25 against two dense models that have been specifically trained for retrieval. Then we will illustrate the potential gain achievable using fine-tuning strategies with one of those dense models. We plan to return to this benchmark in our next blog post, since it forms the basis of the testing we have done to enhance Elasticsearch relevance using language models in a zero-shot setting. BEIR data sets Performance can vary greatly between retrieval methods, depending on the type of query, document size, or topic. In order to assess the diversity of data sets and to identify potential blind spots in our benchmarks, a classification algorithm trained to recognize natural questions was used to understand queries typology. The results are summarized in Table 1. In our benchmarks, we choose not to include MSMARCO to solely emphasize performance in unfamiliar settings. Evaluating a model in a setting that is different from its training data is valuable when the nature of your use case data is unknown or resource constraints prevent adapting the model specifically. Search relevance metrics Selecting the appropriate metric is crucial in evaluating a model's ranking ability accurately. 
Of the various metrics available, three are commonly utilized for search relevance: Mean Reciprocal Rank (MRR) is the most straightforward metric. While it is easy to calculate, it only considers the first relevant item in the results list and ignores the possibility that a single query could have multiple relevant documents. In some instances, MRR may suffice, but it is often not precise enough. Mean Average Precision (MAP) excels in ranking lists and works well for binary relevance ratings (a document is either relevant or non-relevant). However, in data sets with fine-grained ratings, MAP is not able to distinguish between a highly relevant document and a moderately relevant document. Also, it is only appropriate if the list is reordered since it is not sensitive to order; a search engineer will prefer that the relevant documents appear first. Normalized Discounted Cumulative Gain (NDCG) is the most complete metric as it can handle multiple relevant documents and fine-grained document ratings. This is the metric we will examine in this blog and future ones. All of these metrics are applied to a fixed-sized list of retrieved documents. The list size can vary depending on the task at hand. For example, a preliminary retrieval before a reranking task might consider the top 1000 retrieved documents, while a single-stage retrieval might use a smaller list size to mimic a user's search engine behavior. We have chosen to fix the list size to the top 10 documents, which aligns with our use case. BM25 and dense models out-of-domain In our previous blog post , we noted that dense models, due to their training design, are optimized for specific data sets. While they have been shown to perform well on this particular data set, in this section we explore if they maintain their performance when used out-of-domain. To do this, we compare the performance of two state-of-the-art dense retrievers ( msmarco-distilbert-base-tas-b and msmarco-roberta-base-ance-fristp ) with BM25 in Elasticsearch using the default settings and English analyzer. Those two dense models both outperform BM25 on MSMARCO (as seen in the BEIR paper ), as they are trained specifically on this data set. However, they are usually worse out-of-domain. In other words, if a model is not well adapted to your specific data, it’s very likely that using kNN and dense models would degrade your retrieval performance in comparison to BM25. Fine-tuning dense models The portrayal of dense models in the previous description isn't the full picture. Their performance can be improved by fine-tuning them for a specific use case with some labeled data that represents that use case. If you have a fine-tuned embedding model, the Elastic Stack is a great platform to both run the inference for you and retrieve similar documents using ANN search. There are various methods for fine-tuning a dense model, some of which are highly sophisticated. However, this blog post won't delve into those methods as it's not the focus. Instead, two methods were tested to gauge the potential improvement that can be achieved with not a lot of domain specific training data. The first method (FineTuned A) involved using labeled positive documents and randomly selecting documents from the corpus as negatives. The second method (FineTuned B) involved using labeled positive documents and using BM25 to identify documents that are similar to the query from BM25's perspective, but aren't labeled as positive. 
These are referred to as \"hard negatives.\" Labeling data is probably the most challenging aspect in fine-tuning. Depending on the subject and field, manually tagging positive documents can be expensive and complex. Incomplete labeling can also create problems for hard negatives mining, causing adverse effects on fine-tuning. Finally, changes to the topic or semantic structure in a database over time will reduce retrieval accuracy for fine-tuned models. Conclusion We have established a foundation for information retrieval using 13 data sets. The BM25 model performs well in a zero-shot setting and even the most advanced dense models struggle to compete on every data set. These initial benchmarks indicate that current SOTA dense retrieval cannot be used effectively without proper in-domain training. The process of adapting the model requires labeling work, which may not be feasible for users with limited resources. In our next blog, we will discuss alternative approaches for efficient retrieval systems that do not require the creation of a labeled data set. These solutions will be based on hybrid retrieval methods. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to BEIR data sets Search relevance metrics BM25 and dense models out-of-domain Fine-tuning dense models Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. 
Elasticsearch B.V. All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval", + "meta_description": "In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog LangChain4j with Elasticsearch as the embedding store LangChain4j (LangChain for Java) has Elasticsearch as an embedding store. Discover how to use it to build your RAG application in plain Java. Integrations Java How To DP By: David Pilato On October 8, 2024 Part of Series Introducing LangChain4j: Building RAG apps in plain Java Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In the previous post , we discovered what LangChain4j is and how to: Have a discussion with LLMs by implementing a ChatLanguageModel and a ChatMemory Retain chat history in memory to recall the context of a previous discussion with an LLM This blog post is covering how to: Create vector embeddings from text examples Store vector embeddings in the Elasticsearch embedding store Search for similar vectors Create embeddings To create embeddings, we need to define an EmbeddingModel to use. For example, we can use the same mistral model we used in the previous post . It was running with ollama: A model is able to generate vectors from text. Here we can check the number of dimensions generated by the model: To generate vectors from a text, we can use: Or if we also want to provide Metadata to allow us filtering on things like text, price, release date or whatever, we can use Metadata.from() . For example, we are adding here the game name as a metadata field: If you'd like to run this code, please checkout the Step5EmbedddingsTest.java class. Add Elasticsearch to store our vectors LangChain4j provides an in-memory embedding store. This is useful to run simple tests: But obviously, this could not work with much bigger dataset because this datastore stores everything in memory and we don't have infinite memory on our servers. So, we could instead store our embeddings into Elasticsearch which is by definition \"elastic\" and can scale up and out with your data. For that, let's add Elasticsearch to our project: As you noticed, we also added the Elasticsearch TestContainers module to the project, so we can start an Elasticsearch instance from our tests: To use Elasticsearch as an embedding store, you \"just\" have to switch from the LangChain4j in-memory datastore to the Elasticsearch datastore: This will store your vectors in Elasticsearch in a default index. You can also change the index name to something more meaningful: If you'd like to run this code, please checkout the Step6ElasticsearchEmbedddingsTest.java class. Search for similar vectors To search for similar vectors, we first need to transform our question into a vector representation using the same model we used previously. We already did that, so it's not hard to do this again. 
Note that we don't need the metadata in this case: We can build a search request with this representation of our question and ask the embedding store to find the first top vectors: We can iterate over the results now and print some information, like the game name which is coming from the metadata and the score: As we could expect, this gives us \"Out Run\" as the first hit: If you'd like to run this code, please checkout the Step7SearchForVectorsTest.java class. Behind the scene The default configuration for the Elasticsearch Embedding store is using the approximate kNN query behind the scene. But this could be changed by providing another configuration ( ElasticsearchConfigurationScript ) than the default one ( ElasticsearchConfigurationKnn ) to the Embedding store: The ElasticsearchConfigurationScript implementation runs behind the scene a script_score query using a cosineSimilarity function . Basically, when calling: This now calls: In which case the result does not change in term of \"order\" but just the score is adjusted because the cosineSimilarity call does not use any approximation but compute the cosine for each of the matching vectors: If you'd like to run this code, please checkout the Step7SearchForVectorsTest.java class. Conclusion We have covered how easily you can generate embeddings from your text and how you can store and search for the closest neighbours in Elasticsearch using 2 different approaches: Using the approximate and fast knn query with the default ElasticsearchConfigurationKnn option Using the exact but slower script_score query with the ElasticsearchConfigurationScript option The next step will be about building a full RAG application, based on what we learned here. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Create embeddings Add Elasticsearch to store our vectors Search for similar vectors Behind the scene Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "LangChain4j with Elasticsearch as the embedding store - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/langchain4j-elasticsearch-embedding-store", + "meta_description": "LangChain4j (LangChain for Java) has Elasticsearch as an embedding store. Learn how to use LangChain4j & Elasticsearch to build a RAG app in Java." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Understanding fused multiply-add (FMA) within vector similarity computations in Lucene Learn how to use fused multiply-add (FMA) within vector similarity computations in Lucene and discover how FMA can improve performance. Lucene Vector Database CH By: Chris Hegarty On November 20, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Lucene 9.7.0 we added support that leverages SIMD instructions to perform data-parallelization of vector similarity computations. Now we’re pushing this even further with the use of Fused Multiply-Add (FMA). What is fused multiply-add (FMA) Multiply and add is a common operation that computes the product of two numbers and adds that product with a third number. These types of operations are performed over and over during vector similarity computations. a ∗ b + c a * b + c a ∗ b + c Fused multiply-add (FMA) is a single operation that performs both the multiply and add operations in one - the multiplication and addition are said to be “fused” together. FMA is typically faster than a separate multiplication and addition because most CPUs model it as a single instruction. FMA also produces more accurate results. Separate multiply and add operations on floating-point numbers have two rounds; one for the multiplication, and one for the addition, since they are separate instructions that need to produce separate results. That is effectively, r o u n d ( r o u n d ( a ∗ b ) + c ) round(round(a * b) + c) ro u n d ( ro u n d ( a ∗ b ) + c ) Whereas FMA has a single rounding, which applies only to the combined result of the multiplication and addition. That is effectively, r o u n d ( a ∗ b + c ) round(a * b + c) ro u n d ( a ∗ b + c ) Within the FMA instruction the a * b produces an infinite precision intermediate result that is added with c , before the final result is rounded. This eliminates a single round, when compared to separate multiply and add operations, which results in more accuracy. FMA in Lucene: Under the hood So what has actually changed? In Lucene we have replaced the separate multiply and add operations with a single FMA operation. The scalar variants now use Math::fma , while the Panama vectorized variants use FloatVector::fma . If we look at the disassembly we can see the effect that this change has had. Previously we saw this kind of code pattern for the Panama vectorized implementation of dot product. 
The vmovdqu32 instruction loads 512 bits of packed doubleword values from a memory location into the zmm0 register. The vmulps instruction then multiplies the values in zmm0 with the corresponding packed values from a memory location, and stores the result in zmm0 . Finally, the vaddps instruction adds the 16 packed single precision floating-point values in zmm0 with the corresponding values in zmm4 , and stores the result in zmm4 . With the change to use FloatVector::fma , we see the following pattern: Again, the first instruction is similar to the previous example, where it loads 512 bits of packed doubleword values from a memory location into the zmm0 register. The vfmadd231ps (this is the FMA instruction), multiplies the values in zmm0 with the corresponding packed values from a memory location, adds that intermediate result to the values in zmm4 , performs rounding and stores the resulting 16 packed single precision floating-point values in zmm4 . The vfmadd231ps instruction is doing quite a lot! It’s a clear signal of intent to the CPU about the nature of the computations that the code is running. Given this, the CPU can make smarter decisions about how this is done, which typically results in improved performance (and accuracy as previously described). Performance improvements with FMA In general, the use of FMA typically results in improved performance. But as always you need to benchmark! Thankfully, Lucene deals with quite a bit of complexity when determining whether to use FMA or not, so you don’t have to. Things like, whether the CPU even has support for FMA, if FMA is enabled in the Java Virtual Machine, and only enabling FMA on architectures that have proven to be faster than separate multiply and add operations. As you can probably tell, this heuristic is not perfect, but goes a long way to making the out-of-the-box experience good. While accuracy is improved with FMA, we see no negative effect on pre-existing similarity computations when FMA is not enabled. Along with the use of FMA, the suite of vector similarity functions got some (more) love. All of dot product, square, and cosine distance, both the scalar and Panama vectorized variants have been updated. Optimizations have been applied based on the inspection of disassembly and empirical experiments, which have brought improvements that help fill the pipeline keeping the CPU busy; mostly through more consistent and targeted loop unrolling, as well as removal of data dependencies within loops. It’s not straightforward to put concrete performance improvement numbers on this change, since the effect spans multiple similarity functions and variants, but we see positive throughput improvements, from single digit percentages in floating-point dot product, to higher double digit percentage improvements in cosine. The byte based similarity functions also show similar throughput improvements. Wrapping up In Lucene 9.7.0, we added the ability to enable an alternative faster implementation of the low-level primitive operations used by Vector Search through SIMD instructions. In the upcoming Lucene 9.9.0 we built upon this to leverage faster FMA instructions, as well as to apply optimization techniques more consistently across all the similarity functions. Previous versions of Elasticsearch are already benefiting from SIMD, and the upcoming Elasticsearch 8.12.0 will have the FMA improvements. 
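To make the single-rounding versus double-rounding distinction above concrete, here is a small Python sketch (an illustration, not Lucene's Java code) that emulates a fused multiply-add by doing the arithmetic exactly with rationals and rounding once, then compares it with the separate multiply-then-add path. The sample values are arbitrary.

```python
from fractions import Fraction

def fma_emulated(a: float, b: float, c: float) -> float:
    # Compute a*b + c exactly using rationals, then round once to a float,
    # which is effectively what a hardware FMA instruction does.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

def separate_mul_add(a: float, b: float, c: float) -> float:
    # Two roundings: one after the multiply, one after the add.
    return a * b + c

a, b, c = 0.1, 0.3, -0.03          # arbitrary sample values
print(separate_mul_add(a, b, c))   # double rounding
print(fma_emulated(a, b, c))       # single rounding; may differ in the last bits
```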
Finally, I'd like to call out Lucene PMC member Robert Muir for continuing to make improvements in this area, and for the enjoyable and productive collaboration. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What is fused multiply-add (FMA) FMA in Lucene: Under the hood Performance improvements with FMA Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Understanding fused multiply-add (FMA) within vector similarity computations in Lucene - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vector-similarity-computations-fma-style", + "meta_description": "Learn how to use fused multiply-add (FMA) within vector similarity computations in Lucene and discover how FMA can improve performance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elasticsearch Relevance Engine (ESRE) — Advanced search for the AI revolution Explore the Elasticsearch Relevance Engine (ESRE) by Elastic. ESRE powers gen AI solutions for private data sets with a vector database and ML models for semantic search. ML Research MR By: Matt Riley On June 21, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We’ve introduced the Elasticsearch Relevance Engine (ESRE) , new capabilities for creating highly relevant AI search applications. 
ESRE builds on Elastic’s leadership in search and over two years of machine learning research and development. The Elasticsearch Relevance Engine combines the best of AI with Elastic’s text search. ESRE gives developers a full suite of sophisticated retrieval algorithms and the ability to integrate with large language models (LLMs). Even better, it’s accessible via a simple, unified API that Elastic’s community already trusts, so developers around the world can start using it immediately to elevate search relevance. The Elasticsearch Relevance Engine’s configurable capabilities can be used to help improve relevance by: Applying advanced relevance ranking features including BM25f, a critical component of hybrid search Creating, storing, and searching dense embeddings using Elastic’s vector database Processing text using a wide range of natural language processing (NLP) tasks and models Letting developers manage and use their own transformer models in Elastic for business specific context Integrating with third-party transformer models such as OpenAI’s GPT-3 and 4 via API to retrieve intuitive summarization of content based on the customer’s data stores consolidated within Elasticsearch deployments Enabling ML-powered search without training or maintaining a model using Elastic’s out-of-the-box Learned Sparse Encoder model to deliver highly relevant, semantic search across a variety of domains Easily combining sparse and dense retrieval using Reciprocal Rank Fusion (RRF), a hybrid ranking method that gives developers control to optimize their AI search engine to their unique mix of natural language and keyword query types Integrating with third-party tooling such as LangChain to help build sophisticated data pipelines and generative AI applications The evolution of search has been driven by a constant need to improve relevance and the ways in which we interact with search applications. Highly relevant search results can lead to increased user engagement on search apps with significant downstream impacts on both revenue and productivity. In the new world of LLMs and generative AI, search can go even further — understanding user intent to provide a level of specificity in responses that’s never been seen before. Notably, every search advancement delivers better relevance while addressing new challenges posed by emerging technologies and changing user behaviors. Whether expanding on keyword search to offer semantic search or enabling new search modalities for video and images, new technology requires unique tools to deliver better experiences for search users. By the same token, today’s world of artificial intelligence calls for a new, highly scalable developer toolkit that’s been built on a tech stack with proven, customer-tested capabilities. With generative AI’s momentum and increased adoption of technologies like ChatGPT, as well as growing awareness of large language model capabilities, developers are hungry to experiment with technology to improve their applications. The Elasticsearch Relevance Engine ushers in a new age of capabilities in the world of generative AI and meets the day with powerful tools that any developer team can use right away. The Elasticsearch Relevance Engine is available on Elastic Cloud — the only hosted Elasticsearch offering to include all of the new features in this latest release. You can also download the Elastic Stack and our cloud orchestration products, Elastic Cloud Enterprise and Elastic Cloud for Kubernetes, for a self-managed experience. 
ChatGPT and Elasticsearch Elastic Learned Sparse Encoder blog Accessing machine learning models in Elastic Privacy-first AI search using LangChain and Elasticsearch Overcoming the limitations of generative AI models The Elasticsearch Relevance Engine is well positioned to help developers evolve quickly and address these challenges of natural language search, including generative AI. Enterprise data/context aware: The model might not have sufficient internal knowledge relevant to a particular domain. This stems from the data set that the model is trained on. In order to tailor the data and content that LLMs generate, enterprises need a way to feed models proprietary data so they can learn to furnish more relevant, business-specific information. Superior relevance: The Elasticsearch Relevance Engine makes integrating data from private sources as simple as generating and storing vector embeddings to retrieve context using semantic search. Vector embeddings are numerical representations of words, phrases, or documents that help LLMs understand the meanings of words and their relationships. These embeddings enhance transformer model output at speed and scale. ESRE also lets developers bring their own transformer models into Elastic or integrate with third-party models. We also realized that the emergence of late interaction models allows us to provide this out of the box — without the need for extensive training or fine tuning on third-party data sets. Since not every development team has the resources nor expertise to train and maintain machine learning models nor understand the trade-offs to scale, performance, and speed, the Elasticsearch Relevance Engine also includes Elastic Learned Sparse Encoder, a retrieval model built for semantic search across diverse domains. The model pairs sparse embeddings with traditional, keyword-based BM25 search for an easy to use Reciprocal Rank Fusion (RRF) scorer for hybrid search. ESRE gives developers machine learning-powered relevance and hybrid search techniques on day one. Privacy and security: Data privacy is central to how enterprises use and securely pass proprietary data over a network and between components, even when building innovative search experiences. Elastic includes native support for role-based and attribute-based access control to ensure that only those roles with access to data can see it, even for chat and question answering applications. Elasticsearch can support your organization’s need to keep certain documents accessible to privileged individuals, helping your organization to maintain universal privacy and access controls across all of your search applications. When privacy is of the utmost concern, keeping all data within your organization’s network can be not only paramount, but obligatory. From allowing your organization to implement deployments that are in an air-gapped environment to supporting access to secure networks, ESRE provides the tools you need to help your organization keep your data safe. Size and cost: Using large language models can be prohibitive for many enterprises due to data volumes and required computing power and memory. Yet businesses that want to build their own generative AI apps like chatbots need to marry LLMs with their private data. The Elasticsearch Relevance Engine gives enterprises the engine to deliver relevance efficiently with precision context windows that help reduce the data footprint without hassle and expense. 
Out of date: The model is frozen in time at the point when training data is collected. So the content and data that generative AI models create is only as fresh as data they’re trained on. Integrating corporate data is an inherent need to power timely results from LLMs. Hallucinations: When answering questions or conversing with the model, it may invent facts that sound trustworthy and convincing, but are in-fact projections that aren’t factual. This is another reason that grounding LLMs with contextual, customized knowledge is so critical to making models useful in a business context. The Elasticsearch Relevance Engine lets developers link to their own data stores via a context window in generative AI models. The search results added can provide up-to-date information that’s from a private source or specialized domain, and therefore can return more factual information when prompted instead of relying solely on a model’s so-called \"parametric\" knowledge. Supercharged by a vector database The Elasticsearch Relevance Engine includes a resilient, production grade vector database by design. It gives developers a foundation on which to build rich, semantic search applications. Using Elastic’s platform, development teams can use dense vector retrieval to create more intuitive question-answering that’s not constrained to keywords nor synonyms. They can build multimodal search using unstructured data like images, and even model user profiles and create matches to personalize search results in product and discovery, job search, or matchmaking applications. These NLP transformer models also enable machine learning tasks like sentiment analysis, named entity recognition, and text classification. Elastic’s vector database lets developers create, store, and query embeddings that are highly scalable and performant for real production applications. Elasticsearch excels at high-relevance search retrieval. With ESRE, Elasticsearch provides context windows for generative AI linked to an enterprise’s proprietary data, allowing developers to build engaging, more accurate search experiences. Search results are returned according to a user’s original query, and developers can pass that data on to the language model of their choice to provide an answer with added context. Elastic supercharges question-answer and personalization capabilities with relevant contextual data from your enterprise content store that’s private and tailored to your business. Delivering superior relevance out-of-the-box for all developers With the release of the Elasticsearch Relevance Engine, we’re making Elastic’s proprietary retrieval model readily available. The model is easy to download and works with our entire catalog of ingestion mechanisms like the Elastic web crawler , connectors or API. Developers can use it out of the box with their searchable corpus, and it’s small enough to fit within a laptop’s memory. Elastic’s Learned Sparse Encoder provides semantic search across domains for search use cases such as knowledge bases, academic journals, legal discovery, and patent databases to deliver highly relevant search results without the need to adapt or train it. Most real-world testing shows hybrid ranking techniques are producing the most relevant search result sets. Until now, we've been missing a key component — RRF. We're now including RRF for your application searching needs so you can pair vector and textual search capabilities. 
Machine learning is on the leading edge of enhancing search result relevance with semantic context, but too often its cost, complexity, and resource demands make it insurmountable for developers to implement it effectively. Developers commonly need the support of specialized machine learning or data science teams to build highly relevant AI-powered search. These teams spend considerable time selecting the right models, training them on domain-specific data sets, and maintaining models as they evolve due to changes in data and its relationships. Learn how Go1 uses Elastic’s vector database for scalable, semantic search . Developers who don’t have the support of specialized teams can implement semantic search and benefit from AI-powered search relevance from the start without the effort and expertise required for alternatives. Starting today, all customers have the building blocks to help achieve better relevance and modern, smarter search. Try it out Read about these capabilities and more . Existing Elastic Cloud customers can access many of these features directly from the Elastic Cloud console . Not taking advantage of Elastic on cloud? See how to use Elasticsearch with LLMs and generative AI . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch, Elasticsearch Relevance Engine, ESRE, Elastic Learned Sparse Encoder and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Overcoming the limitations of generative AI models Supercharged by a vector database Delivering superior relevance out-of-the-box for all developers Try it out Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing Elasticsearch Relevance Engine (ESRE) — Advanced search for the AI revolution - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/introducing-elasticsearch-relevance-engine-esre", + "meta_description": "Explore the Elasticsearch Relevance Engine (ESRE) by Elastic. ESRE powers gen AI solutions for private data sets with a vector database and ML models for semantic search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog MAXSCORE & block-max MAXSCORE: More skipping with block-max MAXSCORE Learn about MAXSCORE, block-max MAXSCORE & WAND. Improve the MAXSCORE algorithm to evaluate disjunctive queries more like a conjunctive query. Lucene AG By: Adrien Grand On December 6, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to WAND and MAXSCORE How do you quickly identify the top-k matches of a disjunctive query using an inverted index? This is the problem that the WAND 1 and MAXSCORE 2 algorithms try to solve. These two algorithms are based on the same idea: taking advantage of the maximum impact score (the maximum contribution of a particular term to the overall score) for each term, so as to skip hits whose score cannot possibly compare greater than the score of the current k-th top hit (referred to as \"minimum competitive score\" below). For instance, if you are searching for hits that contain the or fox , and the has a maximum contribution of 0.2 to the score while the minimum competitive score is 0.5, then there is no point in evaluating hits that only contain the anymore, as they have no chance of making it to the top hits. However, WAND and MAXSCORE come with different performance characteristics: WAND typically evaluates fewer hits than MAXSCORE but with a higher per-hit overhead. This makes MAXSCORE generally perform better on high k values or with many terms - when skipping hits is hard, and WAND perform better otherwise. Block-max MAXSCORE & MAXSCORE in Lucene While Lucene first implemented a variant of WAND called block-max WAND , it later got attracted to the lower overhead of MAXSCORE and started using block-max MAXSCORE for top-level disjunctions in July 2022 (annotation EN in Lucene's nightly benchmarks). The MAXSCORE algorithm is rather simple: it sorts terms by increasing maximum impact score and partitions them into two groups: essential terms and non-essential terms. Non-essential terms are terms with low maximum impact scores whose sum of maximum scores is less than the minimum competitive score. Essential terms are all other terms. 
Essential terms are used to find candidate matches while non-essential terms are only used to compute the score of a candidate. MAXSCORE usage example Let's take an example: you are searching for the quick fox , and the maximum impact scores of each term are respectively 0.2 for the , 0.5 for quick and 1 for fox . As you start collecting hits, the minimum competitive score is 0, so all terms are essential and no terms are non-essential. Then at some point, the minimum competitive score reaches e.g. 0.3, meaning that a hit that only contains the has no chance of making it to the top-k hits. the moves from the set of essential terms to the set of non-essential terms, and the query effectively runs as (the) +(quick fox) . The + sign here is used to express that a query clause is required, such as in Lucene's classic query parser . Said another way, from that point on, the query will only match hits that contain quick or fox and will only use the to compute the final score. The below table summarizes cases that MAXSCORE considers: Minimum competitive score interval Query runs as [0, 0.2] +(the quick fox) (0.2, 0.7] (the) +(quick fox) (0.7, 1.7] (the quick) +(fox) (1.7, +Infty) No more matches The last case happens when the minimum competitive score is greater than the sum of all maximum impact scores across all terms. It typically never happens with regular MAXSCORE, but may happen on some blocks with block-max MAXSCORE. Improving MAXSCORE to intersect terms Something that WAND does better than MAXSCORE is to progressively evaluate queries less and less as a disjunction and more and more as a conjunction as the minimum competitive score increases, which yields more skipping. This raised the question of whether MAXSCORE can be improved to also intersect terms? The answer is yes: for instance if the minimum competitive score is 1.3, then a hit cannot be competitive if it doesn't match both quick and fox . So we modified our block-max MAXSCORE implementation to consider the following cases instead: Minimum competitive score interval Query runs as [0, 0.2] +(the quick fox) (0.2, 0.7] (the) +(quick fox) (0.7, 1.2] (the quick) +(fox) (1.2, 1.5] (the) +quick +fox (1.5, 1.7] +the +quick +fox (1.7, +Infty) No more matches Now the interesting question is whether these new cases we added are likely to occur or not? The answer depends on how good your score upper bounds are, your actual k value, whether terms actually have matches in common, etc., but it seems to kick in especially often in practice on queries that either have two terms, or that combine two high-scoring terms and zero or more low-scoring terms (e.g. stop words), such as the query we looked at in the above example. This is expected to cover a sizable number of queries in many query logs. Implementing this optimization yielded a noticeable improvement on Lucene's nightly benchmarks (annotation FU), see OrHighHigh (11% speedup) and OrHighMed (6% speedup). It was released in Lucene 9.9 and should be included in Elasticsearch 8.12. We hope you'll enjoy the speedups! Footnotes Broder, A. Z., Carmel, D., Herscovici, M., Soffer, A., & Zien, J. (2003, November). Efficient query evaluation using a two-level retrieval process. In Proceedings of the twelfth international conference on Information and knowledge management (pp. 426-434). ↩ Turtle, H., & Flood, J. (1995). Query evaluation: strategies and optimizations. Information Processing & Management, 31(6), 831-850. 
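Stepping back to the partitioning step described above, here is a small Python sketch of how terms can be split into non-essential and essential sets given their maximum impact scores and the current minimum competitive score. It is an illustrative reading of the algorithm rather than Lucene's implementation, and it reuses the the/quick/fox example and impact scores from the article.

```python
def partition_terms(max_scores: dict[str, float], min_competitive: float):
    """Split terms into non-essential and essential sets, MAXSCORE-style.

    Terms are taken in order of increasing maximum impact score; a term is
    non-essential as long as the running sum of maximum scores stays below
    the minimum competitive score, and all remaining terms are essential.
    """
    non_essential, essential = [], []
    running_sum = 0.0
    for term, score in sorted(max_scores.items(), key=lambda kv: kv[1]):
        if running_sum + score < min_competitive:
            non_essential.append(term)
            running_sum += score
        else:
            essential.append(term)
    return non_essential, essential

scores = {"the": 0.2, "quick": 0.5, "fox": 1.0}
print(partition_terms(scores, 0.0))  # ([], ['the', 'quick', 'fox'])  -> +(the quick fox)
print(partition_terms(scores, 0.3))  # (['the'], ['quick', 'fox'])    -> (the) +(quick fox)
print(partition_terms(scores, 0.8))  # (['the', 'quick'], ['fox'])    -> (the quick) +(fox)
```

As the minimum competitive score grows, more terms become non-essential and are only used for scoring, which is what produces the progressively narrower query forms shown in the tables above.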
↩ Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Introduction to WAND and MAXSCORE Block-max MAXSCORE & MAXSCORE in Lucene MAXSCORE usage example Improving MAXSCORE to intersect terms Footnotes Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "MAXSCORE & block-max MAXSCORE: More skipping with block-max MAXSCORE - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/more-skipping-with-bm-maxscore", + "meta_description": "Learn about MAXSCORE, block-max MAXSCORE & WAND. Improve the MAXSCORE algorithm to evaluate disjunctive queries more like a conjunctive query." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data from AWS S3 into Elastic Cloud - Part 2 : Elastic Agent Learn about different options to ingest data from AWS S3 into Elastic Cloud. This blog covers how to ingest data from AWS S3 using Elastic Agent. Ingestion How To HL By: Hemendra Singh Lodhi On October 10, 2024 Part of Series How to ingest data from AWS S3 into Elastic Cloud Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. This is the second installment in a multi-part blog series exploring different options for ingesting data from AWS S3 into Elastic Cloud. 
Check out the other parts of the series: Part 1: Elastic Serverless Forwarder Part 3: Elastic S3 Connector In this blog we will learn how to ingest data from AWS S3 using Elastic Agent. Note 1: See the comparison of the different options in Part 1: Elastic Serverless Forwarder Note 2: An Elastic Cloud deployment is a prerequisite to follow along with the steps described below. Elastic Cloud Check Part 1: Elastic Serverless Forwarder of the blog series on how to get started with Elastic Cloud. Skip this if you already have an active deployment. Elastic Agent for AWS S3 data ingestion Another option to ingest data from AWS S3 is using Elastic Agent . Elastic Agent is a single, unified way to ingest data such as logs and metrics. Elastic Agent is installed on an instance such as EC2 and, using integrations, can connect to AWS services such as S3 and forward the data to Elasticsearch. At a high level, Elastic Agent works as follows: A policy is created, which is like a manifest file and consists of instructions for the agent. Integrations are added to the policy; these are essentially modules consisting of assets such as configs, mappings, dashboards, etc. Agents are installed with the required policy. The agent performs ingestion based on the integrations. Features of Elastic Agent Ships both logs & metrics Supports data transfer over AWS PrivateLink Supports all integrations, and agents can be managed using Fleet (included by default with Elastic Cloud) Agents need to be installed and maintained, and there is no autoscaling. Using Fleet can simplify agent maintenance. Good performance out of the box, and performance parameters can be configured using performance presets . A preset can be chosen depending on the data type and ingestion requirements. More about Fleet server scalability here Cost consists of the EC2 instance for the agent installation and SQS notifications Data flow High-level data flow for Elastic Agent based data ingestion: VPC flow logs are configured to write to the S3 bucket Once a log is written to the S3 bucket, an S3 event notification is sent to SQS The Elastic Agent polls the SQS queue for new messages. Based on the metadata in the message, it reads the log data from the S3 bucket and sends it to Elasticsearch SQS is recommended for performance so that the agent reads only the new or updated objects in the S3 bucket instead of polling the entire bucket each time Set up For steps (1)-(2), follow the details from Part 1: Elastic Serverless Forwarder : 1. Create an S3 bucket to store VPC flow logs 2. Enable VPC flow logs and send them to the S3 bucket created above 3. Create an SQS queue with default settings Note: Create the SQS queue in the same region as the S3 bucket Provide the queue name sqs-vpc-flow-logs-elastic-agent and keep the other settings as default: Update the SQS access policy (Advanced) to allow the S3 bucket to send notifications to the SQS queue; an indicative policy document is sketched a little further below. Replace account-id with your AWS account ID. Keep other options as default. Here, we are allowing S3 to send messages to the SQS queue (ARN) from the S3 bucket: Note the SQS URL in the queue settings under Details: 4. Enable VPC flow log event notification in the S3 bucket Go to the S3 bucket s3-vpc-flow-logs-elastic -> Properties and create an event notification Provide a name and the event type that should trigger SQS. We have selected object creation, i.e., when any object is added to the bucket: Select SQS queue as the destination and choose sqs-vpc-flow-logs-elastic-agent : Once saved, the configuration will look like below: Confirm VPC flow logs are published in the S3 bucket: Confirm the S3 event notification is sent to the SQS queue: 
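As referenced above, here is an indicative example of the SQS access policy that allows the S3 bucket to send event notifications to the queue. The policy shape follows AWS's standard S3-to-SQS notification permissions and is not copied from the article; the account ID, region, and resource names are placeholders to adapt to your setup.

```python
import json

# Indicative SQS access policy allowing the S3 bucket to send event
# notifications to the queue. Account ID and region are placeholders;
# the queue and bucket names match the ones used in this walkthrough.
queue_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "s3.amazonaws.com"},
            "Action": "SQS:SendMessage",
            "Resource": "arn:aws:sqs:us-east-1:account-id:sqs-vpc-flow-logs-elastic-agent",
            "Condition": {
                "ArnLike": {"aws:SourceArn": "arn:aws:s3:::s3-vpc-flow-logs-elastic"},
                "StringEquals": {"aws:SourceAccount": "account-id"},
            },
        }
    ],
}

# Paste the rendered JSON into the queue's Access policy editor in the AWS console.
print(json.dumps(queue_policy, indent=2))
```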
Install Elastic Agent on EC2 instance Launch an EC2 instance To get the installation commands, Go to: Kibana -> Fleet -> Add Agent Create new agent policy aws-vpc-flow-logs-s3-policy and click Create Policy. Once policy is created, copy the instruction to install Elastic Agent . Leave other settings as default: Login to EC2 instance and run the commands: Upon successful completion, status will be updated on fleet page: Update policy aws-vpc-flow-logs-s3-policy with aws integration. This will push aws integration configuration to the agent which is subscribed to this policy. More on how fleet and agent work together is here . Kibana -> Fleet -> Agent policies. Select the policy aws-vpc-flow-logs-s3-policy and click Add integration. This will take you to the integration page search for AWS integration. Choosing AWS integration is better if you want monitor more than 1 AWS service: Provide AWS Access Key ID and Secret Access Key for authentication and allow Elastic Agent to read from AWS services. There are other authentication options available. Details here . Namespace option is used to segregate the data based on environment or any other identifier: Toggle off other services and use Collect VPC flow logs from S3 . Update S3 bucket and SQS queue URL copied earlier. Leave advance settings as default: Scroll down and click Existing hosts option as we have already intalled the agent and select the policy aws-vpc-flow-logs-s3-policy . Save and continue. This will push the configured integration to Elastic Agent: Go to Kibana -> Fleet -> Agent policies and policy aws-vpc-flow-logs-s3-policy is updated with AWS integration. After a couple of minutes, you can validate flow logs are ingested from S3 into Elastic. Go to Kibana -> Discover: 6. Monitor VPC flow logs in Kibana dashboards Integrations comes with assets such as dashboard which are pre-built for common use cases. Go to Kibana -> Dashboard and search for VPC Flow logs: More dashboards! As promised, here are few dashboards that can help monitor AWS services used in our setup using the Elastic agent ingestion method. This will help in tracking usage and help in optimisation. We will use the same setup used in the Elastic Agent data ingestion option to configure settings and populate dashboards. Go to Kibana -> Fleet -> aws-vpc-flow-logs-s3-policy . Select AWS integration and toggle on the required service and fill in the details. Some of the interesting Dashboards: Note: All dashboards are available under Kibana->Analytics->Dashboards [Metrics AWS] Lambda overview If you have implemented ingestion using Elastic Serverless Forwarder, then you can use this dashboard to track AWS Lambda metrics. It mainly shows Lambda function duration, errors, and any function throttling: [Metrics AWS] S3 overview This dashboard outlines S3 usage and helps in monitoring bucket size, number of objects, etc. This can help in optimisation of S3 usage by tracking stale buckets and objects: [Logs AWS] S3 server access log overview This dashboard shows S3 server access logging and provides detailed records for the requests that are made to a bucket. This can be useful in security and access audits and can also help in learning how users access your S3 buckets and objects: [Metrics AWS] Usage overview This dashboard shows the general usage of AWS services and highlights API usage against AWS services. 
This can help in understanding the service usage and potential optimisation: [Metrics AWS] Billing overview This dashboard shows the billing usage by service and helps monitor how many $$ are spent for the services: [Metrics AWS] SQS overview This dashboard shows SQS queues utilisation showing messages sent, received and any delay in sending messages. This is important in monitoring the SQS queues for any issues as it is an important component in the architecture. Any issues with SQS can potentially cause delay in data ingestion: [Metrics AWS] EC2 overview If you are using the Elastic agent ingestion method, then you can monitor the utilisation of the EC2 instance for CPU, memory, disk, etc. hosting the Elastic agent, which can be helpful in sizing the instance if there is a high traffic load. This can also be used for your other EC2 instances: [Elastic Agent] S3 input metrics This dashboard shows the detailed utilisation of Elastic agent showing how Elastic agent is processing S3 inputs and monitoring interaction with SQS and S3. The dashboard shows aggregated metrics of the Elastic agent on reading SQS messages and S3 objects and forwarding them to Elasticsearch. Together with the [Metrics AWS] EC2 Overview dashboard, this can help in understanding the utilisation of EC2 and Elastic agent and can potentially helps in scaling these components: Conclusion Elasticsearch provides multiple options to sync data from AWS S3 into Elasticsearch deployments. In this walkthrough, we have demonstrated that it is relatively easy to implement Elastic Agent ingestion options and leverage Elastic's industry-leading search capabilities. In Part 3 of this series , we'll dive into using Elastic S3 Native Connector as another option for ingesting AWS S3 data. Don't forget to check out Part 1 : Elastic Serverless Forwarder of the series and Part 3: Elastic S3 Connector . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Elastic Cloud Elastic Agent for AWS S3 data ingestion Features of Elastic Agent Data flow Set up Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to ingest data from AWS S3 into Elastic Cloud - Part 2 : Elastic Agent - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ingest-aws-s3-data-elastic-cloud-elastic-agent", + "meta_description": "Learn how to ingest data from AWS S3 into Elastic Cloud using Elastic Agent. A tutorial with setup steps and dashboards for monitoring." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Hybrid search with multiple embeddings: A fun and furry search for cats! A walkthrough of how to implement different types of search - lexical, vector and hybrid - on multiple embeddings (text and image). It uses a simple and playful search application on cats. Vector Database How To JL By: Jo Ann de Leon On October 31, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Did you know that Elastic can be used as a powerful vector database? In this blog, we’ll explore how to generate, store, and query vector embeddings alongside traditional lexical search. Elastic’s strength lies in its flexibility and scalability, making it an excellent choice for modern search use cases. By integrating vector embeddings with Elastic, you can improve search relevance, and enhance search capabilities across various data types—including non-textual documents like images. But it gets even better! Learning Elastic’s search features can be fun too. In this article, we’ll show you how to search for your favorite cats using Elastic to search both text descriptions and images of cats. Through a simple Python app that accompanies this article, you’ll learn how to implement both vector and keyword-based searches. We’ll guide you through generating your own vector embeddings, storing them in Elastic and running hybrid queries - all while searching for adorable feline friends. Whether you're an experienced developer or new to Elasticsearch, this fun project is a great way to understand how modern search technologies work. Plus, if you love cats, you'll find it even more engaging. So let’s dive in and set up the Elasticats app while exploring Elasticsearch’s powerful capabilities. Before we begin, let’s make sure that you have your Elastic cloud ID and API key ready. Make a copy of the .env-template file , save it as .env and plug in your Elastic cloud credentials. Application architecture Here’s a high-level diagram that depicts our application architecture: Generating and storing vector embeddings Before we can perform any type of search, we first need to have data. Our data.json contains the list of cat documents that we will index in Elasticsearch. Each document describes a cat and has the following mappings: Each cat’s photo property points to the location of the cat’s image. When we call the reindex function in our application, it will generate two embeddings: 1. First is a vector embedding for each cat’s image. We used the clip-ViT-B-32 model. 
Image models allow you to embed images and text into the same vector space. This allows you to implement image search either as text-to-image or image-to-image search. 2. The second embedding is for the summary text about each cat that is up for adoption. We used a different model which is all-MiniLM-L6-v2. We then store the embeddings as part of our documents. We’re now ready to call the reindex function. From the terminal, run the following command: We can now run our web application: Our initial form looks like this: As you can see, we have exposed some of the keywords as filters (e.g. age, gender, size, etc.) that we will use as part of our queries. Executing different types of searches The following workflow diagram shows the different search paths available in our web application. We’ll walk through each scenario. Lexical search The simplest scenario is a “match all” query which basically returns all cats in our index. We don’t use any of the filters nor enter a description or upload an image. If any of the filters were supplied in the form, then we perform a boolean query. In this scenario, no description is entered so we’re applying the filters in our “match all” query. Vector search In our web form, we are able to upload a similar image of a cat(s). By uploading an image, we can do a vector search by transforming the uploaded image into an embedding and then performing a knn search on the image embeddings that were previously stored. First, we save the uploaded image in an uploads folder. We then create a knn query for the image embedding. Notice that the vector search can be performed with or without the filters (from the boolean query). Also, note that k=5 which means that we’re only returning the top 5 similar documents (cats). Try any of these images stored in the images/ folder: Abyssinian Dahlia - 72245105_3.jpg American shorthair Uni - 64635658_2.jpg Sugarplum - 72157682_4.jpeg Persian Sugar - 72528240_2.jpeg Hybrid search The most complex scenario in our application is when some text is entered into the description field. Here, we perform 3 different types of search and combine them into a hybrid search. First, we perform a lexical “match” query on the actual text input. We also create 2 knn queries: Using the model for the text embedding, we generate an embedding for the text input and perform a knn search on the summary embedding. Using the model for the image embedding, we generate another embedding for the text input and perform a knn search on the image embedding. I mentioned earlier that image models allow you to do not just an image-to-image search as we’ve seen in the vector search scenario above, but it also allows you to do a text-to-image search. This means that if I type “black cats” in the description, it will search for images that may contain or resemble black cats! We then utilize the Reciprocal Rank Fusion (RRF) retriever to effectively combine and rank the results from all three queries into a single cohesive result set. RRF is a method designed to merge multiple result sets, each with potentially different relevance indicators, into one unified set. Unlike simply joining the result arrays, RRF applies a specific formula to rank documents based on their positions in the individual result sets. This approach ensures that documents appearing in multiple queries are given higher importance, leading to improved relevance and quality of the final results. 
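As a rough sketch of the image-search path described earlier in this walkthrough (the app's actual code is not included in this text), the following shows how an uploaded image could be embedded with the same clip-ViT-B-32 model and used in a k=5 kNN query against the stored image embeddings. The index name, the file path, and the name field are placeholder assumptions; img_embedding matches the field name used in the ranking table below.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
clip_model = SentenceTransformer("clip-ViT-B-32")  # same image model as the article

# Embed the uploaded image; clip-ViT-B-32 places images and text in the same
# vector space, which is also what enables text-to-image search.
uploaded = Image.open("uploads/query_cat.jpg")     # hypothetical upload path
image_vector = clip_model.encode(uploaded).tolist()

# Top-5 approximate kNN search on the stored image embeddings.
response = es.search(
    index="cats",  # placeholder index name
    knn={
        "field": "img_embedding",
        "query_vector": image_vector,
        "k": 5,
        "num_candidates": 50,
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("name"))  # "name" field is an assumption
```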
By using RRF, we avoid the complexities of manually tuning weights for each query and achieve a balanced integration of diverse search strategies. To further illustrate, the following is a table showing the ranking of the individual result sets when we search for “sisters”. Using the RRF formula (with the default ranking constant k=60), we can then derive the final score for each document. Sorting the final scores in descending order then gives us the final ranking of the documents. “Willow & Nova” is our top hit (cat)! Cat (document) Lexical ranking knn (on img_embedding) ranking knn (on summary_embedding) ranking Final Score Final Ranking Sugarplum 1 3 0.0322664585 2 Willow & Nova 2 1 1 0.0489159175 1 Zoe & Zara 2 0.01612903226 4 Sage 3 2 0.03200204813 3 Primrose 4 0.015625 5 Dahlia 5 0.01538461538 7 Luke & Leia 4 0.015625 6 Sugar & Garth 5 0.01538461538 8 Here are some other tests you can use for the description: “sisters” vs “siblings” “tuxedo” “black cats” with “American shorthair” breed filter “white” Conclusion Besides the obvious — **cats!** — Elasticats is a fantastic way to get to know Elasticsearch. It’s a fun and practical project that lets you explore search technologies while reminding us of the joy that technology can bring. As you dive deeper, you’ll also discover how Elasticsearch’s ability to handle vector embeddings can unlock new levels of search functionality. Whether it’s for cats, images, or other data types, Elastic makes search both powerful and enjoyable! Feel free to contribute to the project or fork the repository to customize it further. Happy searching, and may you find the cat of your dreams! 😸 Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Application architecture Generating and storing vector embeddings Executing different types of searches Lexical search Vector search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
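To make the RRF computation above concrete, here is a small Python sketch (not the app's code) that recombines per-query rankings with the default ranking constant k=60. The ranks for Willow & Nova and Sugarplum come from the table above; assigning Sugarplum's second rank to the image-embedding query is an assumption, although it does not change the resulting score.

```python
def rrf_scores(rankings: list[dict[str, int]], k: int = 60) -> dict[str, float]:
    """Combine several ranked result sets with Reciprocal Rank Fusion.

    Each ranking maps a document to its 1-based position in one result set;
    a document absent from a result set contributes nothing for that set.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for doc, rank in ranking.items():
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return scores

lexical = {"Sugarplum": 1, "Willow & Nova": 2}
knn_image = {"Willow & Nova": 1, "Sugarplum": 3}
knn_summary = {"Willow & Nova": 1}

combined = rrf_scores([lexical, knn_image, knn_summary])
for doc, score in sorted(combined.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{doc}: {score:.10f}")
# Willow & Nova: 0.0489159175 and Sugarplum: 0.0322664585, matching the table above.
```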
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Hybrid search with multiple embeddings: A fun and furry search for cats! - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/hybrid-search-multiple-embeddings", + "meta_description": "A walkthrough of how to implement lexical, vector and hybrid search on multiple embeddings (text & image), including how to generate vector embeddings." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data from AWS S3 into Elastic Cloud - Part 3: Elastic S3 Connector Learn about different options to ingest data from AWS S3 into Elastic Cloud. This time we will focus on Elastic S3 Connector. Ingestion Integrations How To HL By: Hemendra Singh Lodhi On November 5, 2024 Part of Series How to ingest data from AWS S3 into Elastic Cloud Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. This is the third installment in a multi-part blog series exploring different options for ingesting data from AWS S3 into Elastic Cloud. Check out the other parts of the series: Part 1: Elastic Serverless Forwarder Part 2: Elastic Agent In this blog, we will learn about how to ingest data from AWS S3 using the Elastic S3 Native connector . Elastic Native connectors are available directly within your Elastic Cloud environment. Customers have the option to use self-managed connector clients that provide the highest degree of customization options and flexibility. Note 1: See the comparison between the options in Part 1 : Elastic Serverless Forwarder. Note 2: An Elastic Cloud deployment is a prerequisite to follow along the steps described below. Elastic Cloud Check the Part 1 of the blog series on how to get started with Elastic Cloud. Elastic S3 Native Connector This option for ingesting S3 data is quite different from the earlier ones in terms of use case. This time we will use Elastic S3 Native Connector which is available in Elastic Cloud. Connectors sync data from data sources and create searchable, read only replicas of the data source. They ingest the data and transform them into Elasticsearch documents. The Elastic S3 Native Connector is a good option to ingest data suitable for content search. For example, you can sync your company's private data (such as internal knowledge data and other files) in S3 buckets and perform a text-based search or can perform vector/semantic search through the use of large language models (LLMs). S3 connectors may not be a suitable option for ingesting Observability data such as logs & metrics as its main use case is ingesting content. Features Native connectors are available by default in Elastic Cloud and customers can use self-managed connectors too if they need further customization. Currently an Enterprise Search node (at least 1) must be configured in your cluster to use connectors. 
Basic & advanced sync rules are available for data filtering at source, such as specific bucket prefix. Synced data is always stored in content tier which is used for search related use cases. The connector offers default and custom options for data filtering, extracting and transforming content . Connector connection is a public egress (outbound) from Elastic Cloud and using Elastic Traffic Filter has no impact as Elastic Traffic Filter (Private Link) connection is is a one-way private egress from AWS. This means that data transfer will be over the public network (HTTPS) and Connector connection is independent of the traffic filter use. Connector scaling depends on the volume of data ingested from the source. The Enterprise Search node sizing depends on Elasticsearch sizing as well, and it is recommended to reach out to Elastic for large-scale data ingestion. Generally, a 2GB–4GB RAM size for Enterprise Search is sufficient for light to medium use cases. Cost will be for object storage in S3 only. There is no data transfer cost from S3 buckets to Elasticsearch when they are within the same AWS Region. There will be some data transfer cost for cross-region data sync i.e S3 bucket and Elasticsearch deployment are in different region. More on AWS Data transfer pricing here . Data flow of Elastic S3 Connector The Elastic S3 Connector syncs data between the S3 bucket and Elasticsearch, as per the below high level flow: Elastic S3 Connector is configured with S3 bucket information and credentials with the necessary permissions to connect to the bucket and sync data. Based on the sync rules , the connector will pull the data from the specified bucket(s). Ingest pipelines perform data parsing and filtering before indexing. When you create a connector, ent-search-generic-ingestion pipeline is available by default which performs most of the common data processing tasks. Custom ingest pipelines can be defined too for transforming data as needed. Connection is over public (HTTPS) network to AWS S3. Note 1: Content larger than 10MB will not be synced. If you are using self-managed connectors, you can use the self-managed local extraction service to handle larger binary files. Note 2: Original permissions at the S3-bucket level are not synced and all indexed documents are visible to the user with Elastic deployment access. Customers can manage documents permissions at Elastic using Role based access control , Document level security & Field level security More information on Connector architecture is available here . Set up Elastic S3 Native Connector 1. Create the S3 bucket, here named elastic-s3-native-connector : AWS Console -> S3 -> Create bucket. You may leave other settings as default or change as per the requirements. This bucket will store data to be synced to Elastic. The connector supports a variety of file types . For the purpose of ingestion testing we will upload a few pdf and text files. 2. Login to Kibana and navigate to Search-> Content -> Connectors. Search for S3 connector. Provide a name for connector, we have given aws-s3-data-connector : The Connector will show a warning message if there is no Enterprise Search node detected, similar to the below: Login to Elastic Cloud Console and edit your deployment. Under Enterprise Search, select the node size and zones and save: You can provide a different index name or the same as connector name. 
We are using the same index name: Provide AWS credentials and bucket details for elastic-s3-native-connector : When you update the configuration, the connector can display a validation error if there is some delay in updating AWS credentials and bucket names. You can provide the required information and ignore the error banner. This is expected, as connectors communicate asynchronously with Kibana, so for any configuration update there is some delay in communication between the connector and Kibana. The error will go away once the sync starts or a refresh occurs after some time: 3. Once the configuration is done successfully, click on the \"Sync\" button to perform the initial full content sync. For a recurring sync, configure the sync frequency under the Scheduling tab. It is disabled by default, so you'll need to toggle the \"Enable\" button to enable it. Once scheduling is done, the connector will run at the configured time and pull all the content from the S3 bucket. The Elastic Native Connector can only sync files of 10MB or less. Any files larger than 10MB will be ignored and will not be synced. You either have to chunk the files accordingly or use a self-managed connector to customize the behavior: Search after AWS S3 data ingestion Once the data is ingested, you can validate directly from the connector under the Documents tab: Also, Elasticsearch provides the Search applications feature, which enables users to build search-powered applications. You can create search applications based on your Elasticsearch indices, build queries using search templates, and easily preview your results directly in the Kibana Search UI. Enhance the search experience of AWS S3 ingested data with Playground Elastic provides Playground functionality to implement Retrieval Augmented Generation (RAG)-based question answering with an LLM to enhance the search experience on the ingested data. In our case, once the data is ingested from S3, you can configure Playground and use its chat interface, which takes your questions, retrieves the most relevant results from your Elasticsearch documents, and passes those documents to the LLM to generate tailored responses. Check out this great blog post from Han Xiang Choong showcasing the Playground feature using S3 data ingestion. Conclusion In this blog series, we have seen three different options Elasticsearch provides to sync and ingest data from AWS S3 into Elasticsearch deployments. Depending on the use case and requirements, customers can choose the best option for them and ingest data via Elastic Serverless Forwarder , Elastic Agent or the S3 Connector. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Elastic Cloud Elastic S3 Native Connector Features Data flow of Elastic S3 Connector Set up Elastic S3 Native Connector Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to ingest data from AWS S3 into Elastic Cloud - Part 3: Elastic S3 Connector - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ingest-aws-elastic-s3-connector", + "meta_description": "Learn how to sync and ingest data from AWS S3 into Elasticsearch deployments using the Elastic S3 Connector. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Vector similarity techniques and scoring Explore vector similarity techniques and scoring in Elasticsearch, including L1 & L2 distance, cosine similarity, dot product similarity and max inner product similarity. Vector Database VC By: Valentin Crettaz On May 13, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. When the need for searching free text arises and Ctrl+F / Cmd+F don't cut it anymore, using a lexical search engine is usually the next logical choice that comes to mind. Lexical search engines excel at analyzing and tokenizing the text to be searched into terms that can be matched at search time, but they usually fall short when it comes to understanding and making sense of the true meaning of the text being indexed and searched. That's exactly where vector search engines shine. They can index the same text in such a way that it can be searched based on both the meaning it represents and its relationships with other concepts having similar or related meaning. In this blog, we will briefly touch upon how vectors are a great mathematical concept for conveying the meaning of text. We'll then dive deeper into the different similarity techniques supported by Elasticsearch when it comes to searching for neighboring vectors, i.e., searching for vectors carrying a similar meaning, and how to score them. What are vector embeddings? This article doesn't delve deeply into the intricacies of vector embeddings. If you're looking to explore this topic further or need a primer before continuing, we recommend checking out the following guide . 
In a nutshell, vector embeddings are obtained through a machine learning process (e.g. deep learning neural networks) that transforms any kind of unstructured input data (e.g., raw text, image, video, sound, etc.) into numerical data that carries their meaning and relationships. Different flavors of unstructured data require different kinds of machine learning models that have been trained to \"understand\" each type of data. Each vector locates a specific piece of data as a point in a multidimensional space and that location represents a set of features the model uses to characterize the data. The number of dimensions depends on the machine learning model, but they usually range from a couple hundred to a few thousand. For instance, OpenAI Embeddings models boasts 1536 dimensions, while Cohere Embeddings models can range from 382 to 4096 dimensions. The Elasticsearch dense_vector field type supports up to 4096 dimensions as of the latest release. The true feat of vector embeddings is that data points that share similar meaning are close together in the space. Another interesting aspect is that vector embeddings also help capture relationships between data points. How do we compare vectors? Knowing that unstructured data is sliced and diced by machine learning models into vector embeddings that capture the similarity of the data along a high number of dimensions, we now need to understand how the matching of those vectors works. It turns out that the answer is pretty simple. Vector embeddings that are close to one another represent semantically similar pieces of data. So, when we query a vector database, the search input (image, text, etc.) is first turned into a vector embeddings using the same machine learning model that has been used for indexing all the unstructured data, and the ultimate goal is to find the nearest neighboring vectors to that query vector. Hence, all we need to do is figure out how to measure the \"distance\" or \"similarity\" between the query vector and all the existing vectors indexed in the database - it's that simple. Distance, similarity and scoring Luckily for us, measuring the distance or similarity between two vectors is an easy problem to solve thanks to vector arithmetics. So, let’s look at the most popular distance and similarity functions that are supported by Elasticsearch. Warning, math ahead! Just before we dive in, let's have a quick look at scoring. Factually, Lucene only allows scores to be positive. All the distance and similarity functions that we will introduce shortly yield a measure of how close or similar two vectors are, but those raw figures are rarely fit to be used as score since they can be negative. For this reason, the final score needs to be derived from the distance or similarity value in a way that ensures the score will be positive and a bigger score corresponds to a higher ranking (i.e. to closer vectors). L1 distance The L1 distance, also called the Manhattan distance, of two vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B is measured by summing up the pairwise absolute difference of all their elements. Obviously, the smaller the distance δ L 1 \\delta_{L1} δ L 1 ​ , the closer the two vectors are. 
The L1 distance formula (1) is pretty simple, as can be seen below: δ L 1 ( A ⃗ , B ⃗ ) = ∑ 1 ≤ i ≤ n ∣ A i − B i ∣ (1) \\tag{1} \\delta_{L1}(\\vec{A}, \\vec{B}) = \\sum_{\\mathclap{1\\le i\\le n}} \\vert A_i-B_i \\vert δ L 1 ​ ( A , B ) = 1 ≤ i ≤ n ∑ ​ ∣ A i ​ − B i ​ ∣ ( 1 ) Visually, the L1 distance can be illustrated as shown in the image below (in red): Computing the L1 distance of the following two vectors A ⃗ = ( 1 2 ) \\vec{A} = \\binom{1}{2} A = ( 2 1 ​ ) and B ⃗ = ( 2 0.5 ) \\vec{B} = \\binom{2}{0.5} B = ( 0.5 2 ​ ) would yield ∣ 1 – 2 ∣ + ∣ 2 – 0.5 ∣ = 2.5 \\vert 1–2 \\vert + \\vert 2–0.5 \\vert = 2.5 ∣1–2∣ + ∣2–0.5∣ = 2.5 Important: It is worth noting that the L1 distance function is only supported for exact vector search (aka brute force search) using the script_score DSL query, but not for approximate kNN search using the knn search option or knn DSL query . L2 distance The L2 distance, also called the Euclidean distance, of two vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B is measured by first summing up the square of the pairwise difference of all their elements and then taking the square root of the result. It’s basically the shortest path between two points. Similarly to L1, the smaller the distance δ L 2 \\delta_{L2} δ L 2 ​ , the closer the two vectors are: δ L 2 ( A ⃗ , B ⃗ ) = ∑ 1 ≤ i ≤ n ( A i − B i ) 2 (2) \\tag{2} \\delta_{L2}(\\vec{A},\\vec{B}) = \\sqrt{\\sum_{\\mathclap{1\\le i\\le n}} ( A_i-B_i )^2 } δ L 2 ​ ( A , B ) = 1 ≤ i ≤ n ∑ ​ ( A i ​ − B i ​ ) 2 ​ ( 2 ) The L2 distance is shown in red in the image below: Let’s reuse the same two sample vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B as we used for the δ L 1 \\delta_{L1} δ L 1 ​ distance, and we can now compute the δ L 2 \\delta_{L2} δ L 2 ​ distance as ( 1 − 2 ) 2 + ( 2 − 0.5 ) 2 = 3.25 ≊ 1.803 \\sqrt{(1-2)^2 + (2-0.5)^2} = \\sqrt{3.25} \\approxeq 1.803 ( 1 − 2 ) 2 + ( 2 − 0.5 ) 2 ​ = 3.25 ​ ≊ 1.803 . As far as scoring goes, the smaller the distance between two vectors, the closer (i.e., the more similar) they are. So in order to derive a score we need to invert the distance measure, so that the smallest distance yields the highest score. The way the score is computed when using the L2 distance looks as shown in formula (3) below: _ s c o r e L 2 ( A ⃗ , B ⃗ ) = 1 1 + δ L 2 ( A ⃗ , B ⃗ ) 2 (3) \\tag{3} \\_score_{L2}(\\vec{A},\\vec{B}) = \\frac{1}{1 + \\delta_{L2}(\\vec{A}, \\vec{B})^2} _ scor e L 2 ​ ( A , B ) = 1 + δ L 2 ​ ( A , B ) 2 1 ​ ( 3 ) Reusing the sample vectors from the earlier example, their score would be 1 4.25 ≊ 0.2352 \\frac{1}{4.25} \\approxeq 0.2352 4.25 1 ​ ≊ 0.2352 . Two vectors that are very close to one another will near a score of 1, while the score of two vectors that are very far from one another will tend towards 0. Wrapping up on L1 and L2 distance functions, a good analogy to compare them is to think about A and B as being two buildings in Manhattan, NYC. A taxi going from A to B would have to drive along the L1 path (streets and avenues), while a bird would probably use the L2 path (straight line). Cosine similarity In contrast to L1 and L2, cosine similarity does not measure the distance between two vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B , but rather their relative angle, i.e., whether they are both pointing in roughly the same direction. The higher the similarity s c o s s_{cos} s cos ​ , the smaller the angle α \\alpha α between the two vectors, and hence, the \"closer\" they are and the \"similar\" their conveyed meaning are. 
To illustrate this, let's think of two people out in the wild looking in different directions. In the figure below, the person in blue looks in the direction symbolized by vector A ⃗ \\vec{A} A and the person in red in the direction of vector B ⃗ \\vec{B} B . The more they will direct their eyesight towards the same direction (i.e., the closer their vectors get), the more their field of view symbolized by the blue and red areas will overlap. How much their field of view overlap is their cosine similarity. However, note that person B looks farther away than person A (i.e., vector B ⃗ \\vec{B} B is longer). Person B might be looking at a mountain far away on the horizon, while person A could be looking at a nearby tree. For cosine similarity, that doesn't play any role as it is only about the angle. Now let's compute that cosine similarity. The formula (4) is pretty simple, where the numerator consists of the dot product of both vectors and the denominator contains the product of their magnitude (i.e., their length): s c o s ( A ⃗ , B ⃗ ) = A ⃗ ⋅ B ⃗ ∥ A ⃗ ∥ × ∥ B ⃗ ∥ (4) \\tag{4} s_{cos}(\\vec{A}, \\vec{B}) = \\frac{\\vec{A} \\cdot \\vec{B}}{\\Vert \\vec{A} \\Vert \\times \\Vert \\vec{B} \\Vert} s cos ​ ( A , B ) = ∥ A ∥ × ∥ B ∥ A ⋅ B ​ ( 4 ) The cosine similarity between A ⃗ \\vec{A} A and B ⃗ \\vec{B} B is shown in the image below as a measure of the angle between them (in red): Let's take a quick detour in order to explain what these cosine similarity values mean concretely. As can be seen in the image below depicting the cosine function, values always oscillate in the [ − 1 , 1 ] [-1, 1] [ − 1 , 1 ] interval. Remember that in order for two vectors to be considered similar, their angle must be as acute as possible, ideally nearing a 0 ° 0° 0° angle, which would boil down to a perfect similarity of 1 1 1 . In other words, when vectors are... ... close to one another, the cosine of their angle nears 1 1 1 (i.e., close to 0 ° 0° 0° ) ... unrelated , the cosine of their angle nears 0 0 0 (i.e., close to 90 ° 90° 90° ) ... opposite , the cosine of their angle nears − 1 -1 − 1 (i.e., close to 180 ° 180° 180° ) Now that we know how to compute the cosine similarity between two vectors and we have a good idea of how to interpret the resulting value, we can reuse the same sample vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B and compute their cosine similarity using the formula (4) we saw earlier. s c o s ( A ⃗ , B ⃗ ) = ( 1 ⋅ 2 ) + ( 2 ⋅ 0.5 ) ( 1 2 + 2 2 ) × ( 2 2 + 0. 5 2 ) ≊ 3 4.609 ≊ 0.650791 s_{cos}(\\vec{A}, \\vec{B}) = \\frac{(1 \\cdot 2) + (2 \\cdot 0.5)}{\\sqrt{(1^2 + 2^2)} \\times \\sqrt{(2^2 + 0.5^2)}} \\approxeq \\frac{3}{4.609} \\approxeq 0.650791 s cos ​ ( A , B ) = ( 1 2 + 2 2 ) ​ × ( 2 2 + 0. 5 2 ) ​ ( 1 ⋅ 2 ) + ( 2 ⋅ 0.5 ) ​ ≊ 4.609 3 ​ ≊ 0.650791 We get a cosine similarity of 0.650791 0.650791 0.650791 , which is closer to 1 1 1 than to 0 0 0 , meaning that the two vectors are somewhat similar , i.e., not perfectly similar, but not completely unrelated either, and certainly do not carry opposite meaning. 
In order to derive a positive score from any cosine similarity value, we need to use the following formula (5), which transforms cosine similarity values oscillating within the [ − 1 , 1 ] [-1, 1] [ − 1 , 1 ] interval into scores in the [ 0 , 1 ] [0, 1] [ 0 , 1 ] interval: _ s c o r e c o s ( A ⃗ , B ⃗ ) = 1 + s c o s ( A ⃗ , B ⃗ ) 2 (5) \\tag{5} \\_score_{cos}(\\vec{A},\\vec{B}) = \\frac{1 + s_{cos}(\\vec{A}, \\vec{B})}{2} _ scor e cos ​ ( A , B ) = 2 1 + s cos ​ ( A , B ) ​ ( 5 ) The score for the sample vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B would thus be: 1 + 0.650791 2 ≊ 0.8253 \\frac{1 + 0.650791}{2} \\approxeq 0.8253 2 1 + 0.650791 ​ ≊ 0.8253 . Dot product similarity One drawback of cosine similarity is that it only takes into account the angle between two vectors but not their magnitude, which means that if two vectors point roughly in the same direction but one is much longer than the other, both will still be considered similar. Dot product similarity, also called scalar or inner product similarity, improves that by taking into account both the angle and the magnitude of the vectors, which provides for a more accurate similarity metric. In order to make the magnitude of the vectors irrelevant, dot product similarity requires that the vectors be normalized first, so we are ultimately only comparing vectors of unit length 1. Let's try to illustrate this again with the same two people as before, but this time, we put them in the middle of a circular room, so that their sight reach is exactly the same (i.e., the radius of the room). Similarly to cosine similarity, the more they turn towards the same direction (i.e., the closer their vectors get), the more their field of view will overlap. However, in contrary to cosine similarity, both vectors have the same length and both areas have the same surface, which means that the two people look at exactly the same picture located at the same distance. How well those two areas overlap denotes their dot product similarity. Before introducing the dot product similarity formula, let's quickly see how a vector can be normalized. It's pretty simple and can be done in two trivial steps: compute the magnitude of the vector divide each component by the magnitude obtained in 1. As an example, let's take vector A ⃗ = ( 1 2 ) \\vec{A} = \\binom{1}{2} A = ( 2 1 ​ ) . We can compute its magnitude ∥ A ⃗ ∥ \\Vert \\vec{A} \\Vert ∥ A ∥ as we have seen earlier when reviewing the cosine similarity, i.e. 1 2 + 2 2 = 5 \\sqrt{1^2 + 2^2} = \\sqrt{5} 1 2 + 2 2 ​ = 5 ​ . 
Then, dividing each component of the vector by its magnitude, we obtain the following normalized vector C ⃗ \\vec{C} C : A n o r m ⃗ = C ⃗ = ( 1 5 2 5 ) ≊ ( 0.44 0.89 ) \\vec{A_{norm}} = \\vec{C} = \\dbinom{\\frac{1}{\\sqrt{5}}}{\\frac{2}{\\sqrt{5}}} \\approxeq \\dbinom{0.44}{0.89} A n or m ​ ​ = C = ( 5 ​ 2 ​ 5 ​ 1 ​ ​ ) ≊ ( 0.89 0.44 ​ ) Going through the same process for the second vector B ⃗ = ( 2 0.5 ) \\vec{B} = \\binom{2}{0.5} B = ( 0.5 2 ​ ) would yield the following normalized vector D ⃗ \\vec{D} D : B n o r m ⃗ = D ⃗ = ( 2 4.25 0.5 4.25 ) ≊ ( 0.97 0.24 ) \\vec{B_{norm}} = \\vec{D} = \\dbinom{\\frac{2}{\\sqrt{4.25}}}{\\frac{0.5}{\\sqrt{4.25}}} \\approxeq \\dbinom{0.97}{0.24} B n or m ​ ​ = D = ( 4.25 ​ 0.5 ​ 4.25 ​ 2 ​ ​ ) ≊ ( 0.24 0.97 ​ ) In order to derive the dot product similarity formula, we can compute the cosine similarity between our normalized vectors C ⃗ \\vec{C} C and D ⃗ \\vec{D} D using formula (4), as shown below: s c o s ( C ⃗ , D ⃗ ) = C ⃗ ⋅ D ⃗ 1 × 1 s_{cos}(\\vec{C}, \\vec{D}) = \\frac{\\vec{C} \\cdot \\vec{D}}{1 \\times 1} s cos ​ ( C , D ) = 1 × 1 C ⋅ D ​ And since the magnitude of both normalized vectors is now 1 1 1 , the dot product similarity formula (6) simply becomes... you guessed it, a dot product of both normalized vectors: s d o t ( C ⃗ , D ⃗ ) = C ⃗ ⋅ D ⃗ (6) \\tag{6} s_{dot}(\\vec{C}, \\vec{D}) = \\vec{C} \\cdot \\vec{D} s d o t ​ ( C , D ) = C ⋅ D ( 6 ) In the image below, we show the normalized vectors C ⃗ \\vec{C} C and D ⃗ \\vec{D} D and we can illustrate their dot product similarity as the projection of one vector onto the other (in red). Using our new formula (6), we can compute the dot product similarity of our two normalized vectors, which unsurprisingly yields the exact same similarity value as the cosine one: s d o t ( C ⃗ , D ⃗ ) = ( 1 5 ⋅ 2 4.25 ) + ( 2 5 ⋅ 0.5 4.25 ) ≊ 0.650791 s_{dot}(\\vec{C}, \\vec{D}) = \\Big(\\frac{1}{\\sqrt{5}} \\cdot \\frac{2}{\\sqrt{4.25}}\\Big) + \\Big(\\frac{2}{\\sqrt{5}} \\cdot \\frac{0.5}{\\sqrt{4.25}}\\Big) \\approxeq 0.650791 s d o t ​ ( C , D ) = ( 5 ​ 1 ​ ⋅ 4.25 ​ 2 ​ ) + ( 5 ​ 2 ​ ⋅ 4.25 ​ 0.5 ​ ) ≊ 0.650791 When leveraging dot product similarity, the score is computed differently depending on whether the vectors contain float or byte values. In the former case, the score is computed the same way as for cosine similarity using formula (7) below: _ s c o r e d o t − f l o a t ( C ⃗ , D ⃗ ) = 1 + s d o t ( C ⃗ , D ⃗ ) 2 (7) \\tag{7} \\_score_{dot-float}(\\vec{C},\\vec{D}) = \\frac{1 + s_{dot}(\\vec{C}, \\vec{D})}{2} _ scor e d o t − f l o a t ​ ( C , D ) = 2 1 + s d o t ​ ( C , D ) ​ ( 7 ) However, when the vector is composed of byte values, the scoring is computed a bit differently as shown in formula (8) below, where d i m s dims d im s is the number of dimensions of the vector: _ s c o r e d o t − b y t e ( C ⃗ , D ⃗ ) = 0.5 + s d o t ( C ⃗ , D ⃗ ) 32768 × d i m s (8) \\tag{8} \\_score_{dot-byte}(\\vec{C},\\vec{D}) = \\frac{0.5 + s_{dot}(\\vec{C}, \\vec{D})}{32768 \\times dims} _ scor e d o t − b y t e ​ ( C , D ) = 32768 × d im s 0.5 + s d o t ​ ( C , D ) ​ ( 8 ) Also, one constraint in order to yield accurate scores is that all vectors, including the query vector, must have the same length, but not necessarily 1. Max inner product similarity Since release 8.11, there is a new similarity function that is less constrained than the dot product similarity, in that the vectors don't need to be normalized. 
The main reason for this is explained at length in the following article , but to sum it up very briefly, certain datasets are not very well adapted to having their vectors normalized (e.g., Cohere embeddings ) and doing so can cause relevancy issues. The formula for computing max inner product similarity is exactly the same as the dot product one (6). What changes is the way the score is computed by scaling the max inner product similarity using a piecewise function whose formula depends on whether the similarity is positive or negative, as shown in formula (9) below: _ s c o r e m i p ( A ⃗ , B ⃗ ) = { 1 1 − s d o t ( A ⃗ , B ⃗ ) if s d o t < 0 1 + s d o t ( A ⃗ , B ⃗ ) if s d o t ⩾ 0 (9) \\tag{9} \\_score_{mip}(\\vec{A},\\vec{B}) = \\begin{cases} \\Large \\frac{1}{1 - s_{dot}(\\vec{A}, \\vec{B})} &\\text{if } s_{dot} < 0 \\\\ 1 + s_{dot}(\\vec{A}, \\vec{B}) &\\text{if } s_{dot} \\geqslant 0 \\end{cases} _ scor e mi p ​ ( A , B ) = ⎩ ⎨ ⎧ ​ 1 − s d o t ​ ( A , B ) 1 ​ 1 + s d o t ​ ( A , B ) ​ if s d o t ​ < 0 if s d o t ​ ⩾ 0 ​ ( 9 ) What this piecewise function does is that it scales all negative max inner product similarity values in the [ 0 , 1 [ [0, 1[ [ 0 , 1 [ interval and all positive values in the [ 1 , ∞ [ [1, \\infty[ [ 1 , ∞ [ interval. In summary That was quite a ride, mathematically speaking, but here are a few takeaways that you might find useful. Which similarity function you can use, ultimately depends on whether your vector embeddings are normalized or not. If your vectors are already normalized or if your data set is agnostic to vector normalization (i.e., relevancy will not suffer), you can go ahead and normalize your vectors and use dot product similarity, as it is much faster to compute than the cosine one since there is no need to compute the length of each vector. When comparing millions of vectors, those computations can add up quite a lot. If your vectors are not normalized, then you have two options: use cosine similarity if normalizing your vectors is not an option use the new max inner product similarity if you want the magnitude of your vectors to contribute to scoring because they do carry meaning (e.g., Cohere embeddings) At this point, computing the distance or similarity between vector embeddings and how to derive their scores should make sense to you. We hope you found this article useful. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What are vector embeddings? How do we compare vectors? Distance, similarity and scoring L1 distance L2 distance Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Vector similarity techniques and scoring - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vector-similarity-techniques-and-scoring", + "meta_description": "Explore vector similarity techniques and scoring in Elasticsearch, including L1 & L2 distance, cosine similarity, dot product similarity and max inner product similarity." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2 Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2. ML Research TV QH VK By: Thomas Veasey , Quentin Herreros and Valeriy Khakhutskyy On October 17, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our last post we introduced ELSER v2, discussed its zero shot relevance and the inference performance improvements we made. This blog focuses on how we reduced its retrieval costs. It has been noted that retrieval can be slow when using scores computed from learned sparse representations, such as ELSER. Slow is a relative term and in this context we mean slow when compared to BM25 scored retrieval. There are two principle reasons for this: The query expansion means we're usually matching many more terms than are present in user supplied keyword searches. The weight distribution for BM25 is particularly well suited to query optimisation. The first bottleneck can be tackled at train time, albeit with a relevance retrieval cost tradeoff. There is a regularizer term in the training loss which allows one to penalize using more terms in the query expansion. There are also gains to be had by performing better model selection. Retrieval cost aware training When training any model it is sensible to keep the best one as optimisation progresses. Typically the quality is measured using the training loss function evaluated on a hold-out, or validation, dataset. 
We had found this metric alone did not correlate as well as we liked with zero-shot relevance; so we were already measuring NDCG@10 on several small datasets from the BEIR suite to help decide which model to retain. This allows us to measure other aspects of retrieval behavior. In particular, we compute the retrieval cost using the number of weight multiplications performed on average to find the top-k matches for every query. We found that there is quite significant variation between the retrieval cost for relatively small variation in retrieval quality and used this information to identify Pareto optimal models. This was done for various choices of our regularization hyperparameters at different points along their learning trajectories. The figure below shows a scatter plot of the candidate models we considered characterized by their relevance and cost, together with the choice we made for ELSER v2. In the end we sacrificed around 1% in relevance for around a 25% reduction in the retrieval cost. Performing model selection for ELSER v2 via relevance retrieval cost multi-objective optimization Whilst this is a nice win, the figure also shows there is only so much it is possible to achieve when making the trade off at train time. At least without significantly impacting relevance. As we discussed before , with ELSER our goal is to train a model with excellent zero-shot relevance. Therefore, if we make the tradeoff during training we make it in a global setting, without knowing anything about the specific corpus where the model will be applied. To understand how to overcome the dichotomy between relevance and retrieval cost we need to study the token statistics in a specific corpus. At the same time, it is also useful to understand why BM25 scoring is so efficient for retrieval. Optimizing ELSER queries The BM25 score comprises two factors, one which relates to its frequency in each document and one which relates to the frequency of each query term in the corpus. Focusing our attention on second factor, the score contribution of a term t t t is weighted by its inverse document frequency (IDF) or log ⁡ ( 1 − f t f t + 1 ) \\log\\left(\\frac{1 - f_t}{f_t} + 1\\right) lo g ( f t ​ 1 − f t ​ ​ + 1 ) . Here f t = n t + 0.5 N f_t=\\frac{n_t+0.5}{N} f t ​ = N n t ​ + 0.5 ​ and n t n_t n t ​ and N N N denote the matching document count and total number of documents, respectively. So f t f_t f t ​ is just the proportion of the documents which contain that term, modulo a small correction which is negligible for large corpuses. It is clear that IDF is a monotonic decreasing function of the frequency. Coupled with block-max WAND , this allows retrieval to skip many non-competitive documents even if the query includes frequent terms. Specifically, in any given block one might expect some documents to contain frequent terms, but with BM25 scoring they are unlikely to be competitive with the best matches for the query. The figure below shows statistics related to the top tokens generated by ELSER v2 for the NFCorpus dataset. This is one of the datasets used to evaluate retrieval in the BEIR suite and comprises queries and documents related to nutrition. The token frequencies, expressed as a percentage of the documents which contain that token, are on the right hand axis and the corresponding IDF and the average ELSER v2 weight for the tokens are on the left hand axis. If one examines the top tokens they're what we might expect given the corpus content: things like “supplement”, “nutritional”, “diet”, etc. 
Queries expand to a similar set of terms. This underlines that even if tokens are well distributed in the training corpus as a whole, they can end up concentrated when we examine a specific corpus. Furthermore, we see that unlike BM25 the weight is largely independent of token frequency and this makes block-max WAND ineffective. The outcome is retrieval is significantly more expensive than BM25. Average ELSER v2 weights and IDF for the top 500 tokens in the document expansions of NFCorpus together with the percentage of documents in which they appear Taking a step back, this suggests we reconsider token importance in light of the corpus subject matter. In a general setting, tokens related to nutrition may be highly informative. However, for a corpus about nutrition they are less so. This in fact is the underpinning of information theoretic approaches to retrieval. Roughly speaking we have two measures of the token information content for a specific query and corpus: its assigned weight - which is the natural analogue of the term frequency term used in BM25 - and the token frequency in the corpus as a whole - which we disregard when we score matches using the product of token weights. This suggests the following simple strategy to accelerate queries with hopefully little impact on retrieval quality: Drop frequent tokens altogether provided they are not particularly important for the query in the retrieval phase, Gather slightly more matches than required, and Rerank using the full set of tokens. We can calculate the expected fraction of documents a token will be present in, assuming they all occur with equal probability. This is just the ratio N T N ∣ T ∣ \\frac{N_T}{N|T|} N ∣ T ∣ N T ​ ​ where N T N_T N T ​ is the total number of tokens in the corpus, N N N is the number of documents in the corpus and ∣ T ∣ |T| ∣ T ∣ is the vocabulary size, which is 30522. Any token that occurs in a significantly greater fraction of documents than this is frequent for the corpus. We found that pruning tokens which are 5 times more frequent than expected was an effective relevance retrieval cost tradeoff. We fixed the count of documents reranked using the full token set to 5 times the required set, so 50 for NDCG@10. We found we achieved more consistent results setting the weight threshold for which to retain tokens as a fraction of the maximum weight of any token in the query expansion. For the results below we retain all tokens whose weight is greater than or equal to 0.4 × “max token weight for the query”. This threshold was chosen so NDCG@10 was unchanged on NFCorpus. However, the same parameterization worked for the other 13 test datasets we tested, which strongly suggests that it generalizes well. The table below shows the change in NDCG@10 relative to ELSER v2 with exact retrieval together with the retrieval cost relative to ELSER v1 with exact retrieval using this strategy. Note that the same pruning strategy can be applied to any learned sparse representation. However, we view that the key questions to answer are: Does the approach lead to any degradation in relevance compared to using exact scoring? What improvement in the retrieval latency might one expect using ELSER v2 and query optimization compared to the performance of the text_expansion query to date? In summary, we achieved a very small improvement(!) of 0.07% in average NDCG@10 when we used the optimized query compared to the exact query and an average 3.4 times speedup. Furthermore, this speedup is measured without block-max WAND. 
As we expected, the optimization works particularly well together with block-max WAND. On a larger corpus (8.8M passages) we saw an 8.4 times speedup with block-max WAND enabled. Measuring the relevance and latency impact of using token pruning followed by reranking. The relevance is measured by percentage change in NDCG@10 for exact retrieval with ELSER v2 and the speedup is measured with respect to exact retrieval with ELSER v1 An intriguing aspect of these results is that on average we see a small relevance improvement. Together with the fact that we previously showed carefully tuned combinations of ELSER v1 and BM25 scores yield very significant relevance improvements, it strongly suggests there are benefits available for relevance as well as for retrieval cost by making better use of corpus token statistics. Ideally, one would re-architect the model and train the query expansion to make use of both token weights and their frequencies. This is something we are actively researching. Implementing with an Elasticsearch query As of Elasticsearch 8.13.0, we have integrated this optimization in the text_expansion query via token pruning so it is automatically applied in the retrieval phase for the text_expansion query. For versions of Elasticsearch before 8.13.0, it is possible to achieve the same results using existing Elasticsearch query DSL given an analysis of the token frequencies and their weights. Tokens are stored in the _source field so it is possible to paginate through the documents and accumulate token frequencies to find out which tokens to exclude. Given an inference response one can partition the tokens into a “kept” and “dropped” set. The kept set is used to score the match in a should query. The dropped set is used in a rescore query on a window of the top 50 docs. Using query_weight and rescore_query_weight both equal to one simply sums the two scores so recovers the score using the full set of tokens. The query together with some explanation is shown below. Conclusion In these last two posts in our series we introduced the second version of the Elastic Learned Sparse EncodeR. So what benefits does it bring? With some improvements to our training data set and regularizer we were able to obtain roughly a 2% improvement on our benchmark of zero-shot relevance. At the same time we've also made significant improvements to inference performance and retrieval latency. We traded a small degradation (of a little less than 1%) in relevance for a large improvement (of over 25%) in the retrieval latency when performing model selection in the training loop. We also identified a simple token pruning strategy and verified it had no impact on retrieval quality. Together these sped up retrieval by between 2 and 5 times when compared to ELSER v1 on our benchmark suite. Token pruning can currently be implemented using Elasticsearch DSL, but we're also working towards performing it automatically in the text_expansion query. To improve inference performance we prepared a quantized version of the model for x86 architecture and upgraded the libtorch backend we use. We found that these sped up inference by between 1.7 and 2.2 times depending on the text length. By using hybrid dynamic quantisation, based on an analysis of layer sensitivity to quantisation, we were able to achieve this with minimal loss in relevance. We believe that ELSER v2 represents a step change in performance, so encourage you to give it a try! 
This is an exciting time for information retrieval, which is being reshaped by rapid advances in NLP. We hope you've enjoyed this blog series in which we've tried to give a flavor of some of this field. This is not the end, rather the end of the beginning for us. We're already working on various improvements to retrieval in Elasticsearch and particularly in end-to-end optimisation of retrieval and generation pipelines. So stay tuned! The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid Retrieval Part 5: Optimizing inference for ELSER v2 Part 6: Optimizing retrieval with ELSER v2 Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Retrieval cost aware training Optimizing ELSER queries Implementing with an Elasticsearch query Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2", + "meta_description": "Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds support for Azure OpenAI chat completions Azure OpenAI chat completions is available via the Elasticsearch inference API. Learn how to use this feature to answer questions. Integrations Generative AI How To TG By: Tim Grein On May 22, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. We’ve integrated Azure OpenAI chat completions in the inference API, which allows our customers to build powerful GenAI applications based on chat completion using large language models like GPT-4 Azure and Elasticsearch developers can utilize the unique capabilities of the Elasticsearch vector database and the Azure AI ecosystem to power unique GenAI applications with the model of their choice. This blog quickly goes over the catalog of supported providers in the open inference API and explains how to use Azure’s OpenAI chat completions to answer questions through an example. The inference API is growing…fast! We’re heavily extending the catalog of supported providers in the open inference API. Check out some of our latest blog posts on Elastic Search labs to learn more about recent integrations around embeddings, completions and reranking: Elasticsearch open inference API adds support for Azure Open AI Studio Elasticsearch open inference API adds support for Azure Open AI embeddings Elasticsearch open inference API adds support for OpenAI chat completions Elasticsearch open Inference API adds support for Cohere’s Rerank 3 model Elasticsearch open inference API adds support for Cohere Embeddings ...more to come! Azure OpenAI chat completions support is available through the open inference API in our stateless offering on Elastic Cloud. It’ll also be soon available to everyone in an upcoming versioned Elasticsearch release. This also complements the capability to use the Elasticsearch vector database in the Azure OpenAI service. Using Azure’s OpenAI chat completions to answer questions In my last blog post about OpenAI chat completions we’ve learned how to summarize text using OpenAI’s chat completions. In this guide we’ll use Azure OpenAI chat completions to answer questions during ingestion to have answers ready ahead of searching. Make sure you have your Azure OpenAI api key, deployment id and resource name ready by creating a free Azure account first and setting up a model suited for chat completions. You can follow Azure's OpenAI Service GPT quickstart guide to get a model up and running. In the following example we’ve used `gpt-4` with the version `2024-02-01`. You can read more about supported models and versions here . In Kibana, you'll have access to a console for you to input these next steps in Elasticsearch without even needing to set up an IDE. 
First, we configure a model, which will perform completions: You’ll get back a response similar to the following with status code `200 OK` on successful inference creation: You can now call the configured model to perform completion on any text input. Let’s ask the model what’s inference in the context of GenAI: You should get back a response with status code `200 OK` explaining what inference is: Now we can set up a small catalog of questions, which we want to be answered during ingestion. We’ll use the Bulk API to index three questions about products of Elastic: You’ll get back a response with status `200 OK` back similar to the following upon successful indexing: We’ll create now our question and answering ingest pipeline using the script- , inference- and remove-processor : This pipeline prefixes the content with the instruction “Please answer the following question: “ in a temporary field named `prompt`. The content of this temporary `prompt` field will be sent to Azure’s OpenAI Service through the inference API to perform a completion. Using an ingest pipeline allows for immense flexibility as you can change the pre-prompt to anything you would like. This allows you to summarize documents for example, too. Check out Elasticsearch open inference API adds support for OpenAI chat completions to learn about how to build a summarisation ingest pipeline! We now send our documents containing questions through the question and answering pipeline by calling the reindex API . You'll get back a response with status 200 OK similar to the following: In a real world setup you’ll probably use another ingestion mechanism to ingest your documents in an automated way. Check out our Adding data to Elasticsearch guide to learn more about the various options offered by Elastic to ingest data into Elasticsearch. We’re also committed to showcase ingest mechanisms and provide guidance on how to bring data into Elasticsearch using 3rd party tools. Take a look at Ingest Data from Snowflake to Elasticsearch using Meltano: A developer’s journey for example on how to use Meltano for ingesting data. You're now able to search for your pre-generated answers using the Search API : In the response you'll get back your pre-generated answers: Pre-generating answers for frequently asked questions is particularly effective in reducing operational costs. By minimizing the need for on-the-fly response generation, you can significantly cut down on the amount of computational resources required like token usage. Additionally, this method ensures that every user receives the same, precise information. Consistency is crucial, especially in fields requiring high reliability and accuracy such as medical, legal, or technical support. More to come! We’re already working on adding support for more task types using Cohere, Google Vertex AI and many more. Furthermore we’re actively developing an intuitive UI in Kibana for managing Inference endpoints. Lots of exciting stuff to come! Bookmark Elastic Search Labs now to keep with Elastic’s innovations in the GenAI space! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to The inference API is growing…fast! Using Azure’s OpenAI chat completions to answer questions More to come! Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open inference API adds support for Azure OpenAI chat completions - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-azure-openai-completion-support", + "meta_description": "Azure OpenAI chat completions is available via the Elasticsearch inference API. Learn how to use this feature to answer questions." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to use Elasticsearch with popular Ruby tools Take a look at how to use Elasticsearch with some popular Ruby libraries. Ruby How To FB By: Fernando Briano On October 16, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this blog post we are going to take a look at how to use Elasticsearch with some popular Ruby tools. We'll implement the common use APIs found in the \"Getting Started\" guide for the Ruby client. If you follow that link, you can see how you can run these same actions with the official Elasticsearch client: elasticsearch-ruby . We run extensive tests on the client to make sure all the APIs in Elasticsearch are supported for every version, including the current version in development. This covers almost 500 APIs. However, there might be cases where you don't want to use the client and want to implement some of the functionality yourself in your Ruby code. Your code could depend heavily on a particular library, and you'd like to reuse it for Elasticsearch. 
You could be working in a setup where you only need a couple of the APIs and don't want to bring in a new dependency. Or you have limited resources and you don't want to use a full-fledged Client that can do everything in Elasticsearch. Whatever the reason, Elasticsearch makes it easy by exposing REST APIs that can be called directly, so you can access its functionality by making HTTP requests without the client. When working with the API, it's recommended to take a look at the API Conventions and Common options . Introduction The libraries used in these examples are Net::HTTP , HTTParty , exon , HTTP (a.k.a. http.rb ), Faraday and elastic-transport . On top of looking at how to interact with Elasticsearch from Ruby, this post will take a short look at each of these libraries, allowing us to get to know them and how to use them. It's not going to go in depth for any of the libraries, but it'll give an idea of what it's like to use each of them. The code was written and tested in Ruby 3.3.5. The versions of each tool will be mentioned in their respective sections. The examples use require 'bundler/inline' for the convenience of installing the necessary gems in the same file where the code is being written, but you can use a Gemfile instead too. Setup While working on these examples, I'm using start-local , a simple shell script that sets up Elasticsearch and Kibana in seconds for local development. In the directory where I'm writing this code, I run: This creates a sub directory called elastic-start-local , which includes a .env file with the information we need to connect and authenticate with Elasticsearch. We can either run source elastic-start-local/.env before running our Ruby code, or use the dotenv gem: The following code examples assume the ENV variables in this file have been loaded. We can authenticate with Elasticsearch by using Basic Auth or API Key Authentication . To use Basic Auth, we have to use the user name 'elastic' and the value stored in ES_LOCAL_PASSWORD as password. To use API Key Authentication, we need the value stored in ES_LOCAL_API_KEY in this .env file. Elasticsearch can be managed using Kibana, which will be running at http://localhost:5601 with start-local , and you can create an API Key manually in Kibana too. Elasticsearch will be running on http://localhost:9200 by default, but the examples load the host from the ES_LOCAL_URL environment variable. You could also use any other Elasticsearch cluster to run these, adjusting the host and credentials accordingly. If you're using start-local , you can stop the running instance of Elasticsearch with the command docker compose stop and restart it with docker compose up from the elastic-start-local directory. Net::HTTP Net::HTTP provides a rich library that implements the client in a client-server model that uses the HTTP request-response protocol. We can require this library in our code with require 'net-http' and start using it without installing any extra dependencies. It's not the most user-friendly one, but it's natively available in Ruby. The version used in these examples is 0.4.1 . This gives us the setup for performing requests to Elasticsearch. We can test this with an initial request to the root path of the server: And we can inspect the response for more information: We can now try to create an index : With our index, we can now start to work with Documents . Notice how we need to transform the document to JSON to use it in the request. 
With an indexed document, we can test a very simple Search request: And do some more work with the indexed data: Finally, we'll delete the index to clean up our cluster: HTTParty HTTParty is a gem \"with the goal to make HTTP fun\". It provides some helpful abstractions to make requests and work with the response. These examples use version 0.22.0 of the library. The initial request to the server: If the response Content Type is application/json , HTTParty will parse the response and return Ruby objects such as a hash or array. The default behavior for parsing JSON will return keys as strings. We can use the response as follows: The README shows how to use the class methods to make requests quickly and the option to create a custom class. It would be more convenient to implement an Elasticsearch Client class and add the different API methods we'd like to use. Something like this for example: We don't want to re-implement Elasticsearch Ruby with HTTParty in this blog post, but this could be an alternative when using just a few of the APIs. We'll take a look at how to build the rest of the requests: excon Excon was designed with the intention of being simple, fast and performant. It is particularly aimed at usage in API clients, so it is well suited for interacting with Elasticsearch. This code uses Excon version 0.111.0 . Excon requests return an Excon::Response object which has body , headers , remote_ip and status attributes. We can also access the data directly with the keys as symbols, similar to how Elasticsearch::API::Response works: We can reuse a connection across multiple requests to share options and improve performance. We can also use persistent connections to establish the socket connection with the initial request, and leave the socket open while we're running these examples: HTTP (http.rb) HTTP is an HTTP client which uses a chainable API similar to Python's Requests . It implements the HTTP protocol in Ruby and outsources the parsing to native extensions. The version used in this code is 5.2.0 . We can also use the auth method to take advantage of the chainable API: Or since we also care about the content type header, chain headers : With HTTP we can create a client with persistent connection to the host, and persist the headers too: So once we've created our persistent clients, it makes it shorter to build our requests: The documentation warns us that the response must be consumed before sending the next request in the persistent connection. That means calling to_s , parse , or flush on the response object. Faraday Faraday is the HTTP client library used by default by the Elasticsearch Client. It provides a common interface over many adapters which you can select when instantiating a client (Net::HTTP, Typhoeus, Patron, Excon and more). The version of Faraday used in this code was 2.12.0 . The signature for get is (url, params = nil, headers = nil) so we're passing nil for parameters in this initial test request: The response is a Faraday::Response object with the response status , headers , and body and we can also access lots of properties in a Faraday Env object . As we've seen with other libraries, the recommended way to use Faraday for our use case is to create a Faraday::Connection object: And now reusing that connection, we can see what the rest of the requests look like with Faraday: Elastic Transport The library elastic-transport is the Ruby gem that deals with performing HTTP requests, encoding, compression, etc. in the official Elastic Ruby clients. 
This library has been battle tested for years against every official version of Elasticsearch. It used to be known as elasticsearch-transport as it was the base for the official Elasticsearch client. However in version 8.0.0 of the client, we migrated the transport library to elastic-transport since it was also supporting the official Enterprise Search Client and more recently the Elasticsearch Serverless Client. It uses a Faraday implementation by default, which supports several different adapters as we saw earlier. You can also use Manticore and Curb (the Ruby binding for libcurl) implementations included with the library. You can even write your own, or an implementation with some of the libraries we've gone through here. But that would be the subject for a different blog post! Elastic Transport can also be used as an HTTP library to interact with Elasticsearch. It will deal with everything you need and has a lot of settings and different configurations related to the use at Elastic. The version used here is the latest 8.3.5 . A simple example: Conclusion As you can see, the Elasticsearch Ruby client does a lot of work to make it easy to interact with Elasticsearch in your Ruby code. We didn't even go too deep in this blog post working with more complex requests or handling errors. But Elasticsearch's REST API makes it possible to use it with any library that supports HTTP requests, in Ruby and any other language. The Elasticsearch REST APIs guide is a great reference to learn more about the available APIs and how to use them. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction Setup Net::HTTP HTTParty excon Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to use Elasticsearch with popular Ruby tools - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-ruby-tools", + "meta_description": "Learn how to use Elasticsearch with popular Ruby libraries like Net::HTTP, HTTParty, exon, HTTP (http.rb), Faraday and elastic-transport." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. Search Relevance PB By: Panagiotis Bailis On May 28, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our previous blog post we introduced the redesigned-from-scratch retrievers framework, which enables the creation of complex ranking pipelines. We also explored how the Reciprocal Rank Fusion (RRF) retriever enables hybrid search by merging results from different queries. While RRF is easy to implement, it has a notable limitation: it focuses purely on relative ranks, ignoring actual scores. This makes fine-tuning and optimization a challenge. Meet the linear retriever! In this post, we introduce the linear retriever , our latest addition for supporting hybrid search! Unlike rrf , the linear retriever calculates a weighted sum across all queries that matched a document. This approach preserves the relative importance of each document within a result set while allowing precise control over each query’s influence on the final score. As a result, it provides a more intuitive and flexible way to fine-tune hybrid search. Defining a linear retriever where the final score will be computed as: s c o r e = 5 ∗ k n n + 1.5 ∗ b m 25 score = 5 * knn + 1.5 * bm25 score = 5 ∗ knn + 1.5 ∗ bm 25 It is as simple as: Notice how simple and intuitive it is? (and really similar to rrf !) This configuration allows you to precisely control how much each query type contributes to the final ranking, unlike rrf , which relies solely on relative ranks. One caveat remains: knn scores may be strictly bounded, depending on the similarity metric used. For example, with cosine similarity or the dot product of unit-normalized vectors, scores will always lie within the [0, 1] range. In contrast, bm25 scores are less predictable and have no clearly defined bounds. Scaling the scores: kNN vs BM25 One challenge of hybrid search is that different retrievers produce scores on different scales. Consider for example the following scenario: Query A scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 100 1.5 1 0.5 Query B scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 0.63 0.01 0.3 0.4 You can see the disparity above: kNN scores range between 0 and 1, while bm25 scores can vary wildly. This difference makes it tricky to set static optimal weights for combining the results. 
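To make the earlier "It is as simple as" example concrete, here is roughly what such a request can look like from the Python client, using the score = 5 * knn + 1.5 * bm25 weighting described above. The index, field names, and query vector are placeholders, and the exact client syntax is a sketch assuming a recent elasticsearch-py release with retriever support.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200", api_key="...")

# score = 5 * knn + 1.5 * bm25, as in the example above.
# Index, field names, and the query vector are placeholders.
resp = client.search(
    index="my-index",
    retriever={
        "linear": {
            "retrievers": [
                {
                    "retriever": {
                        "standard": {"query": {"match": {"title": "hybrid search"}}}
                    },
                    "weight": 1.5,
                },
                {
                    "retriever": {
                        "knn": {
                            "field": "title_embedding",
                            "query_vector": [0.12, -0.45, 0.33],  # your query embedding
                            "k": 10,
                            "num_candidates": 100,
                        }
                    },
                    "weight": 5,
                },
            ],
        }
    },
)

for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```

Because the two sub-scores live on very different scales in the tables above, the weights alone are hard to tune, which is where the per-entry normalizer option comes in.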
Normalization to the rescue: the MinMax normalizer To address this, we’ve introduced an optional minmax normalizer that scales scores, independently for each query, to the [0, 1] range using the following formula: normalized_score = (score - min) / (max - min) This preserves the relative importance of each document within a query’s result set, making it easier to combine scores from different retrievers. With normalization, the scores become: Query A scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 1.00 0.01 0.005 0.000 Query B scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 1.00 0.000 0.465 0.645 All scores now lie in the [0, 1] range and optimizing the weighted sum is much more straightforward as we now capture the (relative to the query) importance of a result instead of its absolute score and maintain consistency across queries. Example time! Let’s go through an example now to showcase what the above looks like and how the linear retriever addresses some of the shortcomings of rrf . RRF relies solely on relative ranks and doesn’t consider actual score differences. For example, given these scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 100 1.5 1 0.5 rrf score 0.03226 0.03252 0.03200 0.03125 rrf would rank the documents as: doc2 > doc1 > doc3 > doc4 However, doc1 has a significantly higher bm25 score than the others, which rrf fails to capture because it only looks at relative ranks. The linear retriever, combined with normalization, correctly accounts for both the scores and their differences, producing a more meaningful ranking: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 1 0.01 0.005 0 As we can see in the above, doc1’s great ranking and score for bm25 is properly accounted for and reflected on the final scores. In addition to that, all scores lie now in the [0, 1] range so that we can compare and combine them in a much more intuitive way (and even build offline optimization processes). Putting it all together To take full advantage of the linear retriever with normalization, the search request would look like this: This approach combines the best of both worlds: it retains the flexibility and intuitive scoring of the linear retriever, while ensuring consistent score scaling with MinMax normalization. As with all our retrievers, the linear retriever can be integrated into any level of a hierarchical retriever tree, with support for explainability, match highlighting, field collapsing, and more. When to pick the linear retriever and why it makes a difference The linear retriever: Preserves relative importance by leveraging actual scores, not just ranks. Allows fine-tuning with weighted contributions from different queries. Enhances consistency using normalization, making hybrid search more robust and predictable. Conclusion The linear retriever is already available on Elasticsearch Serverless, and the 8.18 and 9.0 releases! More examples and configuration parameters can also be found in our documentation. Try it out and see how it can improve your hybrid search experience — we look forward to your feedback. Happy searching! Report an issue Related content Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process.
DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Meet the linear retriever! Scaling the scores: kNN vs BM25 Normalization to the rescue: the MinMax normalizer Example time! Putting it all together Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Hybrid search revisited: introducing the linear retriever! - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/linear-retriever-hybrid-search", + "meta_description": "Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to use Elasticsearch to prompt ChatGPT with natural language This blog post presents an experimental project for querying Elasticsearch in natural language using ChatGPT. Generative AI PHP EZ By: Enrico Zimuel On June 21, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. These days everyone is talking about ChatGPT . One of the cool features of this large language model (LLM) is the ability to generate code. We used it to generate Elasticsearch DSL queries . The goal is to search in Elasticsearch ® with sentences like “Give me the first 10 documents of 2017 from the stocks index.” This experiment showed that it is possible, with some limitations. 
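To make that concrete before diving in: for a sentence like the one above, the kind of DSL the model is expected to come back with would look something like the following. This is a hypothetical sketch against the stocks index described later in the post, not output captured from the experiment.

```python
# Hypothetical example of the DSL ChatGPT might produce for
# "Give me the first 10 documents of 2017 from the stocks index".
generated_query = {
    "size": 10,
    "query": {
        "range": {
            "date": {
                "gte": "2017-01-01",
                "lte": "2017-12-31",
            }
        }
    },
}
```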
In this post, we describe this experiment and the open source library that we published for this use case. Can ChatGPT generate Elasticsearch DSL? We start the experiment with some tests focusing on the ability of ChatGPT to generate Elasticsearch DSL queries. For this scope, you need to provide some context to ChatGPT about the structure of the data that you want to search. In Elasticsearch, data is stored in an index, which is similar to a \"table\" in a relational database. It has a mapping that defines multiple fields and their types. This means we need to provide the mapping information of the index that we want to query. By doing so, ChatGPT has the necessary context to translate the query into Elasticsearch DSL. Elasticsearch offers a get mapping API to retrieve the mapping of an index. In our experiment, we used a stocks index data set available here . This data set contains five years of stock prices of Fortune 500 companies, spanning from February 2013 to February 2018. Here are the first five lines of the CSV file containing the data set: Each line contains the date of the stock, the open value of the day, the high and the low values, the close value, the volume of the stocks exchanged, and finally the stock name — for example, American Airlines Group Inc. (AAL). The mapping associated with the stocks index is as follows: We can use the GET /stocks/_mapping API to retrieve the mapping from Elasticsearch. [Related article: ChatGPT and Elasticsearch: OpenAI meets private data ] Let's build a prompt to find out In order to translate a query expressed in human language to Elasticsearch DSL, we need to find the right prompt to give to ChatGPT. This is the most difficult part of the process: to actually program ChatGPT using the correct question format (in other words, the right prompt). After some iterations, we ended up with the following prompt that seems to work quite well: The values {mapping} and {query} in the prompt are two placeholders to be replaced with the mapping json string (for example, returned by the GET /stocks/_mapping in the previous example) and the query expressed in human language (for example: Return the first 10 documents of 2017). Of course, ChatGPT is limited and in some cases it won’t be able to answer a question. We found that, most of the time, this happens because the sentence used in the prompt is too general or ambiguous. To solve this situation, we need to enhance the prompt using more details. This process is called iteration, and it requires multiple steps to define the proper sentence to be used. If you want to try out how ChatGPT can translate a search sentence into an Elasticsearch DSL query (or even SQL), you can use dsltranslate.com . Putting it all together Using the ChatGPT API offered by OpenAI and the Elasticsearch API for mapping and search, we put it all together in an experimental library for PHP. This library exposes a search() function with the following API: Where $index is the index name to be used, $prompt is the query expressed in human language and $bool is an optional parameter for using a cache (enabled by default). The process of this function is reported in the following diagram: The inputs are index and prompt (on the left). The index is used to retrieve the mapping from Elasticsearch (using the get mapping API). The result is a mapping in JSON that is used to build the query string to send to ChatGPT using the following API code. We used the gpt-3.5-turbo model of OpenAI, which is able to translate text into code.
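The library wires those steps together in PHP; purely to illustrate the flow, here is a minimal Python sketch of the same idea: fetch the mapping, fill the {mapping} and {query} placeholders, ask the chat completions API for a DSL query, and run it. The prompt wording, helper name, and client calls are assumptions for this sketch, not the library's actual implementation.

```python
import json

from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("https://localhost:9200", api_key="...")
llm = OpenAI()  # reads OPENAI_API_KEY from the environment

def nl_to_dsl(index: str, question: str) -> dict:
    # 1. Retrieve the mapping so the model knows which fields and types exist.
    mapping = es.indices.get_mapping(index=index)[index]
    # 2. Fill the {mapping} and {query} placeholders of the prompt.
    prompt = (
        "Given this Elasticsearch mapping:\n"
        f"{json.dumps(mapping)}\n"
        "Return only a JSON Elasticsearch DSL query that answers:\n"
        f"{question}"
    )
    # 3. Ask the chat completions API to translate the question into DSL.
    resp = llm.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    # Assumes the model returned bare JSON with no surrounding prose.
    return json.loads(resp.choices[0].message.content)

dsl = nl_to_dsl("stocks", "Return the first 10 documents of 2017")
print(es.search(index="stocks", body=dsl))
```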
The result from ChatGPT contains an Elasticsearch DSL query that we use to query Elasticsearch. The result is then returned to the user. To query Elasticsearch, we utilized the official elastic%2Felasticsearch-php client. To optimize response time and reduce the cost of using the ChatGPT API, we used a simple caching system based on files. We used a cache to: Store the mapping JSON returned by Elasticsearch: We store this JSON in a file named after the index. This allows us to retrieve the mapping information without making additional calls to Elasticsearch. Store the Elasticsearch DSL generated by ChatGPT: To cache the generated Elasticsearch DSL, we named the cache file using the hash (MD5) of the prompt used. This approach enables us to reuse previously generated Elasticsearch DSL for the same query, eliminating the need to call the ChatGPT API again. We also added the possibility to retrieve the Elasticsearch DSL programmatically using the getLastQuery() function. Running the experiment with financial data We used Elastic Cloud to store the stocks value reported here . In particular, we used a simple bulk script to read the stocks file in CSV and send it to Elasticsearch using the bulk API . For more details on how to set up an Elastic Cloud and retrieve the API key, read the documentation . Once we stored the stocks index, we used a simple PHP script for testing some query expressed in English. The script we used is examples%2Ftest.php . To execute this examples%2Ftest.php script, we need to set three environment variables: OPENAI_API_KEY: the API key of OpenAI ELASTIC_CLOUD_ENDPOINT: the url of the Elasticsearch instance ELASTIC_CLOUD_API_KEY: the API key of Elastic Cloud Using the stocks mapping, we tested the following queries recording all the Elasticsearch DSL responses: As you can see, the results are pretty good. The last one about the difference between closed and open fields was quite impressive! All the requests have been translated in a valid Elasticsearch DSL query that is correct according to the question expressed in natural language. Use the language you speak! A very nice feature of ChatGPT is the ability to specify questions in different languages. That means you can use this library and specify the query in different natural languages, like Italian, Spanish, French, German, and so on. Here is an example: All the previous search have the same results producing the following Elasticsearch query (more or less): Important: ChatGPT is an LLM that has been optimized for English, which means the best results are obtained using queries entered in English. Limitations of LLMs Unfortunately, ChatGPT and LLMs in general are not capable of verifying the correctness of the answer from a semantic point of view. They give answers that look right from a statistical point of view. This means, we cannot test if the Elasticsearch DSL query generated by ChatGPT is the right translation of the query in natural language. Of course, this is a big limitation at the moment. In some other use cases, like mathematical operations, we can solve the correctness problem using an external plugin, like the Wolfram Plugin of ChatGPT . In this case, the result of ChatGPT uses the Wolfram engine that checks the correctness of the response, using a mathematical symbolic model. Apart from the correctness limitation, which implies we should always check ChatGPT’s answers, there are also limitations to the ability to translate a human sentence in an Elasticsearch DSL query. 
For instance, using the previous stocks data set if we ask something as follows: The DSL query generated by ChatGPT is not valid producing this Elasticsearch error: Failed to parse date field [2015-01-01] with format [yyyy]. If we rephrase the sentence using more specific information, removing the apparent ambiguity about the date format, we can retrieve the correct answer, as follows: Basically, the sentence must be expressed using a description of how the Elasticsearch DSL should be rather than a real human sentence. Wrapping up In this post, we presented an experimental use case of ChatGPT for translating natural language search sentences into Elasticsearch DSL queries. We developed a simple library in PHP for using the OpenAI API to translate the query under the hood, providing also a caching system. The results of the experiment are promising, even with the limitation on the correctness of the answer. That said, we will definitely investigate further the possibility to query Elasticsearch in natural language using ChatGPT, as well as other LLM models that are becoming more and more popular. Learn more about the possibilities with Elasticsearch and AI . In this blog post, we may have used third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Can ChatGPT generate Elasticsearch DSL? 
Let's build a prompt to find out Putting it all together Running the experiment with financial data Use the language you speak! Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to use Elasticsearch to prompt ChatGPT with natural language - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-prompt-chatgpt-natural-language", + "meta_description": "This blog post presents an experimental project for querying Elasticsearch in natural language using ChatGPT." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. Elastic Cloud Serverless YL By: Yaru Lin On December 2, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Elasticsearch Serverless is now generally available. We’ve re- architected Elastisearch as a fully managed service that autoscales with your data, usage, and performance needs. It has the power and flexibility of Elasticsearch without operational overhead. Since its technical preview this spring, we’ve introduced new capabilities to help developers build and manage applications faster. Whether you’re implementing semantic search, keyword search, or even image search, Elasticsearch Serverless simplifies the process, allowing you to focus on innovation instead of infrastructure. Designed to eliminate the complexity of managing resources, Elasticsearch Serverless makes it easier to run search, RAG, and AI-powered applications while maintaining the speed, relevance, and versatility Elasticsearch is known for. In this post, we’ll share how Elasticsearch Serverless simplifies building search applications with its modern architecture and developer-friendly features. Elasticsearch is the backbone of search experience Elasticsearch has long been the trusted engine for developers, data scientists, and full-stack engineers seeking high-performance, scalable search, and vector database capabilities. Its powerful relevance features and flexibility have made it the backbone for countless search-driven applications. Elasticsearch’s innovations in query speed and vector quantization have positioned it as a leading vector database, supporting scalable AI-driven use cases like semantic and hybrid search. Today, Elasticsearch continues to set the gold standard for search by combining: High speed and relevance for text search. Flexible query capabilities to tailor search workflows. 
Seamless handling of hybrid queries , combining vector and lexical search. An open-source core, rooted in Lucene , with continuous optimizations that push the boundaries of search technology. As search use cases evolve—incorporating hybrid search, AI and inference, and dynamic workloads—teams have more options than ever for scaling and managing infrastructure to meet their unique needs. These evolving demands present an exciting opportunity to rethink how we design for scale. Elasticsearch with serverless speed and simplicity Elasticsearch Serverless builds on Elasticsearch’s strengths to address the demands of modern workloads, characterized by large datasets, AI search, and unpredictable traffic. Elasticsearch Serverless meets these challenges head-on with a reimagined architecture purpose-built for today’s demands. Foundationally, Elasticsearch Serverless is built on a decoupled compute and storage model. This is an architectural change that removes the inefficiencies of repeated data transfers and leverages the reliability of object storage. From here, separating critical components enables independent scaling of indexing and search workloads, and resolves the long-standing challenges of balancing performance and cost-efficiency in high-demand scenarios. Decoupled compute and storage Elasticsearch Serverless uses object storage for reliable data storage and cost-efficient scalability. By eliminating the need for multiple replicas, indexing costs and data duplication are reduced. This approach ensures that storage is used only for what’s necessary, eliminating waste while maximizing efficiency. To maintain Elasticsearch’s speed, segment-level query parallelization optimizes data retrieval from object stores like S3, while advanced caching strategies ensure fast access to frequently used data. Dynamic autoscaling without compromise The decoupled architecture also enables smarter resource management by separating search and ingest workloads, allowing them to scale independently based on specific demands. This separation ensures that: Concurrent updates and searches no longer compete for resources. CPU cycles, memory, and I/O are allocated independently, ensuring consistent performance even during high-ingest operations. Ingest-heavy use cases benefit from isolated compute. Ensure fast and reliable search performance, even while indexing massive volumes of data. Vector search workflows thrive. Decoupling allows for compute-intensive indexing (like embedding generation) without impacting query speeds. Resources for ingest, search, and machine learning scale dynamically and independently to accommodate diverse workloads. There’s no need to overprovision for peak loads or worry about downtime during demand spikes. Read more about our dynamic and load-based ingest and search autoscaling. High-performance query execution Elasticsearch Serverless enhances query execution by building on Elasticsearch’s strengths as a vector database. Innovations in query performance and vector quantization ensure fast and efficient search experiences for modern use cases. 
Highlights include: Faster data retrieval via segment-level query parallelization, enabling multiple concurrent requests to fetch data from object storage and significantly reducing latency to ensure faster access even when data isn't cached locally Smarter caching through intelligent query results reuse and optimized data structures in Lucene that allow for caching only the utilized portion of indexes, Tailored Lucene index structures maximize performance for various data formats, ensuring that each data type is stored and retrieved in the most efficient manner. Advanced vector quantization significantly reduces the storage footprint and retrieval latency of high-dimensional data, making AI and vector search more scalable and cost-effective. This new architecture preserves Elasticsearch’s flexibility—supporting faceting, filtering, aggregations, and diverse data types—while simplifying operations and accelerating performance for modern search needs. For teams seeking a hands-off solution that adapts to changing needs, Elasticsearch Serverless offers all the power and versatility of Elasticsearch without the operational overhead. Whether you're a developer looking to integrate hybrid search, a data scientist working with high-cardinality datasets, or a full-stack engineer optimizing relevance with AI models, Elasticsearch Serverless empowers you to focus on delivering exceptional search experiences. Access to the newest search and AI features in Elasticsearch Serverless Elasticsearch Serverless is more than just a managed service—it’s a platform designed to accelerate development and optimize search experiences. It’s where you can access the latest search and generative AI features: Elastic AI Assistant : Quickly access documentation, guidance, and resources to simplify prototyping and implementation. ELSER Embedding Model : Enable semantic or hybrid search capabilities, opening new ways to query your data. Semantic Text Field Type: Generate vectors for text fields with ease. Better Binary Quantization ( BBQ ) : Optimize vector storage and memory usage without compromising accuracy or performance. Elastic Rerank and Reciprocal Rank Fusion (RRF) : Improve result relevance with simplified reranking and hybrid scoring capabilities. Playground and Developer Console : Experiment with new features, including Gen AI integrations, using a unified interface and API workflows. ES|QL, Elastic’s intuitive command language , fully compatible with Elasticsearch Serverless. Usage and Performance Transparency : Manage search speed and costs through the Cloud console with detailed performance insights. Get started with Elasticsearch Serverless Ready to start building? Elasticsearch Serverless is available now, and you can try it today with our free trial. Developers love Elasticsearch for its speed, relevance, and flexibility. With Elasticsearch Serverless, you’ll love it for its simplicity. Explore Elasticsearch Serverless today and experience search, reimagined. Learn about serverless pricing . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. 
FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Jump to Elasticsearch is the backbone of search experience Elasticsearch with serverless speed and simplicity Decoupled compute and storage Dynamic autoscaling without compromise High-performance query execution Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch Serverless is now generally available - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-serverless-now-ga", + "meta_description": "Elasticsearch Serverless is generally available. Learn how its architecture and features simplify building search applications." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Ingesting data from Snowflake to Elasticsearch using Meltano Ingest data from Snowflake to Elasticsearch with Meltano. Follow along to setup Meltano, connect to Snowflake & ingest data to Elasticsearch. How To DB By: Dmitrii Burlutskii On April 7, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
In the Search team At Elastic, we have been exploring different ETL tools and how we can leverage them to ship data into Elasticsearch and enable AI-powered search on the ingested data. Today, I’d like to share our story with the Meltano ecosystem and Meltano Elasticsearch loader . Meltano is a declarative code-first data integration engine that allows you to sync data between different storages. There are many extractors and loaders available at the hub.meltano.com . If you store your data in Snowflake and you want to build a search experience out-of-the-box for your customers, you might want to think about using Elasticsearch, where you can build a semantic search for your customers based on the data you have. Today, we will focus on syncing data from Snowflake to Elasticsearch. Requirements Snowflake credentials You will have received all the below credentials after signup , or you can get them from the Snowflake panel. Account username Account password Account Identifier (see here for instructions on how to get it) Snowflake dataset If you create a new Snowflake account you will have sample data to experiment with. However, I will be using one of the public air quality datasets that contains Nitrogen Dioxide (NO2) measurements. Elastic credentials Visit https://cloud.elastic.co and sign up. Click on Create deployment . In the pop-up, you can change or keep the default settings. Once you’re ready for deployment, click on Continue (or click on Open Kibana ). It will redirect you to the Kibana dashboard. Go to Stack Management -> Security -> API keys and generate a new API key. Steps to ingest data from Snowflake to Elasticsearch using Meltano 1. Install Meltano In my example, I will be using the Meltano Python package but you can also install it as a Docker container. Add the snowflake extractor Verify the extractor Add Elasticsearch loader 2. Configure the extractor and the loader There are multiple ways to configure Meltano extractors and loaders: Edit meltano.yml Using CLI commands like Using CLI interactive mode I will be using the interactive mode. To configure the Snowflake extractor run the following command and provide at least the Account Identifier, Username, Password, and Database. You should see the following screen where you can choose an option to configure. After you configured the extract you can test the connection. Simply run the following command: Configure the Elasticsearch loader and provide Host, Port, Schema, and the API key, If you want to change the index name you can run this command and change it: ie. the default index string defined as ecs-{{ stream_name }}-{{ current_timestamp_daily}} that results in ecs-animals-2022-12-25 where the stream name was animals. When everything is configured we can start syncing data. Once the sync starts you can go to Kibana and see that a new index is created and there are some indexed documents. You can view the documents by clicking on the index name. You should see your documents. 3. Use your index settings (or mapping) If we start syncing data, the loader will automatically create a new index with dynamic mapping, which means Elasticsearch will take care of the fields and their types in the index. We can change this behavior if we want to by creating an index in advance and applying the settings we need. Let’s try. Navigate to the Kibana -> DevTools and run the following commands: 3.1 Create a new pipeline This will drop all the documents with datavalue < 10 . 
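The post runs these steps in Kibana Dev Tools; as a rough equivalent, the pipeline from step 3.1 could also be created with the Python client along the following lines. The pipeline id is made up for this sketch, and the field name and threshold simply mirror the description above.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://your-deployment:9243", api_key="...")

# Step 3.1: an ingest pipeline that drops documents whose datavalue is below 10.
client.ingest.put_pipeline(
    id="drop-low-datavalue",
    description="Drop air-quality documents with datavalue < 10",
    processors=[
        {
            "drop": {
                "if": "ctx.datavalue != null && ctx.datavalue < 10"
            }
        }
    ],
)
```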
3.2 Create a new index 3.3 Apply index settings 3.4 Change the index name in Meltano 4. Start a sync job When the job is done you can see that the index has fewer documents than the one we created before Conclusion We have successfully synced the data from Snowflake to Elastic Cloud. We let Meltano create a new index for us and take care of the index mapping and we synced data to the existing index with a predefined pipeline. I would like to highlight some key points I wrote down during my journey: Elasticsearch loader ( page on Meltano Hub ) It is not ready to process a big chunk of data. You need to adjust the default Elasticsearch configuration to make it more resilient. I’ve submitted a Pull Request to expose “request_timeout” and “retry_on_timeout” options that will help. It uses the 8.x branch of Elasticsearch Python client so you can make sure it supports the latest Elasticsearch features. It sends data synchronously (doesn’t use Python AsyncIO) so might be quite slow when you need to transfer a huge data volume. Meltano CLI It is just awesome. You don’t need a UI so everything can be configured in the terminal which gives engineers a lot of options for automation. You can simply run on-demand sync with one command. No other running services are required. Replication/Incremental sync If your pipeline requires data replication or incremental syncs, you can visit this page to read more. Also, I would like to mention that Meltano Hub is amazing. It is easy to navigate and find what you need. Also, you can easily compare different loaders or extractors by just looking at how many customers use them. Find more information in the following blog posts if you’re interested in building AI-based apps: Full text and semantic search capabilities on your data set. Connect your data with LLMs to build Question - Answer . Build a Chatbot that uses a pattern known as Retrieval-Augmented Generation (RAG). Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Requirements Snowflake credentials Snowflake dataset Elastic credentials Steps to ingest data from Snowflake to Elasticsearch using Meltano Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Ingesting data from Snowflake to Elasticsearch using Meltano - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/data-ingestion-from-snowflake-to-elasticsearch-using-meltano", + "meta_description": "Ingest data from Snowflake to Elasticsearch with Meltano. Follow along to setup Meltano, connect to Snowflake & ingest data to Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog ChatGPT and Elasticsearch: OpenAI meets private data Integrate Elasticsearch's search relevance with ChatGPT's question-answering capability to enhance your domain-specific knowledge base. Generative AI Python JV By: Jeff Vestal On June 21, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. NOTE: This blog has been revisited with an update incorporating new features Elastic has released since this was first published. Please check out the new blog here! Combine Elasticsearch's search relevance with OpenAI's ChatGPT's question-answering capabilities to query your data. In this blog, you'll learn how to connect ChatGPT to proprietary data stores using Elasticsearch and build question/answer capabilities for your data. What is ChatGPT? In recent months, there has been a surge of excitement around ChatGPT, a groundbreaking AI model created by OpenAI. But what exactly is ChatGPT? Based on the powerful GPT architecture, ChatGPT is designed to understand and generate human-like responses to text inputs. GPT stands for \"Generative Pre-trained Transformer.” The Transformer is a cutting-edge model architecture that has revolutionized the field of natural language processing (NLP). These models are pre-trained on vast amounts of data and are capable of understanding context, generating relevant responses, and even carrying on a conversation. To learn more about the history of transformer models and some NLP basics in the Elastic Stack, be sure to check out the great talk by Elastic ML Engineer Josh Devins . The primary goal of ChatGPT is to facilitate meaningful and engaging interactions between humans and machines. By leveraging the recent advancements in NLP, ChatGPT models can provide a wide range of applications, from chatbots and virtual assistants to content generation, code completion, and much more. These AI-powered tools have rapidly become an invaluable resource in countless industries, helping businesses streamline their processes and enhance their services. Limitations of ChatGPT & how to minimize them Despite the incredible potential of ChatGPT, there are certain limitations that users should be aware of. One notable constraint is the knowledge cutoff date. 
Currently, ChatGPT is trained on data up to September 2021, meaning it is unaware of events, developments, or changes that have occurred since then. Consequently, users should keep this limitation in mind while relying on ChatGPT for up-to-date information. This can lead to outdated or incorrect responses when discussing rapidly changing areas of knowledge such as software enhancements and capabilities or even world events. ChatGPT, while an impressive AI language model, can occasionally hallucinate in its responses, often exacerbated when it lacks access to relevant information. This overconfidence can result in incorrect answers or misleading information being provided to users. It is important to be aware of this limitation and approach the responses generated by ChatGPT with a degree of skepticism, cross-checking and verifying the information when necessary to ensure accuracy and reliability. Another limitation of ChatGPT is its lack of knowledge about domain-specific content. While it can generate coherent and contextually relevant responses based on the information it has been trained on, it is unable to access domain-specific data or provide personalized answers that depend on a user's unique knowledge base. For instance, it may not be able to provide insights into an organization’s proprietary software or internal documentation. Users should, therefore, exercise caution when seeking advice or answers on such topics from ChatGPT directly. One way to minimize these limitations is by providing ChatGPT access to specific documents relevant to your domain and questions, and enabling ChatGPT’s language understanding capabilities to generate tailored responses. This can be accomplished by connecting ChatGPT to a search engine like Elasticsearch. Elasticsearch — you know, for search! Elasticsearch is a scalable data store and vector database designed to deliver relevant document retrieval, ensuring that users can access the information they need quickly and accurately. Elasticsearch’s primary focus is on delivering the most relevant results to users, streamlining the search process, and enhancing user experience. Elasticsearch boasts a myriad of features to ensure top-notch search performance, including support for traditional keyword and text-based search ( BM25 ) and an AI-ready vector search with exact match and approximate kNN ( k-Nearest Neighbor ) search capabilities. These advanced features allow Elasticsearch to retrieve results that are not only relevant but also for queries that have been expressed using natural language. By leveraging traditional, vector, or hybrid search (BM25 + kNN), Elasticsearch can deliver results with unparalleled precision, helping users find the information they need with ease. One of the key strengths of Elasticsearch is its robust API, which enables seamless integration with other services to extend and enhance its capabilities. By integrating Elasticsearch with various third-party tools and platforms, users can create powerful and customized search solutions tailored to their specific requirements. This flexibility and extensibility makes Elasticsearch an ideal choice for businesses looking to improve their search capabilities and stay ahead in the competitive digital landscape. By working in tandem with advanced AI models like ChatGPT, Elasticsearch can provide the most relevant documents for ChatGPT to use in its response. 
This synergy between Elasticsearch and ChatGPT ensures that users receive factual, contextually relevant, and up-to-date answers to their queries. In essence, the combination of Elasticsearch's retrieval prowess and ChatGPT's natural language understanding capabilities offers an unparalleled user experience, setting a new standard for information retrieval and AI-powered assistance. How to use ChatGPT with Elasticsearch Python interface accepts user questions. Generate a hybrid search request for Elasticsearch BM25 match on the title field kNN search on the title-vector field Boost kNN search results to align scores Set size=1 to return only the top scored document Search request is sent to Elasticsearch. Documentation body and original url are returned to python. API call is made to OpenAI ChatCompletion. Prompt: \"answer this question using only this document \" Generated response is returned to python. Python adds on original documentation source url to generated response and prints it to the screen for the user. The ElasticDoc ChatGPT process utilizes a Python interface to accept user questions and generate a hybrid search request for Elasticsearch, combining BM25 and kNN search approaches to find the most relevant document from the Elasticsearch Docs site, now indexed in Elasticsearch. However, you do not have to use hybrid search or even vector search. Elasticsearch provides the flexibility to use whichever search pattern best fits your needs and provides the most relevant results for your specific data sets. After retrieving the top result, the program crafts a prompt for OpenAI's ChatCompletion API, instructing it to answer the user's question using only the information from the selected document. This prompt is key to ensuring the ChatGPT model only uses information from the official documentation, lessening the chance of hallucinations. Finally, the program presents the API-generated response and a link to the source documentation to the user, offering a seamless and user-friendly experience that integrates front-end interaction, Elasticsearch querying, and OpenAI API usage for efficient question-answering. Note that while we are only returning the top-scored document for simplicity, the best practice would be to return multiple documents to provide more context to ChatGPT. The correct answer could be found in more than one documentation page, or if we were generating vectors for the full body text, those larger bodies of text may need to be chunked up and stored across multiple Elasticsearch documents. By leveraging Elasticsearch's ability to search across numerous vector fields in tandem with traditional search methods, you can significantly enhance your top document recall. Technical setup The technical requirements are fairly minimal, but it takes some steps to put all the pieces together. For this example, we will configure the Elasticsearch web crawler to ingest the Elastic documentation and generate vectors for the title on ingest. You can follow along to replicate this setup or use your own data. To follow along we will need: Elasticsearch cluster Eland Python library OpenAI API account Somewhere to run our python frontend and api backend Elastic Cloud setup The steps in this section assume you don’t currently have an Elasticsearch cluster running in Elastic Cloud. If you do you, can skip to the next section. Sign up If you don’t already have an Elasticsearch cluster, you can sign up for a free trial with Elastic Cloud . 
Create deployment After you sign up, you will be prompted to create your first deployment. Create a name for your deployment. You can accept the default cloud provider and region or click Edit Settings and choose another location. Click Create deployment. Shortly a new deployment will be provisioned for you and you will be logged in to Kibana. Back to the Cloud We need to do a couple of things back in the Cloud Console before we move on: Click on the Navigation Icon in the upper left and select Manage this deployment. Add a machine learning node. Back in the Cloud Console, click on Edit under your Deployment’s name in the left navigation bar. Scroll down to the Machine Learning instances box and click +Add Capacity. Under Size per zone, click and select 2 GB RAM. Scroll down and click on Save. In the pop-up summarizing the architecture changes, click Confirm. In a few moments, your deployment will now have the ability to run machine learning models! Reset Elasticsearch Deployment User and password: Click on Security on the left navigation under your deployment’s name. Click on Reset Password and confirm with Reset. (Note: as this is a new cluster nothing should be using this Elastic password.) Download the newly created password for the “elastic” user. (We will use this to load our model from Hugging Face and in our python program.) Copy the Elasticsearch Deployment Cloud ID. Click on your Deployment name to go to the overview page. On the right-hand side click the copy icon to copy your Cloud ID. (Save this for use later to connect to the Deployment.) Eland We next need to load an embedding model into Elasticsearch to generate vectors for our blog titles and later for our user’s search questions. We will be using the all-distilroberta-v1 model trained by SentenceTransformers and hosted on the Hugging Face model hub. This particular model isn’t required for this setup to work. It is good for general use as it was trained on very large data sets covering a wide range of topics. However, with vector search use cases, using a model fine-tuned to your particular data set will usually provide the best relevancy. To do this, we will use the Eland python library created by Elastic. The library provides a wide range of data science functions, but we will be using it as a bridge to load the model into Elasticsearch from the Hugging Face model hub so it can be deployed on machine learning nodes for inference use. Eland can either be run as part of a python script or on the command line. The repo also provides a Docker container for users looking to go that route. Today we will run Eland in a small python notebook , which can run in Google’s Colab in the web browser for free. Open the program link and click the “Open in Colab” button at the top to launch the notebook in colab. Set the variable hf_model_id to the model name. This model is set already in the example code but if you want to use a different model or just for future information: hf_model_id='sentence-transformers/all-distilroberta-v1' Copy model name from Hugging Face. The easiest way to do this is to click the copy icon to the right of the model name. Run the cloud auth section, and you will be prompted to enter: Cloud ID (you can find this in the Elastic Cloud Console) Elasticsearch Username (easiest will be to use the “Elastic” user created when the deployment was created) Elasticsearch User Password Run the remaining steps. This will download the model from Hugging face, chunk it up, and load it into Elasticsearch. 
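If you prefer to script the last part, here is a sketch that checks the model landed in Elasticsearch and then starts it on the machine learning node (the step described next). The model ID assumes Eland's usual naming convention of replacing slashes with double underscores; adjust it if yours differs.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", basic_auth=("elastic", "<password>"))

model_id = "sentence-transformers__all-distilroberta-v1"  # assumed Eland naming

# Confirm the model was imported by Eland.
print(es.ml.get_trained_models(model_id=model_id)["count"])

# Start (deploy) the model on the machine learning node so it can serve inference.
es.ml.start_trained_model_deployment(model_id=model_id)
```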
Deploy (start) the model onto the machine learning node. Elasticsearch index and web crawler Next up we will create a new Elasticsearch index to store our Elastic Documentation, configure the web crawler to automatically crawl and index those docs, as well as use an ingest pipeline to generate vectors for the doc titles. Note that you can use your proprietary data for this step, to create a question/answer experience tailored to your domain. Open Kibana from the Cloud Console if you don’t already have it open. In Kibana, Navigate to Enterprise Search -> Overview. Click Create an Elasticsearch Index. Using the Web Crawler as the ingestion method, enter elastic-docs as the index name. Then, click Create Index. Click on the “Pipelines” tab. Click Copy and customize in the Ingest Pipeline Box. Click Add Inference Pipeline in the Machine Learning Inference Pipelines box. Enter the name elastic-docs_title-vector for the New pipeline. Select the trained ML model you loaded in the Eland step above. Select title as the Source field. Click Continue, then click Continue again at the Test stage. Click Create Pipeline at the Review stage. Update mapping for dense_vector field. (Note: with Elasticsearch version 8.8+, this step should be automatic.) In the navigation menu, click on Dev Tools. You may have to click Dismiss on the flyout with documentation if this is your first time opening Dev Tools. In Dev Tools in the Console tab, update the mapping for our dense vector target field with the following code. You simply paste it in the code box and click the little arrow to the right of line 1. You should see the following response on the right half of the screen: This will allow us to run kNN search on the title field vectors later on. Configure web crawler to crawl Elastic Docs site: Click on the navigation menu one more time and click on Enterprise Search -> Overview. Under Content, click on Indices. Click on search-elastic-docs under Available indices. Click on the Manage Domains tab. Click “Add domain.” Enter https://www.elastic.co/guide/en , then click Validate Domain. After the checks run, click Add domain. Then click Crawl rules. Add the following crawl rules one at a time. Start with the bottom and work up. Rules are evaluated according to first match. Disallow Contains release-notes Allow Regex /guide/en/.*/current/.* Disallow Regex .* With all the rules in place, click Crawl at the top of the page. Then, click Crawl all domains on this index. Elasticsearch’s web crawler will now start crawling the documentation site, generating vectors for the title field, and indexing the documents and vectors. The first crawl will take some time to complete. In the meantime, we can set up the OpenAI API credentials and the Python backend. Connecting with OpenAI API To send documents and questions to ChatGPT, we need an OpenAI API account and key. If you don’t already have an account, you can create a free account and you will be given an initial amount of free credits. Go to https://platform.openai.com and click on Signup. You can go through the process to use an email address and password or login with Google or Microsoft. Once your account is created, you will need to create an API key: Click on API Keys . Click Create new secret key. Copy the new key and save it someplace safe as you won’t be able to view the key again. Python backend setup Clone or download the python program Github Link to code Install required python libraries. 
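A quick aside before moving on to the Python backend: the Dev Tools mapping update from the index setup above can also be applied with the Python client. The index name, field name, and the 768 dimensions for all-distilroberta-v1 follow this walkthrough, but double-check them against your own deployment.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", basic_auth=("elastic", "<password>"))

# Map the inference target field as an indexed dense_vector so kNN search can
# run against the title embeddings (on 8.8+ this mapping is created for you).
es.indices.put_mapping(
    index="search-elastic-docs",
    properties={
        "title-vector": {
            "type": "dense_vector",
            "dims": 768,               # all-distilroberta-v1 output size
            "index": True,
            "similarity": "dot_product",
        }
    },
)
```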
We are running the example program in Replit, which has isolated environments. If you are running this on a laptop or VM, best practice is to set up a virtual ENV for python . Run pip install -r requirements.txt Set authentication and connection environment variables (e.g., if running on the command line: export openai_api=”123456abcdefg789”) openai_api - OpenAI API Key cloud_id - Elastic Cloud Deployment ID cloud_user - Elasticsearch Cluster User cloud_pass - Elasticsearch User Password Run the streamlit program. More info about streamlit can be found in its docs . Streamlit has its own command to start: streamlit run elasticdocs_gpt.py This will start a web browser and the url will be printed to the command line. Sample chat responses With everything ingested and the front end up and running, you can start asking questions about the Elastic Documentations. Asking “Show me the API call for an inference processor” now returns an example API call and some information about the configuration settings. Asking for steps to add a new integration to Elastic Agent will return: As mentioned earlier, one of the risks of allowing ChatGPT to answer questions based purely on data it has been trained on is its tendency to hallucinate incorrect answers. One of the goals of this project is to provide ChatGPT with the data containing the correct information and let it craft an answer. So what happens when we give ChatGPT a document that does not contain the correct information? Say, asking it to tell you how to build a boat (which isn’t currently covered by Elastic’s documentation): When ChatGPT is unable to find an answer to the question in the document we provided, it falls back on our prompt instruction simply telling the user it is unable to answer the question. Elasticsearch’s robust retrieval + the power of ChatGPT In this example, we've demonstrated how integrating Elasticsearch's robust search retrieval capabilities with cutting-edge advancements in AI-generated responses from GPT models can elevate the user experience to a whole new level. The individual components can be tailored to suit your specific requirements and adjusted to provide the best results. While we used the Elastic web crawler to ingest public data, you're not limited to this approach. Feel free to experiment with alternative embedding models, especially those fine-tuned for your domain-specific data. You can try all of the capabilities discussed in this blog today! To build your own ElasticDocs GPT experience, sign up for an Elastic trial account , and then look at this sample code repo to get started. If you would like ideas to experiment with search relevance, here are two to try out: [BLOG] Deploy NLP text embeddings and vector search using Elasticsearch [BLOG] Implement image similarity search with Elastic In this blog post, we may have used third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. 
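For reference, a minimal sketch of how the backend can pick up the environment variables listed above and connect to the Elastic Cloud deployment. The variable names follow the walkthrough; everything else is illustrative.

```python
import os

import openai
from elasticsearch import Elasticsearch

openai.api_key = os.environ["openai_api"]

es = Elasticsearch(
    cloud_id=os.environ["cloud_id"],
    basic_auth=(os.environ["cloud_user"], os.environ["cloud_pass"]),
)
print(es.info()["cluster_name"])  # quick sanity check that the connection works
```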
Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Jump to What is ChatGPT? Limitations of ChatGPT & how to minimize them Elasticsearch — you know, for search! How to use ChatGPT with Elasticsearch Technical setup Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "ChatGPT and Elasticsearch: OpenAI meets private data - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/chatgpt-elasticsearch-openai-meets-private-data", + "meta_description": "Integrate Elasticsearch's search relevance with ChatGPT's question-answering capability to enhance your domain-specific knowledge base." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch retrievers architecture and use-cases Elasticsearch retrievers have gone through a significant revamp and are now generally available for all to use. Learn all about their c and use-cases. Search Relevance PB By: Panagiotis Bailis On November 14, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this blog post we'll take another deep dive with retrievers. We've already talked about them in previous blogs from their very introduction to semantic reranking using retrievers . 
Now, we're happy to announce that retrievers are becoming generally available with Elasticsearch 8.16.0, and in this blog post we'll take a technical tour on how we implemented them, as well as we'll get the chance to discuss the newly available capabilities! Elasticsearch retriever The main concept of a retriever remains the same as with their initial release; retrievers is a framework that provides the basic building blocks that can be stacked hierarchically to build multi-stage complex retrieval and ranking pipelines. E.g. of a simple standard retriever, which just bring backs all documents: Pretty straightforward, right? In addition to the standard retriever, which is essentially just a wrapper around the standard query search API element, we also support the following types: knn - return the top documents from a kNN (k Nearest Neighbor) search rrf - combine results from different retrievers based on the RRF (Reciprocal Rank Fusion) ranking formula text_similarity_reranker - rerank the top results of a nested retriever using a rerank type inference endpoint More detailed information along with the specific parameters for each retriever can also be found in the Elasticsearch documentation . Let's briefly go through some of the technical details first, which will help us understand the architecture and what has changed and why all these previous limitations have now been lifted! Technical drill down of retrievers One of the most important (and requested) things that we wanted to address was the ability to use any retriever, at any nesting level. Whether this means having 2 or more text_similarity_reranker stacked together, or an rrf retriever operating on top of another rrf along with a text_similarity_reranker , or any combination and nesting you can think of, we wanted to make sure that this would be something one could express with retrievers! To account for this, we have introduced some significant changes to the retriever execution plan. Up until now, retrievers were evaluated as part of the standard search execution flow, where (in a simplified scenario for illustration purposes) we reach out to the shards twice: once for querying the shards and bringing back from + size documents from each shard, and once for fetching all field data and perform any additional operations (e.g. highlighting) for the true top [from, from+size] results. This is a nice linear execution flow that is (relatively) easy to follow, but introduces some significant limitations if we want to execute multiple queries, operate on different results sets, etc. In order to work around this, we have moved to an eager evaluation of all sub-retrievers of a retriever pipeline at the very early stages of query execution. This means that, if needed, we are recursively rewriting any retriever query to a simpler form, the specifics of which depend on the retriever type. For non-compound retrievers we rewrite similar to how we do in a standard query, as they could still follow the linear execution plan. For compound retrievers, i.e. for retrievers that operate on top of other retriever(s), we flatten them to a single rank_window_size result set, which is essentially a tuple list that represents the top ranked documents for this retriever. 
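Before working through the complex request below, here is a rough sketch of the two simplest building blocks mentioned earlier. The original post includes its own request examples, which were stripped from this extract, so the index, field, and vector values here are placeholders, and this assumes a Python client recent enough to accept the retriever parameter.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# A standard retriever is just a wrapper around a regular query:
# this one brings back all documents.
es.search(index="my-index", retriever={"standard": {"query": {"match_all": {}}}})

# A knn retriever returns the top documents from a kNN search.
es.search(
    index="my-index",
    retriever={
        "knn": {
            "field": "vector",
            "query_vector": [0.1, 0.2, 0.3],  # placeholder query vector
            "k": 10,
            "num_candidates": 100,
        }
    },
)
```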
Let's see what this actually looks like, by working through the following (rather complex) retriever request: The rrf retriever above is a compound one, as it operates on the results of some other retrievers, so we'll try to rewrite it to a simpler, flattened, list of tuples, where each tuple specifies a document and the shard that it was found on. This rewrite will also enforce a strict ranking, so no different sort options are currently supported. Let's proceed now to identify all components and describe the process of how this will be evaluated: [1] top level rrf retriever; this is the parent of all sub-retrievers which will be rewritten and evaluated last, as we'd first need to know the top 10 (based on rank_window_size ) results from each of its sub-retrievers. [2] This knn retriever is the first child of the top level rrf retriever and uses an embedding service ( my-text-embedding-model ) to compute the actual query vector that will be used. This will be rewritten as the usual knn query by making an async request to the embedding service to compute the vector for the given model_text . [3] A standard retriever that is also part of the top-level's rrf retriever's children, which returns all documents matching topic: science query. [4] Last child of the top-level rrf retriever which is also an rrf retrievers that needs to be flattened. [5] [6] similar to [2] and [3], these are retrievers that are direct children of an rrf retriever, for which we will fetch the top 100 results (based on the rrf retriever's rank_window_size [4]) for each one, combine them using the rrf formula, and then rewrite to a flattened list of the true top 100 results. The updated execution flow for retrievers is now as follows: We'll start by rewriting all leaves that we can. This means that we'll rewrite the knn retrievers [2] and [6] to compute the query vector, and once we have that we can move up one level in the tree. At the next rewrite step, we are now ready to evaluate the nested rrf retriever [4], which we will eventually rewrite to a flattened RankDocsQuery query (i.e. a list of tuples). Finally, all inner rewritten steps for the top-level rrf retriever [1] will have taken place, so we should be ready to combine and rank the true top 10 results as requested. Even this top-level rrf retriever will rewrite itself to a flattened RankDocsQuery which will be later used to proceed with the standard linear search execution flow. Visualizing all the above, we have: Looking at the example above, we can see how a hierarchical retriever tree is asynchronously rewritten to just a simple RankDocsQuery . This simplification gives us the nice (and desired!) side effect of eventually executing a normal request with explicit ranking, and in addition to that we can also perform any complementary operations we choose. Playing with the (golden) retrievers! As we briefly mentioned above, with the rework in place, we can now support a plethora of additional search features! In this section we'll go through some examples and use-cases, but more can also be found in the documentation . We'll start with the most coveted one which is composability, i.e. the option to have any retriever at any level of the retriever tree. Composability In the following example, we want to perform a semantic query (using an embedding service like ELSER ), and then merge those results along with a knn query, using rrf . Finally, we'd want to rerank those using the text_similarity_reranker retriever using a reranker. 
The retriever to express the above would look like this: Aggregations Recall that with the rework we discussed, we rewrite a compound retriever to just a RankDocsQuery (i.e. a flattened explicitly ranked result list). This however does not block us from computing aggregations, as we also keep track of the source queries that were part of a compound retriever. This means that we can fallback to the nested standard retrievers below, to properly compute aggregations for the topic field, based on the union of the results of the two nested retrievers. So in the example above, we'll compute a term aggregation for the topic field, where either the year field is greater than 2023, or the document has the topic elastic associated with it. Collapsing In addition to the aggregation option we discussed above, we can now also collapse results, as we'd do with a standard query request. In the following example, we compute the top 10 results of the rrf retriever, and then collapse them under the year field. The main difference with standard searches is that here we're collapsing just the top rank_window_size results, and not the ones within the nested retrievers. Pagination As is also specified in the docs compound retrievers also support pagination. There is a significant difference with standard queries where, similarly to collapse above, the rank_window_size parameter is the whole result set upon which we can perform navigation. This means that if from + size > rank_window_size then we would bring no results back (but we'd still return aggregations). In the example above, we would compute the top 10 results (as defined in rrf's rank_window_size ) from the combination of the two nested retrievers ( standard and knn ) and then we'd perform pagination by consulting the from and size parameters. So, in this case, we'd skip the first 2 results ( from ) and pick the next 2 ( size ). Consider now a different scenario, where, in the same query above, we would instead have from: 10 and size: 2 . Given that rank_window_size is 10, and that these would be all the results that we can paginate upon, requesting to get 2 results after skipping the first 10 would fall outside of the navigatable result set, so we'd get back empty results. Additional examples and a more detailed break-down can also be found in the documentation for the rrf retriever . Explain We know that with great power comes great responsibility. Given that we can now combine retrievers in arbitrary ways, it could be rather difficult to understand why a result was eventually returned first, and how to optimize our retrieval strategy. For this very specific reason, we have worked to ensure that the explain output of a retriever request (i.e. by specifying explain: true ) will convey all necessary information from all sub-retrievers, so that we can have a proper understanding of all the factors that contributed to the final ranking of a result. Taking the rather complex query in the Collapsing section, the explain for the first result looks like this: Still a bit verbose, but it conveys all necessary information on why a document is at a specific position. For the top-level rrf retriever, we have 2 details specified, one for each of its nested retrievers. The first one is a text_similarity_reranker retriever, where we can see on explain the weight for the rerank operation, and the second one is a knn query informing us of the doc's computed similarity with the query vector. 
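To put the composability example from above into a concrete shape: the exact request appears in the original post, so everything below (index, field names, inference endpoints, window sizes) is a placeholder. It also passes explain=True to request the per-retriever scoring breakdown just discussed.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.search(
    index="my-index",
    explain=True,  # per-retriever scoring breakdown, as described above
    retriever={
        "text_similarity_reranker": {
            "retriever": {
                "rrf": {
                    "retrievers": [
                        # Semantic retrieval, e.g. an ELSER-backed semantic_text field.
                        {"standard": {"query": {"semantic": {
                            "field": "semantic_field",
                            "query": "what is vector search?",
                        }}}},
                        # Dense kNN retrieval with a precomputed (placeholder) vector.
                        {"knn": {"field": "vector", "query_vector": [0.1, 0.2, 0.3],
                                 "k": 10, "num_candidates": 100}},
                    ],
                    "rank_window_size": 50,
                }
            },
            "field": "text",
            "inference_id": "my-rerank-endpoint",  # placeholder rerank endpoint
            "inference_text": "what is vector search?",
            "rank_window_size": 10,
        }
    },
)
```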
It might take a bit to familiarize with, but each retriever ensures to output all the information you might need to evaluate and optimize your search scenario! Conclusion That's all for now! We hope you stayed with us until now and you enjoyed this topic! We're really excited with the release of the retriever framework and all the new use-cases that we can now support! Retrievers were built in order to support from very simple searches, to advanced RAG and hybrid search scenarios! As mentioned above, watch this space and more will be available soon! Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Elasticsearch retriever Technical drill down of retrievers Playing with the (golden) retrievers! Composability Aggregations Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch retrievers architecture and use-cases - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-retrievers-ga-8.16.0", + "meta_description": "Elasticsearch retrievers are GA with 8.16.0. Learn all about their c, use-cases and how to implement them, including the rrf retriever." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elastic Jira connector tutorial part I Reviewing a use case for the Elastic Jira connector. We'll be indexing our Jira content into Elasticsearch to create a unified data source and do search with Document Level Security. 
Integrations Ingestion How To GL By: Gustavo Llermaly On January 15, 2025 Part of Series Jira connector tutorials Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we'll review a use case for the Elastic Jira native connector . We'll use a mock project where a bank is developing a money transfer app and needs to integrate the information in Jira into Elastic. The native connector allows us to get into our Elastic cluster information from tickets, tasks, and other documents, centralizing data and enabling advanced search features. The main benefits of using this connector are: Data from Jira is synchronized with Elasticsearch. Access to advanced search features. Document Level Security (DLS) matching source security. You can only search for what you're allowed to see in Jira. Steps Configuring Jira connector Indexing documents into Elasticsearch Querying data Document Level Security (DLS) Configuring Jira connector You'll first need to get an API token from Jira to connect to Elasticsearch. Go to this link to learn how to create it. Name it \"elastic-connector.\" It should look like this: Get the token and to your Kibana dashboard. Then, go to native connectors and select New Jira Cloud connector. Replace YOUR_KIBANA_URL with Kibana endpoint . Name the connector “bank” and click “Create and attach an index named bank” to create a new index with the same name. Done! Now we need to configure our Jira data. We'll keep \"Enable SSL\" off since we won't be using our own SSL certificates. You can see the details of each field in the official documentation . Activate Document Level Security (DLS) so you get your documents with the users and groups authorized to see them. Once the connector is correctly configured, you can continue to synchronize data as you can see below. It might take a couple of minutes to get the data from Jira. Full Content: indexes all Jira documents. Incremental Content: only indexes changes from the last Full Content Sync. Access Control: indexes Jira users in the security index to activate DLS. We can check the connector's Overview to see if the sync was successful. In the Documents tab, we can see exactly what data we got with the connector. The objects from this first sync are: Projects Issues Attachments Indexing documents into Elasticsearch We are not limited to searching across the connector documents. Elasticsearch allows you to search on many indices with a single query. For our example, we'll index additional documents into the galactic_documents index to see how search works with more than one datasource: Compliance Manual of the GBFF User Guide for the Galactic Banking App Technical Specifications Report But before indexing, we'll create optimized mappings for each field: With the mappings configured, we can now index: Querying data Now that we have both Jira objects and documents, we can search for them together. Querying \"galactic moon\" will get us both Jira objects and the documents we indexed: If a document is too long, you can add the option _source to the query to only include the fields that you need. If you just want to remove some fields, we'll cover that option in the second part of this series. 
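As a quick sketch of the multi-index search and the _source option mentioned above: the index names ("bank" for the connector index and "galactic_documents") follow this walkthrough, while the query text and the returned field are illustrative.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# One query across both the Jira connector index and the manually indexed docs.
resp = es.search(
    index="bank,galactic_documents",
    query={"simple_query_string": {"query": "galactic moon"}},
    source=["title"],   # only return the fields you need (field name is illustrative)
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_index"], hit["_score"], hit["_source"])
```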
Document Level Security (DLS) We will now configure Document Level Security (DLS) to match Jira permissions to the ones in Elasticsearch so that when users search, they can only see what they are allowed to see in Jira. To begin, we'll go to the connector's Control Panel in Elastic Cloud and click on Access Control Sync. This sync will bring the access and permission info from the Jira users. To test this, I've made another Jira board to which the user \"Gustavo\" does not have access. Note: Do not forget to run content sync after creating the board.You can run one time syncs , or schedule based . Let's begin checking that the documents from the new board are there: We can effectively see the issues: However, since the user \"Gustavo\" does not have access, he should not be able to see them. Let's look for the user's document in the ACL filter index to see their permissions. Response: This index includes the user id and all of their Jira groups. We need to make a match between the content in the user's access control and the field _allowed_access_control in each document. We'll create an API Key for Gustavo using the command below. You must copy the query.template value from the previous step: Note that we're only giving access to the indices in this article through this option. The response for the creation of the API Key for Gustavo is this: You can use curl to test that we can run searches using the API KEY and it won't bring info from the Marketing board, since Gustavo does not have access to it. Response: We can see that Gustavo did not get any info since he did not have access. Now, let's test with the documents from the board that he is allowed to see: Response: Conclusion As you can see, integrating Elasticsearch with Jira has many benefits, like being able to get a unified search on all the projects you're working on as well as being able to run more advanced searches in more than one data source. The added DLS is a quick and easy way to guarantee that users will maintain the access they already had in the original sources. Check out the second part of this tutorial , where we'll review best practices and advanced configurations to escalate the connector. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
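Circling back to the DLS walkthrough above, the restricted API key can also be created from Python. The key and role names here are made up, and the DLS query is only a placeholder — as explained above, the real value must be copied from the user's entry in the ACL filter index.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.security.create_api_key(
    name="gustavo-search-key",       # hypothetical key name
    role_descriptors={
        "jira-dls-read": {           # hypothetical role name
            "index": [
                {
                    "names": ["bank"],
                    "privileges": ["read"],
                    # Placeholder: paste the query.template value copied from the
                    # user's document in the ACL filter index here.
                    "query": "<query template from the ACL filter index>",
                }
            ]
        }
    },
)
print(resp["encoded"])  # use as the Authorization: ApiKey <encoded> header
```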
JR By: Jeffrey Rengifo Jump to Steps Configuring Jira connector Indexing documents into Elasticsearch Querying data Document Level Security (DLS) Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Jira connector tutorial part I - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-jira-connector-tutorial", + "meta_description": "Learn how to integrate Elasticsearch with Jira using the Elastic Jira connector and implement DLS through a practical use case." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Finding your best music friend with vectors: Spotify Wrapped, part 5 Understanding vectors has never been easier. Handcrafting vectors and figuring out various techniques to find your music friend in a heavily biased dataset. How To PK VB By: Philipp Kahr and Vincent Bosc On April 10, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part we talked about how to retrieve your Spotify data and visualize it. In the second part we talked about how to process the data and how to visualize it. In the third part we explored anomaly detection and how it helps us find interesting listening behavior. The fourth part uncovered relationships between the artists by using Kibana Graph. In this part, we talk about how to use vectors to find your music friend. Discover your musical friends with vectors A vector is a mathematical entity that has both magnitude (size) and direction. In this context, vectors are used to represent data, such as the number of songs listened to by a user for each artist. The magnitude corresponds to the count of songs played for an artist, while the direction is determined by the relative proportions of the counts for all artists within the vector. Although the direction is not explicitly set or visualized, it is implicitly defined by the values in the vector and their relationships to one another. The idea is to simply create a huge array where we do a key => value sorting approach. The key is the artist and the value is the count of listened to songs. This is a very simple approach and can be done with a few lines of code. We create this vector: Which is super interesting because it is now sorted by the artist name. This gives us zero values for all artists we didn't listen to, or which the user didn't even know existed. 
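A tiny sketch of the vector construction described above: a shared, alphabetically sorted artist vocabulary, with a zero for every artist a user never played. The counts are taken loosely from the tables in this post and are only illustrative.

```python
# Listen counts per user over a shared artist vocabulary (illustrative numbers).
listens = {
    "philipp": {"Ariana Grande": 3508, "Taylor Swift": 53},
    "stefanie": {"Casper": 14924, "K.I.Z": 6653},
}

# Alphabetically sorted vocabulary across all users.
vocabulary = sorted({artist for counts in listens.values() for artist in counts})

# One vector per user, with 0 for every artist that user never played.
vectors = {
    user: [counts.get(artist, 0) for artist in vocabulary]
    for user, counts in listens.items()
}

print(vocabulary)          # ['Ariana Grande', 'Casper', 'K.I.Z', 'Taylor Swift']
print(vectors["philipp"])  # [3508, 0, 0, 53]
```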
Finding your musical match then becomes a straightforward task of calculating the distance between two vectors and identifying the closest match. Several methods can be used for this, such as the dot product , euclidean distance , and cosine similarity . Each method behaves differently and may yield varying results. It is important to experiment and determine which approach best suits your needs. How does cosine similarity, Euclidian distance and dot product work? We will not delve into the mathematical details of each method, but we will provide a brief overview of how they work. To simplify, let’s break this down into just two dimensions: Ariana Grande and Taylor Swift. User A listens to 100 songs by Taylor Swift, user B listens to 300 songs by Ariana Grande, and user C falls in the middle, listening to 100 songs by Taylor Swift and 100 songs by Ariana Grande. The Cosine Similarity is better the smaller the angle and focuses on the direction of the vectors, ignoring their magnitude. In our case, user C will match with user A and user B equally because the angle between their vectors is the same (both are 45^\\circ). The Euclidian distance measures the direct distance between two points, with shorter distances indicating higher similarity. This method is sensitive to both direction and magnitude. In our case, user C is closer to user A than to user B because the difference in their positions results in a shorter distance. The dot product calculates similarity by summing the products of the corresponding entries of two vectors. This method is sensitive to both magnitude and alignment. For example, user A and user B result in a dot product of 0 because they have no overlap in preferences. User C matches more strongly with user B (300 × 100 = 30,000) than with user A (100 × 100 = 10,000) due to the larger magnitude of user B’s vector. This highlights the dot product’s sensitivity to scale, which can skew results when magnitudes differ significantly. In our specific use case, the magnitude of the vectors should not significantly impact the similarity results. This highlights the importance of applying normalization (more on that later) before using methods like Euclidean distance or dot product to ensure that comparisons are not skewed by differences in scale. Data distribution The distribution of our dataset is a crucial factor, as it will play a significant role later when we work on finding your best musical match. User Count of records Unique Artists Unique Titles Responsible for % of dataset philipp 202907 14183 24570 35% elisheva 140906 9872 23770 24% stefanie 70373 2647 5471 12% emil 53568 5663 14227 9% karolina 41232 7988 12427 7% iulia 39598 5114 8976 6% chris 23598 6124 8654 4% Summary: 7 572182 35473 77942 100% More details about the diversity of the dataset are discussed in the subheading Dataset Issues within the dense_vector section. The primary issue lies in the distribution of listened-to artists for each user. Each color represents a different user, and we can observe various listening styles: some users listen to a wide range of artists evenly distributed, while others focus on just a handful, a single artist, or a small group. These variations highlight the importance of considering user listening patterns when creating your vector. Using dense_vector type First of all, we created the vector above already, now we can store that in a field of dense_vector . We will auto create the dimensions we need in the Python code, based on our vector length. 
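A sketch of what creating that dense_vector mapping from the vector length might look like (index and field names are placeholders). With the full artist vocabulary of this dataset, this is exactly where the dimension limit bites, as shown next.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

dims = 33806  # length of the artist vector: unique artists across all users

# Fails with mapper_parsing_exception once dims exceeds the 4096 limit.
es.indices.create(
    index="spotify-dense",
    mappings={
        "properties": {
            "user": {"type": "keyword"},
            "artists": {"type": "dense_vector", "dims": dims},
        }
    },
)
```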
Whoops that errored in our case with this message: Error: BadRequestError(400, 'mapper_parsing_exception', 'The number of dimensions should be in the range [1, 4096] but was [33806]') : Ok, so that means our vector artists is too large. It is 33806 items long. Now, that is interesting, and we need to find a way to reduce that. This number 33806 represents the cardinality of artists. Cardinality is another term for uniqueness. It is the number of unique values in a dataset. In our case, it is the number of unique artists across all users. One of the easiest ways is to rework the vector. Let's focus on the top 1000 commonly used artists. This will reduce the vector size to exactly 1000. We can always increase it to 4096 and see if there is something else going on then. This method of aggregation gives us the top 1000 artists per user. However, this can lead to issues. For instance, if there are 7 users and none of the top 1000 artists overlap, we end up with a vector dimension of 7000. When testing this approach, we encountered the following error: Error: BadRequestError(400, 'mapper_parsing_exception', 'The number of dimensions should be in the range [1, 4096] but was [4456]') . This indicates that our vector dimensions were too large. To resolve this, there are several options. One straightforward approach is to reduce the top 1000 artists to 950, 900, 800, and so on until we fit within the 4096 dimension limit. Reducing the top n artists per user to fit within the 4,096-dimension limit may temporarily resolve the issue, but the risk of exceeding the limit will resurface each time new users are added, as their unique top artists will increase the overall vector dimensions. This makes it an unsustainable long-term solution for scaling the system. We already sense that we will need to find a different solution. Dataset issues We adjusted the aggregation method by switching from calculating the top 1000 artists per user to calculating the overall top 1000 artists and then splitting the results by user. This ensures the vector is exactly 1000 artists long. However, this adjustment does not address a significant issue in our dataset: it is heavily biased toward certain artists, and a single user can disproportionately influence the results. As shown earlier, Philipp contributes roughly 35% of all data, heavily skewing the results. This could result in a situation where smaller contributors, like Chris, have their top artists excluded from the top 1000 terms or even the 4096 terms in a larger vector. Additionally, outliers like Stefanie, who might listen repeatedly to a single artist, can further distort the results. To illustrate this, we converted the JSON response into a table for better readability. Artist Total Count User Casper 15100 14924 stefanie 170 philipp 4 emil 2 chris Taylor Swift 12961 9557 elisheva 2240 stefanie 664 iulia 409 philipp 53 karolina 23 chris 15 emil Ariana Grande 7247 3508 philipp 1873 elisheva 1525 iulia 210 stefanie 107 karolina 24 chris K.I.Z 6683 6653 stefanie 23 philipp 7 emil It is immediately apparent that there is an issue with the dataset. For example, Casper and K.I.Z, both German artists, appear in the top 5, but Casper is overwhelmingly influenced by Stefanie, who accounts for approximately 99% of all tracks listened to for this artist. This heavy bias places Casper at the top spot, even though it might not be representative across the dataset. 
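Here is a sketch of the reworked aggregation described above: the overall top 1000 artists, with each artist bucket split by user. The index and field names are assumptions about how the raw listening events are stored.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="spotify-history",  # placeholder index of raw listening events
    size=0,
    aggs={
        "top_artists": {
            "terms": {"field": "artist", "size": 1000},  # overall top 1000 artists
            "aggs": {"by_user": {"terms": {"field": "user", "size": 10}}},  # split per user
        }
    },
)
for bucket in resp["aggregations"]["top_artists"]["buckets"]:
    per_user = {b["key"]: b["doc_count"] for b in bucket["by_user"]["buckets"]}
    print(bucket["key"], bucket["doc_count"], per_user)
```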
To address this issue while still using the 4096 artists in a dense vector, we can apply some data manipulation techniques. For instance, we could consider using the diversified_sampler or methods like softmax to calculate the relative importance of each artist. However, if we aim to avoid heavy data manipulation, we can take a different approach by using a sparse_vector instead. Using a sparse_vector type We tried squeezing our vector where each position represented an artist into a dense_vector field, however it's not the best fit as you can tell. We are limited to 4096 artists and we end up with a large array that has a lot of null values. Philipp might never listen to Pink Floyd yet in the dense vector approach, Pink Floyd will take up one position with a 0. Essentially, we were using a dense vector format for data that is inherently sparse. Fortunately, Elasticsearch supports sparse vector representation through the sparse_vector type. Let’s explore how it works! Instead of creating one large array, we will create a key => value pair and store the artists name next to the listened count. This is a much more efficient way of storing the data and will allow us to store a higher cardinality. There is no real limit to how many key value pairs you can have inside the sparse_vector. At some point the performance will degrade, but that is a discussion for another day. Any null pairs will simply be skipped. What does a search look like? We take the entire content of artists and put that inside the query_vector and we use the sparse_vector query type and only retrieve the user and the score. Normalization Using a sparse_vector allows us to store data more efficiently and handle higher cardinality without hitting the dimension limit. The tradeoff, however, is that we are limited to using the dot product for similarity calculations, which means we cannot directly use methods such as cosine similarity or Euclidean distance. As we saw earlier, the dot product is heavily influenced by the amplitude of vectors. To minimize or avoid this effect, we will first need to normalize our data. We provide the full sparse vector to identify our “music best friend.” This straightforward approach has yielded some interesting results, as shown here. However, we are still encountering a similar issue as before: the influence of vector magnitudes. While the impact is less severe compared to the dense_vector approach, the distribution of the dataset still creates imbalances. For example, Philipp might match disproportionately with many users simply due to the vast number of artists he listens to. This raises an important question: does it matter if you listen to an artist 100, 500, 10,000, or 25,000 times? The answer is no—it’s the relative distribution of preferences that matters. To address this, we can normalize the data using a normalizing function like Softmax, which transforms raw values into probabilities. It exponentiates each value and divides it by the sum of the exponentials of all values, ensuring that all outputs are scaled between 0 and 1 and sum to 1. You can normalize directly in Elasticsearch using the normalize aggregation or programmatically in Python using Numpy . With this normalization step, each user is represented by a single document containing a list of artists and their normalized values. The resulting document in Elasticsearch looks like this: Finding your music match is rather easy. We take the entire document for the user Philipp since we want to match him against everyone else. 
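The original post shows the exact mapping, document shape, and query; as a hedged sketch of the same idea, here is a sparse_vector field holding artist => weight pairs, a softmax-normalized profile, and a sparse_vector query that takes the whole profile as the query_vector. Names and numbers are illustrative, and zero weights are dropped since sparse_vector values must be positive.

```python
import numpy as np
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# artists as a sparse_vector: only the artists a user actually played are stored.
es.indices.create(
    index="spotify-sparse",
    mappings={"properties": {
        "user": {"type": "keyword"},
        "artists": {"type": "sparse_vector"},
    }},
)

# Untuned softmax over raw counts: note how nearly all weight lands on the top artist.
counts = {"Ariana Grande": 3508.0, "Dua Lipa": 900.0, "Taylor Swift": 53.0}
values = np.array(list(counts.values()))
weights = np.exp(values - values.max()) / np.exp(values - values.max()).sum()
profile = {artist: float(w) for artist, w in zip(counts, weights) if w > 0}

# One document per user.
es.index(index="spotify-sparse", id="philipp",
         document={"user": "philipp", "artists": profile})

# Match the whole profile against everyone else (dot-product scoring).
resp = es.search(
    index="spotify-sparse",
    query={"sparse_vector": {"field": "artists", "query_vector": profile}},
    source=["user"],
)
```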
The search looks like this: The response is in JSON and contains the score and the user; we altered it to a table for better readability, then the score is multiplied by 1.000 to remove leading zeros. User Score philipp 0.36043773 karolina 0.050112092 stefanie 0.04934514 iulia 0.048445952 chris 0.039548675 elisheva 0.037409707 emil 0.036741032 On an untuned and out of the box softmax we see that Philipp's best friend is Karolina with a score of 0.050... followed relatively closely by Stefanie with 0.049... . Emil is furthest away from Philipp's taste. After comparing the data for Karolina and Philipp ( using the dashboard from the second blog ), this seems a bit odd. Let's explore how the score is calculated. The issue is that in untuned softmax, the top artist can get a value near 1 and the second artist is already on 0.001..., which emphasises your top artist even more. This is important because the dot product calculation used to identify your closest match works like this: When we calculate the dot product we do 1 * 0.33 = 0.33 , which boosts my compatibility with Karolina a lot. When Philipp is not matching on the top artist of anyone else with a higher value than 0.33, Karolina is my best friend, even though we might have barely anything else in common. To illustrate this here is a table of our top 5 artists, side by side. The number represents the spot in the top artists. Artist Karolina Philipp Fred Again .. 1 Ariana Grande 2 Harry Styles 3 Too Many Zooz 4 Kraftklub 5 Dua Lipa 1 15 David Guetta 2 126 Calvin Harris 3 32 Jax Jones 4 378 Ed Sheeran 5 119 We can observe that Philipp overlaps with Karolina's top 5 artists. Even though they range from place 15, 32, 119, 126, 378 for Philipp, any value that Karolina has is multiplied by Philipp's ranking. In this case, the order of Karolina's top artists weighs more than Philipp's. There are a few ways to fix softmax by adjusting temperature and smoothness . Just trialing out some numbers for temperature and smoothness, we end up with this result (score multiplied by 1.000 to remove leading zeros). A higher temperature describes how sharply softmax assigns the probabilities, this distributes the data more evenly, whilst a lower temperature emphasises a few dominant values, with a sharp decline. User Score philipp 3.59 stefanie 0.50 iulia 0.484 karolina 0.481 chris 0.395 elisheva 0.374 emil 0.367 Adding the temperature and smoothness altered the result. Instead of Karolina being Philipp's best match, it moved to Stefanie. It's interesting to see how adjusting the method of calculating the importance of an artist heavily impacts the search. There are many other options available for building the values for the artists. We can look at the total percentage of an artist represented in a dataset per user. This could lead to better distribution of values than softmax and ensure that the dot product, like described above with Karolina and Philipp for Dua Lipa, wouldn't be that significant anymore. One other option would be to take the total listening time into consideration and not just the count of songs, or their percentage. This would help with artists that publish longer songs that are above ~5-6 minutes. One Fred Again.. a song might be around 2:30 and that would allow Philipp to listen to twice as many songs as someone else. The listened_to_ms is in milliseconds and we end up with a similar discussion around, if a sum() is the correct approach, similar to count of songs played. 
It is an absolute number, where the higher it gets, the less importance the higher number should get accounted for. We could account for listening completion, there is a listened_to_pct and we could pre-filter the data to only songs that our users finish to at least 80%. Why bother with songs that are skipped in the first few seconds or minutes? The listening percentage punishes people that listen to a lot of songs from random artists using the daily recommended playlists, whilst it emphasises those who like to listen to the same artists over and over again. There are many many opportunities to tweak and alter the dataset to get the best results. All of them take time and have different drawbacks. Conclusion In this blog we took you with us on our journey to identify your music friend. We started off with a limited know-how of Elasticsearch and thought that dense vectors are the answers, and that lead to looking into our dataset and diverting to sparse vectors. Along the way we looked into a few optimisations on the search quality and how to reduce any sort of bias. And then we figured out a way that works best for us and that is the sparse vector with the percentages. Sparse vectors are what powers ELSER as well; instead of artists, it is words. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Discover your musical friends with vectors How does cosine similarity, Euclidian distance and dot product work? Data distribution Using dense_vector type Dataset issues Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Finding your best music friend with vectors: Spotify Wrapped, part 5 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vectors-spotify-wrapped-part-05", + "meta_description": "Understanding vectors has never been easier. Learn how to use use vectors to through a practical example." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog When hybrid search truly shines Demonstrating when hybrid search is better than lexical or semantic search on their own. Generative AI Vector Database How To GL By: Gustavo Llermaly On January 1, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, we are going to explore hybrid search by examples, and show when it truly shines against using lexical or semantic search techniques alone. What is hybrid search? Hybrid search is a technique that combines different search approaches such as traditional lexical term matching, and semantic search . Lexical search is good when users know the exact words. This approach will find the relevant documents, and sort them in a way that makes sense by using TF-IDF which means: The more common across the dataset the term you are searching is, the less it contributes to the score; and the more common it is within a certain document the more it contributes to the score. But, what if the words in the query are not present in the documents? Sometimes the user is not looking for something in concrete, but for a concept . They may not be looking for a specific restaurant, but for \"a nice place to eat with family\". For this kind of queries, semantic search is useful because it takes into consideration the context of the search query and brings similar documents. You can expect to get more related documents back than with the previous approach, but in return, this approach struggles with precision, especially with numbers. Hybrid search gives us the best of both worlds by blending the precision of term-matching together with the context-aware matching of semantic search. You can read a deep dive on hybrid search on this article , and more about lexical and semantic search differences in this one . Let's create an example using real estate units. The query will be: quiet home in Pinewood with 2 rooms , with quiet place being the semantic component of the query while Pinewood with 2 rooms will be the textual or lexical portion. Configuring ELSER We are going to use ELSER as our model provider. Start by creating the inference endpoint: If this is your first time using ELSER, you may encounter a 502 Bad Gateway error as the model loads in the background.You can check the status of the model in Machine Learning > Trained Models in Kibana.Once is deployed, you can proceed to the next step. Configuring index For the index, we are going to use text fields, and semantic_text for the semantic field. We are going to copy the descriptions, because we want to use them for both match and semantic queries. Indexing data Querying data Let's start by the classic match query, that will search by the content of the title and description: This is the first result: It is not bad. 
It managed to capture the neighborhood Pinewood , and also the 2 bedroom requirement, however, this is not a quiet place at all. Now, a pure semantic query: This is the first result: Now the results considered the quiet home piece by relating it to things like \"secluded and private\", but this one is a 3 bedroom and we are looking for 2. Let's run a hybrid search now. We will use RRF (Reciprocal rank fusion) to achieve this purpose and combine the two previous queries. The RRF algorithm will blend the scores of both queries for us. This is the first result: Now the results considered both being a quiet place, but also having 2 bedrooms. Evaluating results For the evaluation, we are going to use the Ranking Evaluation API which allows us to automate the process of running queries and then checking the position of the relevant results. You can choose between different evaluation metrics. For this example I will pick Mean reciprocal ranking (MRR) which takes into consideration the result position and reduces the score as the position gets lower by 1/position#. For this scenario, we are going to test our 3 queries ( multi_match , semantic , hybrid ) against the initial question: quiet home 2 bedroom in Pinewood Expecting the following apartment to be in the first position as it meets all the criteria. Retired apartment in a serene neighborhood, perfect for those seeking a retreat. This well-maintained residence offers two bedrooms with abundant natural light and silence.\" We can configure as many queries as we need, and put on ratings the id of the documents we expect to be in the first positions: As you can see on the image, the query got a score of 1 for hybrid search (1st position), and 0.5 on the other ones, meaning the expected result was returned on the second position. Conclusion Full-text search techniques–which find terms and sort the results by term frequency–and semantic search–which will search by semantic proximity–are powerful in different scenarios. On the one hand, text search shines when users are specific with what they want to search, for example providing the exact SKU for an article or words present on a technical manual. On the other hand, semantic search is useful when users are looking for concepts or ideas not explicitly defined in the documents. Combining both approaches with hybrid search, gives you both full-text search capabilities as well as adding semantically related documents, which can be useful in specific scenarios that require keyword matching and contextual understanding. This dual approach enhances search accuracy and relevance, making it ideal for complex queries and diverse content types. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to What is hybrid search? Configuring ELSER Configuring index Indexing data Querying data Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "When hybrid search truly shines - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-hybrid-search", + "meta_description": "Exploring hybrid search and demonstrating when hybrid search is better than lexical or semantic search on their own." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Perform text queries with the Elasticsearch Go client Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example. Go How To CR LS By: Carly Richmond and Laurent Saint-Félix On October 31, 2023 Part of Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Building software in any programming language, including Go, is committing to a lifetime of learning. Throughout her university and working career, Carly has needed to adapt to being a polyglot and dabble in many programming languages, including Python, C, JavaScript, TypeScript, and Java. But that wasn't enough! So recently she started playing with Go too! Just like animals, programming languages, and one of your friendly authors, search has undergone an evolution of different practices that can be difficult to decide between for your own search use case. In this blog, we'll share an overview of traditional keyword search along with an example using Elasticsearch and the Elasticsearch Go client . 
Prerequisites To follow with this example, ensure the following prerequisites are met: Installation of Go version 1.21 or later Create your own Go repo using the recommended structure and package management covered in the Go documentation Creation of your own Elasticsearch cluster, populated with a set of rodent-based pages including for our friendly Gopher , from Wikipedia: Connecting to Elasticsearch In our examples, we will make use of the Typed API offered by the Go client. Establish a secure connection for any query requires configuring the client using either: Cloud ID and API key if making use of Elastic Cloud. Cluster URL, username, password and the certificate. Connecting to our cluster located on Elastic Cloud would look like this: The client connection can then be used for searching, as shown later. Keyword search Keyword search is the foundational search type that we have been familiar with since the inception of Archie , the first documented internet search engine written in 1990. A central component of keyword search is the translation of documents into an inverted index. Exactly like the index found at the back of a textbook, an inverted index contains a mapping between a list of tokens and their location in each document. The below diagram shows the key stages of generating the index: As shown above, the generation of tokens in Elasticsearch comprises three key stages: Stripping of unnecessary characters via zero or more char_filters . In our example we are stripping out HTML elements within the body_content field via the html_strip filter. Splitting the tokens from the content with the standard tokenizer , which will split by spacing and key punctuation. Removing unwanted tokens or transforming tokens from the output stream of the tokenizer using zero or more filter options, such as the lowercase token filter or stemmers such as the snowball stemmer to transform tokens back to their language root. Searching in Elasticsearch with Go When querying with the Go client, we specify the index we want to search and pass in the query and other options, just like in the below example: In the above example, we perform a standard match query to find any document in our index that contains the specified string passed into our function. Note we pass a new empty context to the search execution via Do(context.Background()) . Furthermore, any errors returned by Elasticsearch are output to the err attribute for logging and error handling. Results are returned in res.Hits.Hits with the _Source attribute containing the document itself in a JSON format. To convert this source to a Go-friendly struct, we need to unmarshal the JSON response using the Go encoding/json package , as shown in the below example: Searching and unmarshalling the query gopher will return the Wikipedia page for Gopher as expected: However, if we ask What do Gophers eat? we don't quite get the results we want: A simple keyword search allows results returned to your Go application in a performant way that works in a way we are familiar with from the applications we use. It also works great for exact term matches that are relevant for scenarios such as looking for a particular company or term. However, as we see above, it struggles to identify context and semantics due to the vocabulary mismatch problem. Furthermore, support for non-text file formats such as images and audio is challenging. Conclusion Here we've discussed how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client . 
Given Go is widely used for infrastructure scripting and building web servers, it's useful to know how to search in Go. Check out the GitHub repo for all the code in this series. Follow on to part 2 to gain an overview of vector search and how to perform vector search in Go. Until then, happy gopher hunting! Resources Elasticsearch Guide Elasticsearch Go client Understanding Analysis in Elasticsearch (Analyzers) by Bo Andersen | #CodingExplained Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Prerequisites Connecting to Elasticsearch Keyword search Searching in Elasticsearch with Go Conclusion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Perform text queries with the Elasticsearch Go client - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/perform-text-queries-with-the-elasticsearch-go-client", + "meta_description": "Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog ChatGPT and Elasticsearch revisited: Building a chatbot using RAG Learn how to create a chatbot using ChatGPT and Elasticsearch, utilizing all of the newest RAG features. Generative AI Python JV By: Jeff Vestal On August 19, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. 
Follow up to the blog ChatGPT and Elasticsearch: OpenAI meets private data . In this blog, you will learn how to: Create an Elasticsearch Serverless project Create an Inference Endpoint to generate embeddings with ELSER Use a Semantic Text field for auto-chunking and calling the Inference Endpoint Use the Open Crawler to crawl blogs Connect to an LLM using Elastic’s Playground to test prompts and context settings for a RAG chat application. If you want to jump right into the code, you can view the accompanying Jupyter Notebook here . ChatGPT and Elasticsearch (April 2023) A lot has changed since I wrote the initial ChatGPT and Elasticsearch: OpenAI meets private data . Most people were just playing around with ChatGPT, if they had tried it at all. And every booth at every tech conference didn’t feature the letters “AI” (whether it is a useful fit or not). Updates in Elasticsearch (August 2024) Since then, Elastic has embraced being a full featured vector database and is putting a lot of engineering effort into making it the best vector database option for anyone building a search application. So as not to spend several pages talking about all the enhancements to Elasticsearch, here is a non-exhaustive list in no particular order: ELSER - The Elastic Learned Sparse Encoder Elastic Serverless Service was built and is in public beta Elasticsearch open Inference API Embeddings Chat completion Semantic rerankers Semantic_text type - Simplify semantic search Automatic chunking Playground - Visually experiment with RAG application building in Elasticsearch Retrievers Open web crawler With all that change and more, the original blog needs a rewrite. So let’s get started. Updated flow: ChatGPT, Elasticsearch & RAG The plan for this updated flow will be: Setup Create a new Elasticsearch serverless search project Create an embedding inference API using ELSER Configure an index template with a semantic_text field Create a new LLM connector Configure a chat completion inference service using our LLM connector Ingest and Test Crawl the Elastic Labs sites (Search, Observability, Security) with the Elastic Open Web Crawler. Use Playground to test prompts using our indexed Labs content Configure and deploy our App Export the generated code from Playground to an application using FastAPI as the backend and React as the front end. Run it locally Optionally deploy our chatbot to Google Cloud Run Setup Elasticsearch Serverless Project We will be using an Elastic serverless project for our chatbot. Serverless removes much of the complexity of running an Elasticsearch cluster and lets you focus on actually using and gaining value from your data. Read more about the architecture of Serverless here . If you don’t have an Elastic Cloud account, you can create a free two-week trial at elastic.co (Serverless pricing available here ). If you already have one, you can simply log in. Once logged in, you will need to create a cloud API key . NOTE: In the steps below, I will show the relevant parts of Python code. For the sake of brevity, I’m not going to show complete code that will import required libraries, wait for steps to complete, catch errors, etc. For more robust code you can run, please see the accompanying Jypyter notebook ! Create Serverless Project We will use our newly created API key to perform the next setup steps. First off, create a new Elasticsearch project. 
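What that call can look like, sketched here with the requests library; the endpoint path, region and body layout are assumptions pieced together from the parameter descriptions that follow, and the accompanying Jupyter notebook has the authoritative version:

import requests

# Hypothetical sketch: create an Elasticsearch Serverless project via the Cloud API.
url = 'https://api.elastic-cloud.com/api/v1/serverless/projects/elasticsearch'  # assumed path

project_data = {
    'name': 'chatbot-labs',        # name we want for the project (placeholder)
    'region_id': 'aws-us-east-1',  # region to deploy (placeholder)
    'optimized_for': 'vector',     # configuration type used in this walkthrough
}

response = requests.post(
    url,
    headers={'Authorization': 'ApiKey <cloud-api-key>'},
    json=project_data,
)
response.raise_for_status()
print(response.json())  # returns the endpoint and credentials for the new project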
url - This is the standard Serverless endpoint for Elastic Cloud project_data - Your Elasticsearch Serverless project settings name - Name we want for the project region_id - Region to deploy optimized_for - Configuration type - We are using vector which isn’t strictly required for the ELSER model but can be suitable if you select a dense vector model such as e5. Create Elasticsearch Python client One nice thing about creating a programmatic project is that you will get back the connection information and credentials you need to interact with it! ELSER Embedding API Once the project is created, which usually takes less than a few minutes, we can prepare it to handle our labs’ data. The first step is to configure the inference API for embedding . We will be using the Elastic Learned Sparse Encoder (ELSER). Command to create the inference endpoint Specify this endpoint will be for generating sparse embeddings model_config - Settings we want to use for deploying our semantic reranking model service - Use the pre-defined elser inference service service_settings.num_allocations - Deploy the model with 8 allocations service_settings.num_threads - Deploy with one thread per allocation inference_id - The name you want to give to you inference endpoint task_type - Specifies this endpoint will be for generating sparse embeddings This single command will trigger Elasticsearch to perform a couple of tasks: It will download the ELSER model. It will deploy (start) the ELSER model with eight allocations and one thread per allocation. It will create an inference API we use in our field mapping in the next step. Index Mapping With our ELSER API created, we will create our index template. index_patterns - The pattern of indices we want this template to apply to. body - The main content of a web page the crawler collects will be written to type - It is a text field copy_to - We need to copy that text to our semantic text field for semantic processing semantic_body is our semantic text field This field will automatically handle chunking of long text and generating embeddings which we will later use for semantic search inference_id specifies the name of the inference endpoint we created above, allowing us to generate embeddings from our ELSER model headings - Heading tags from the html id - crawl id for this document meta_description - value of the description meta tag from the html title is the title of the web page the content is from Other fields will be indexed but auto-mapped. The ones we are focused on pre-defining in the template will not need to be both keyword and text type, which is defined automatically otherwise. Most importantly, for this guide, we must define our semantic_text field and set a source field to copy from with copy_to . In this case, we are interested in performing semantic search on the body of the text, which the crawler indexes into the body . Crawl All the Labs! We can now install and configure the crawler to crawl the Elastic * Labs. We will loosely follow the excellent guide from the Open Crawler released for tech-preview Search Labs blog. The steps below will use docker and run on a MacBook Pro. To run this with a different setup, consult the Open Crawler Github readme . Clone the repo Open the command line tool of your choice. I’ll be using Iterm2. Clone the crawler repo to your machine. Build the crawler container Run the following command to build and run the crawler. 
Configure the crawler Create a new YAML in your favorite editor (vim): We want to crawl all the documents on the three labs’ sites, but since blogs and tutorials on those sites tend to link out to other parts of elastic.co, we need to set a couple of runs to restrict the scope. We will allow crawling the three paths for our site and then deny anything else. Paste the following in the file and save Copy the configuration into the Docker container: Validate the domain Ensure the config file has no issues by running: Start the crawler When you first run the crawler, processing all the articles on the three lab sites may take several minutes. Confirm articles have been indexed We will confirm two ways. First, we will look at a sample document to ensure that ELSER embeddings have been generated. We just want to look at any doc so we can search without any arguments: Ensure you get results and then check that the field body contains text and semantic_body.inference.chunks.0.embeddings contains tokens. We can check we are gathering data from each of the three sites with a terms aggregation: You should see results that start with one of our three site paths. To the Playground! With our data ingested, chunked, and inference, we can start working on the backend application code that will interact with the LLM for our RAG app. LLM Connection We need to configure a connection for Playground to make API calls to an LLM. As of this writing, Playground supports chat completion connections to OpenAI, AWS Bedrock, and Google Gemini. More connections are planned, so check the docs for the latest list. When you first enter the Playground UI, click on “Connect to an LLM” Since I used OpenAI for the original blog, we’ll stick with that. The great thing about the Playground is that you can switch connections to a different service, and the Playground code will generate code specifically to that service’s API specification. You only need to select which one you want to use today. In this step, you must fill out the fields depending on which LLM you wish to use. As mentioned above, since Playground will abstract away the API differences, you can use whichever supported LLM service works for you, and the rest of the steps in this guide will work the same. If you don’t have an Azure OpenAI account or OpenAI API account, you can get one here (OpenAI now requires a $5 minimum to fund the API account). Once you have completed that, hit “Save,” and you will get confirmation that the connector has been added. After that, you just need to select the indices we will use in our app. You can select multiple, but since all our crawler data is going into elastic-labs, you can choose that one. Click “Add data sources” and you can start using Playground! Select the “restaurant_reviews” index created earlier. Playing in the Playground After adding your data source you will be in the Playground UI. To keep getting started as simple as possible, we will stick with all the default settings other than the prompt. However, for more details on Playground components and how to use them, check out the Playground: Experiment with RAG applications with Elasticsearch in minutes blog and the Playground documentation . Experimenting with different settings to fit your particular data and application needs is an important part of setting up a RAG-backed application. 
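Before looking at the defaults, here is a rough sketch of the retrieval that sits underneath a Playground-style RAG flow: a semantic query against the semantic_body field of the elastic-labs index from this walkthrough. The question string is hypothetical, and the code Playground exports later is more complete (prompt construction, LLM call, error handling):

from elasticsearch import Elasticsearch

es = Elasticsearch('<project-endpoint>', api_key='<api-key>')

question = 'How do I create an ELSER inference endpoint?'  # hypothetical user question

# Semantic retrieval over the semantic_text field populated by the crawler.
response = es.search(
    index='elastic-labs',
    query={'semantic': {'field': 'semantic_body', 'query': question}},
    size=3,  # a handful of top matches to use as context for the LLM
)
for hit in response['hits']['hits']:
    print(hit['_source']['title'])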
The defaults we will be using are: Querying the semantic_body chunks Using the three nearest semantic chunks as context to pass to the LLM Creating a more detailed prompt The default prompt in Playground is simply a placeholder. Prompt engineering continues to develop as LLMs become more capable. Exploring the ever-changing world of prompt engineering is a blog, but there are a few basic concepts to remember when creating a system prompt: Be detailed when describing the app or service the LLM response is part of. This includes what data will be provided and who will consume the responses. Provide example questions and responses. This technique, called few-shot-prompting , helps the LLM structure its responses. Clearly state how the LLM should behave. Specify the Desired Output Format. Test and Iterate on Prompts. With this in mind, we can create a more detailed system prompt: Feel free to to test out different prompts and context settings to see what results you feel are best for your particular data. For more examples on advanced techiques, check out the Prompt section on the two part blog Advanced RAG Techniques . Again, see the Playground blog post for more details on the various settings you can tweak. Export the Code Behind the scenes, Playground generates all the backend chat code we need to perform semantic search, parse the relevant contextual fields, and make a chat completion call to the LLM. No coding work from us required! In the upper right corner click on the “View Code” button to expand the code flyout You will see the generated python code with all the settings your configured as well as the the functions to make a semantic call to Elasticsearch, parse the results, built the complete prompt, make the call to the LLM, and parse those results. Click the copy icon to copy the code. You can now incorporate the code into your own chat application! Wrapup A lot has changed since the first iteration of this blog over a year ago, and we covered a lot in this blog. You started from a cloud API key, created an Elasticsearch Serverless project, generated a cloud API key, configured the Open Web Crawler, crawled three Elastic Lab sites, chunked the long text, generated embeddings, tested out the optimal chat settings for a RAG application, and exported the code! Where’s the UI, Vestal? Be on the lookout for part two where we will integrate the playground code into a python backend with a React frontend. We will also look at deploying the full chat application. For a complete set of code for everything above, see the accompanying Jypyter notebook Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. 
JR By: Jeffrey Rengifo Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Jump to ChatGPT and Elasticsearch (April 2023) Updates in Elasticsearch (August 2024) Updated flow: ChatGPT, Elasticsearch & RAG Setup Elasticsearch Serverless Project Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "ChatGPT and Elasticsearch revisited: Building a chatbot using RAG - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/chatgpt-elasticsearch-rag-enhancements", + "meta_description": "Learn how to build a chatbot using ChatGPT, Elasticsearch, and the newest RAG features." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Anomaly detection population jobs: Spotify Wrapped, part 3 Anomaly detection can be a daunting task at first, but in this blog, we'll dive into it and figure out how the different jobs can help us find unusual patterns in our Spotify Wrapped data. How To PK By: Philipp Kahr On March 24, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part , we talked about how to get your Spotify Wrapped data and how to visualize it. In the second part , we talked about how to process the data and how to visualize it. In this third part, we will talk about how to detect anomalies in your Spotify Wrapped data. What is an anomaly detection job? An anomaly detection job in Elastic tries to find unusual patterns in your data. It uses machine learning to learn the normal behavior of your data and then tries to find data points that do not fit this normal behavior. This can be useful for finding unusual patterns in your data, like a sudden increase in the number of songs you listened to or a sudden change in the average duration of the songs you listened to. There are many different types of anomaly detection jobs: Single metric Multi-metric Population Categorization Rare Geo Advanced Now, we will take a look at a few of those. Population job A population job is a type of anomaly detection job that tries to find unusual patterns in the distribution of your data. The distribution is defined by the population. Let's create that job together! 1. Go to the Machine Learning app in Kibana. 2. Click on \"Create job.\" 3. 
Select the Spotify-History data view (if you do not have that one yet, don't forget to import the dashboard and data view from here ). 4. Select Population. 5. Select a proper timeframe, for me that is 1st January 2017 and 31st December 2024. 6. Select the field you want to define your population on. Let's use artist . 7. For the Add Metric , we want to use sum(listened_to_ms) . This will sum up the total time you listened to a specific artist. 8. In the right bottom for influencers, add the title . This will give us more information when an anomaly occurs. 9. Bucket Span so that is a big point to discuss. In my case, I do not think there is any sense in looking at something different than a daily pattern. You might even want to check out weekly patterns. It really depends on a lot of factors. Going lower than 1 day could be a bit too fine, and therefore, you'll get not optimal results. 10. Click on next, give it a valid name and click next until it says create job . Ensure that the toggle for Start immediately is turned on. Click create job . It will start immediately and probably look like this: Perfect, let's dive into the details. Click View Results —this page will be very interesting for our analysis Interpreting that is simple. Everything red is a higher scoring anomaly and everything blue or lighter shaded is less high. We immediately spot that on the right-hand side, the band Kraftklub has some bars in red, orange, red, and shading out to blue. When focusing on Kraftklub by doing artist: Kraftklub in the search bar on top, it immediately tells me: September 30th, 2022 Actual 3.15 hours instead of 3.75 minutes . For me, that means that I regularly listen to roughly one song of Kraftklub per day, on this day I listened to over 3 hours of Kraftklub. That is clearly an anomaly. What could have triggered such a listening behavior? The concert was a bit far off, it was on the 19th of November, 2022. Maybe a new album that came out? We can actually spot that by clicking on the anomaly and selecting the View in Discover from the dropdown. Once we are in Discover, we can do additional visualizations and breakdowns to investigate this further. We will pull up the title , album and spotify_metadata.album.release_date to see if a new album came out on that day. We can immediately see that on 22nd September 2022, the album KARGO was released. 8 days later, it appears that I took an interest and started listening to it. What else can we find? Maybe something seasonal? Let's zoom into Fred Again.. which I listen to a lot (as you can tell from blog number two) . There are roughly 10 days back to back as an anomaly. on average, I listened roughly an hour per day to Fred Again. I know that Fred Again.. probably didn't release an album during that time. ES|QL will help us in figuring out more details. When switching to ES|QL, the time picker value will be kept, but any filters in the search bar will be removed. The first thing we need to do is to add that back. The next thing I want to know is how many albums I listened to and whether any were released near those days. We perform a simple count to get the count of records. The values allow us to retrieve the value of the document and not perform any aggregation on it, and we split those up by the album name. I cannot spot any release date near the anomaly days. 
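As a rough sketch of that query (the index name spotify-history is an assumption, standing in for whatever backs the Spotify-History data view, and the field paths come from the fields listed above), it can be issued from the Python client like this:

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='<api-key>')

# Keep the artist filter, count records per album and pull up the release date.
esql = (
    'FROM spotify-history '
    '| WHERE artist == \"Fred Again..\" '
    '| STATS listens = COUNT(*), release_date = VALUES(spotify_metadata.album.release_date) BY album '
    '| SORT listens DESC'
)

result = es.esql.query(query=esql)
for row in result['values']:
    print(row)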
My \"head date math\" is not always on point, so let's add a difference in days from the release date to the first listen date (during this anomaly) as it is quite clear that an album release did not trigger this anomaly. Single metric A single metric job is a type of anomaly detection job that tries to find unusual patterns in a single metric. The metric is defined by the field you select. So, what could be an interesting single metric? Let's use the listened_to_pct . This tells us how much of a song I complete before I skip to the next one. This is quite intriguing—let’s see if there are certain days when I skip more than others. 1. Go to the Machine Learning app in Kibana. 2. Click on \"Create job.\" 3. Select the Spotify-History data view. 4. Select Single Metric. 5. Select a proper timeframe, for me that is from 1st January 2017 to 31st December 2024. Now it gets tricky, do we use mean(), high_mean(), low_mean() ? Well, it depends on what we want to anomaly on. Mean will give you anomalies for low values such as 0, as well for high. High mean on the other hand is more on the high side, meaning that if the listening completion drops to 0 for a couple of days, it won't trigger an anomaly. Mean and low mean would. High mean is often useful when you want to detect spikes in your data. You don't want an anomaly if your service that is processing data is fast, you don't care if it finishes in 1 ms. But If it takes 10 ms, you want an anomaly. In this case, I guess we should try mean() and see where it takes us. 6. Don't forget to set the bucket to 1 day. 7. Click on next , give it a valid name and click next until it says create job . Ensure that the toggle for Start immediately is turned on. Click create job . It will start immedaitely. Here are the results: It's fascinating to see that on 18th August 2024, I only listened to songs for ~3% of their total duration on average. Usually, I listen to nearly 70% of the song before pressing the next button. All in all, I would say that mean() is a good choice for this metric. Multi metric Now, I want to figure out if I have a single song that spikes within an artist. To do that, we can leverage a multi metric job. 1. Go to the Machine Learning app in Kibana. 2. Click on \"Create job.\" 3. Select the Spotify-History data view. 4. Select Multi Metric. 5. Select a proper timeframe, for me that is from 1st January 2017 to 31st December 2024. 6. Select the distinct count(title) and split by artist , add artist, title, album to the influencers. 7. Don't forget the bucket span to 1d. It might give you a warning about high memory usage because instead of modeling everything in one large model, it now creates a single model for each artist. This can be quite memory intensive, so be careful. Go into the same anomaly detection table view and pick any album at random. I chose Live in Paris from Meute . At first glance, that's super interesting and it shows how accurate anomaly detection can be. I have the song You & Me from the album Live in Paris in my liked songs as well as roughly 10 other songs from different albums. I actively listened to the Live in Paris album on the 27th, 28th, 29th and 30th of December 2024. Conclusion In this blog, we dived into the rabbit hole of anomaly detection and what that can entail. It might be daunting at first, but it's not that complicated once you get a hang of it and it can also provide really good and quick insights. 
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. In this blog post, we may have used or referred to third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch, ESRE, Elasticsearch Relevance Engine and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is an anomaly detection job? Population job Single metric Multi metric Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Anomaly detection population jobs: Spotify Wrapped, part 3 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-anomaly-detection-jobs", + "meta_description": "Learn about anomaly detection jobs Elasticsearch, like population, single metric & multi metric jobs, and how to use them to uncover unusual patterns in your data." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) Microsoft and Elastic partner to add Elasticsearch (preview) as an officially supported vector database and retrieval augmentation technology for Azure OpenAI On Your Data, enabling users to build chat experiences with advanced AI models grounded by enterprise data. Generative AI AT By: Aditya Tripathi On March 26, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Microsoft and Elastic are thrilled to announce that Elasticsearch, the world's most downloaded vector database is an officially supported vector store and retrieval augmented search technology for Azure OpenAI Service On Your Data in public preview. The groundbreaking feature empowers you to leverage the power of OpenAI models, such as GPT-4, and incorporates the advanced capabilities of RAG (Retrieval Augmented Generation) model, directly on your data with enterprise-grade security on Azure. Read the announcement from Microsoft here . Azure OpenAI Service On Your Data makes conversational experiences come alive for your employees, customers and users. With the addition of Elasticsearch vector database and vector search technology, LLMs are enriched by your business data, and conversations deliver superior quality responses out-of-the-box. All of this adds up to helping you better understand your data, and make more informed decisions. Build powerful conversational chat experiences, fast Business users, such as users on e-commerce teams, product managers, and others can add documents from an Elasticsearch index to build a conversational chat experience very quickly. All it takes is a few simple steps to configure the chat experience with parameters such as message history, and you're good to go! Customers can realize benefits pretty much right away.. Quickly roll out conversational experiences to your users, customers, or employees--backed by context from your business data Common use cases include offering internal knowledge search, users self-service, or chatbots that help process common business workflows How Elasticsearch vector database works with On Your Data The new native experience within Azure OpenAI Studio makes adding an Elastic index a simple matter. Developers can pick Elasticsearch as their chosen vector database option from the drop-down menu.. You can bring your existing Elasticsearch indexes to On Your Data—whether those indexes live on Azure or on-prem. Just select Elasticsearch as your data source, add your Elastic endpoint and API key, add an Elastic index, and you're all set! With the Elasticsearch vector database running in the background, users get all the Elastic advantages you'd expect. 
Precision of BM25 (text) search, the semantic understanding of vector search, and the best of both worlds with hybrid search Document and field level security, so users can only access information they're entitled to based on their permissions Filters, facets, and aggregations that add a real boost to how quickly relevant context is pulled from your organisation's data, and sent to an LLM Choice of leveraging a range of large language model providers, including Azure OpenAI, Hugging Face, or other 3rd party models Elastic on Microsoft Azure: a proven combination Elastic is a proud winner of the Worldwide Microsoft Partner of the Year award for Commercial Marketplace. Elastic and Microsoft customers have been using Elasticsearch and Azure OpenAI to build futuristic search experiences, that leverage the best of AI and machine learning, today . Ali Dalloul, VP, Azure AI Customer eXperience Engineering had this to say about the collaboration, \"By harnessing the power of Azure Cloud and OpenAI, Elastic is driving the development of AI-driven solutions that redefine customer experiences. This partnership is more than just a collaboration; it's a feedback loop of innovation, benefiting customers, Elastic, and Microsoft, while empowering the broader partner ecosystem. We're delighted to offer customers Elasticsearch's strong vector database and retrieval augmentation capabilities to store and search vector embeddings for On Your Data.\" \"This really helps customers connect data wherever it lives. We are happy to open the spectrum of building conversational AI solutions, agnostic to location, including Elasticsearch. We are excited to see how developers build upon this integration.\" Adds Pavan Li, Principal Product Manager of Azure OpenAI Service On Your Data. Elastic's clear strengths in hybrid search--combining BM25/text search with vector search for semantic relevance, was an important differentiator. With the backing of the open source Apache Lucene community, Elastic's vector database has already been widely adopted by large companies for enterprise scale use cases. Try On Your Data with Elasticsearch vector database today Unlock the insights with conversational AI, using Elasticsearch and Azure OpenAI On Your Data today! Visit Azure OpenAI Studio to build your first conversational copilot Connect Elasticsearch with OpenAI models Read more on the Microsoft Tech Community blog Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. 
JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Build powerful conversational chat experiences, fast How Elasticsearch vector database works with On Your Data Elastic on Microsoft Azure: a proven combination Try On Your Data with Elasticsearch vector database today Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/azure-openai-on-your-data-elasticsearch-vector-database", + "meta_description": "Microsoft and Elastic partner to add Elasticsearch (preview) as an officially supported vector database and retrieval augmentation technology for Azure OpenAI On Your Data, enabling users to build chat experiences with advanced AI models grounded by enterprise data." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Advanced integration tests with real Elasticsearch Mastering advanced Elasticsearch integration testing: Faster, smarter, and optimized. How To PP By: Piotr Przybyl On January 31, 2025 Part of Series Integration tests using Elasticsearch Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the previous post on integration testing , we covered shortening the execution time of integration tests relying on real Elasticsearch by changing the approach to data initialization strategies. In this installment, we're about to shorten the test suite duration even further, this time by applying advanced techniques to the Docker container running Elasticsearch and Elasticsearch itself. Please note that the techniques described below can often be cherry-picked: you can choose what makes the most sense in your specific case. Here be dragons: The trade-offs Before we delve into the ins and outs of various approaches in pursuit of performance, it's important to understand that not every optimization should always be applied. While they tend to improve things, they can also make the setup more obscure, especially to an untrained eye. 
In other words, in the following sections, we're not going to change anything within the tests; only the \"infrastructure code around\" is going to be redesigned. These changes can make the code more difficult to understand for less-experienced team members. Using the techniques described below is not rocket science, but some caution is advised, and experience is recommended. Snapshots When we left off our demo code, we were still initializing Elasticsearch with data for every test . This approach has some advantages, especially if our dataset differs between test cases, e.g., we index somewhat different documents sometimes. However, if all our test cases can rely on the same dataset, we can use the snapshot-and-restore approach. It's helpful to understand how snapshot and restore work in Elasticsearch, which is explained in the official documentation . In our approach, instead of handling this via the CLI or the DevOps method, we will integrate it into the setup code around our tests. This ensures smooth test execution on developer machines as well as in CI/CD. The idea is quite simple: instead of deleting indices and recreating them from scratch before each test, we: Create a snapshot in the container's local file system (if it doesn't already exist, as this will become necessary later). Restore the snapshot before each test. Prepare snapshot location One important thing to note – which makes Elasticsearch different from many relational databases – is that before we send a request to create a snapshot, we first need to register a location where the snapshots can be stored, the so-called repository. There are many storage options available (which is very handy for cloud deployments); in our case, it's enough to keep them in a local directory inside the container. Note: The /tmp/... location used here is suitable only for volatile integration tests and should never be used in a production environment. In production, always store snapshots in a location that is safe and reliable for backups. To avoid the temptation of storing backups in an unsafe location, we first add this to our test: Next, we configure the ElasticsearchContainer to ensure it can use this location as a backup location: Change the setup Now we're ready to append the following logic to our @BeforeAll method: And our @BeforeEach method should start with: Checking if the snapshot exists can be done by verifying that the REPO_LOCATION directory exists and contains some files: The setupDataInContainer() method has minor changes: it's no longer called in @BeforeEach (we execute it on demand when needed), and the DELETE books request can be removed (as it is no longer necessary). To create a snapshot, we first need to register a snapshot location and then store any number of snapshots there (although we'll keep only one, as the tests don't require more): Once the snapshot is created, we can restore it before each test as follows: Please note the following: Before restoring an index, it cannot exist, so we must delete it first. If you need to delete multiple indices, you can do so in a single curl call, e.g., \"https://localhost:9200/indexA,indexB\" . To chain several commands in a container, you don't need to wrap them in separate execInContainer calls; running a simple script can improve readability (and reduce some network round-trips). In the example project, this technique shortened my build time to 26 seconds. 
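The article wires this into the Java test setup, but the underlying Elasticsearch call is language-agnostic. As a minimal illustration (repository name, path and credentials are placeholders), registering such a file-system repository is a single request:

import requests

ES = 'https://localhost:9200'
AUTH = ('elastic', '<password>')  # placeholder test credentials

# Register a file-system snapshot repository pointing at a directory inside the container.
resp = requests.put(
    f'{ES}/_snapshot/books-repo',
    json={'type': 'fs', 'settings': {'location': '/tmp/es-snapshots'}},
    auth=AUTH,
    verify=False,  # test-only: the container's certificate is self-signed
)
resp.raise_for_status()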
While this might not seem like a significant gain at first glance, the approach is a universal technique that can be applied before, or even instead of, switching to _bulk ingestion (discussed in the previous post). In other words, you can prepare data for your tests in @BeforeAll in any way and then make a snapshot of it to use in @BeforeEach . If you want to maximize efficiency, you can even copy the snapshot back to the testing machine using elasticsearch.copyFileFromContainer(...) , allowing it to serve as a form of cache that is only purged when you need to update the dataset (e.g., for new features to test). For a complete example, check out the tag snapshots . RAM the data Sometimes, our test cases are noticeably data-heavy, which can negatively impact performance, especially if the underlying storage is slow. If your tests need to read and write large amounts of data, and the SSD or even hard drive is painfully slow, you can instruct the container to keep the data in RAM – provided you have enough memory available. This is essentially a one-liner, requiring the addition of .withTmpFs(Map.of(\"/usr/share/elasticsearch/data\", \"rw\")) to your container definition. The container setup will look like this: The slower your storage is, the more significant the performance improvement will be, as Elasticsearch will now write to and read from a temporary file system in RAM. Note: As the name implies, this is a temporary file system, meaning it is not persistent. Therefore, this solution is suitable only for tests. Do not use this in production, as it could lead to data loss. To assess how much this solution can improve performance on your hardware, you can try the tag tmpfs . More work, same time The size of a product's codebase grows the most during the active development phase. Then, when it moves into a maintenance phase (if applicable), it usually involves just bug fixes. However, the size of the test base grows continuously, as both features and bugs need to be covered by tests to prevent regressions. Ideally, a bug fix should always be accompanied by a test to prevent the bug from reappearing. This means that even when development is not particularly active, the number of tests will keep growing. The approach described in this section provides hints on how to manage a growing test base without significantly increasing test suite duration, provided sufficient resources are available to enable parallelization. Let's assume, for simplicity, that the number of test cases in our example has doubled (rather than writing additional tests, we will copy the existing ones for this demo). In the simplest approach, we could add three more @Test methods to the BookSearcherIntTest class. We can then observe CPU and memory consumption by using, in a somewhat unorthodox way, one of Java's profilers: Java Flight Recorder. Since we added it to our POM , after running the tests, we can open recording-1.jfr in the main directory. The results may look like this in Environment -> Processes : As you can see, running six tests in a single class doubled the time required. Additionally, the predominant color in the CPU usage chart above is... no color at all, as CPU utilization barely reaches 20% during peak moments. Underutilizing your CPU is wasteful when you’re paying for usage time (whether to cloud providers or in terms of your own wall clock time to get meaningful feedback). Chances are, the CPU you’re using has more than one core. 
The optimization here is to split the workload into two parts, which should roughly halve the duration. To achieve this, we move the newly added tests into another class called BookSearcherAnotherIntTest and instruct Maven to run two forks for testing using -DforkCount=2 . The full command becomes: With this change, and using JFR and Java Mission Control, we observe the following: Here, the CPU is utilized much more effectively. This example should not be interpreted with a focus on exact numbers. Instead, what matters is the general trend, which applies not only to Java: Check whether your CPU is being properly utilized during tests. If not, try to parallelize your tests as much as possible (though other resources might sometimes limit you). Keep in mind that different environments may require different parallelization factors (e.g., -DforkCount=N in Maven). It’s better to avoid hardcoding these factors in the build script and instead tune them per project and environment: This can be skipped for developer machines if only a single test class is being run. A lower number might suffice for less powerful CI environments. A higher number might work well for more powerful CI setups. For Java, it’s important to avoid having one large class and instead divide tests into smaller classes as much as it makes sense. Different parallelization techniques and parameters apply to other technology stacks, but the overarching goal remains to fully utilize your hardware resources. To refine things further, avoid duplicating setup code across test classes. Keep the tests themselves separate from infrastructure/setup code. For instance, configuration elements like the image version declaration should be maintained in one place. In Testcontainers for Java, we can use (or slightly repurpose) inheritance to ensure that the class containing infrastructure code is loaded (and executed) before the tests. The structure would look like this: For a complete demo, refer again to the example project on GitHub. Reuse - Start once and once only The final technique described in this post is particularly useful for developer machines. It may not be suitable for traditional CIs (e.g., Jenkins hosted in-house) and is generally unnecessary for ephemeral CI environments (like cloud-based CIs, where build machines are single-use and decommissioned after each build). This technique relies on a preview feature of Testcontainers, known as reuse . Typically, containers are cleaned up automatically after the test suite finishes. This default behavior is highly convenient, especially in long-running CIs, as it ensures no leftover containers regardless of the test results. However, in certain scenarios, we can keep a container running between tests so that subsequent tests don’t waste time starting it again. This approach is especially beneficial for developers working on a feature or bug fix over an extended period (sometimes days), where the same test (class) is run repeatedly. How to enable reuse Enabling reuse is a two-step process: 1. Mark the container as reusable when declaring it: 2. Opt-in to enable the reuse feature in the environments where it makes sense (e.g., on your development machine). The simplest and most persistent way to do this on a developer workstation is by ensuring that the configuration file in your $HOME directory has the proper content. In ~/.testcontainers.properties , include the following line: That’s all! On first use, tests won’t be any faster because the container still needs to start. 
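The one-line content of ~/.testcontainers.properties did not survive extraction here; per the Testcontainers documentation, the opt-in property is:

```
testcontainers.reuse.enable=true
```

This pairs with marking the container itself as reusable (step 1 above); without both, containers are cleaned up after the test suite as usual.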
However, after the initial test: Running docker ps will show the container still running (this is now a feature, not a bug). Subsequent tests will be faster. Note: Once reuse is enabled, stopping the containers manually becomes your responsibility. Leveraging reuse with snapshots or init data The reuse feature works particularly well in combination with techniques like copying initialization data files to the container only once or using snapshots. With reuse enabled, there’s no need to recreate snapshots for subsequent tests, saving even more time. All the pieces of optimization start falling into place. Reuse forked containers While reuse works well in many scenarios, issues arise when combining reuse with multiple forks during the second run. This can result in errors or gibberish output related to containers or Elasticsearch being in an improper state. If you wish to use both improvements simultaneously (e.g., running many integration tests on a powerful workstation before submitting a PR), you’ll need to make an additional adjustment. The problem The issue may manifest itself in errors like the following: This happens due to how Testcontainers identifies containers for reuse. When both forks start and no Elasticsearch containers are running, each fork initializes its own container. Upon restarting, however, each fork looks for a reusable container and finds one. Because all containers look identical to Testcontainers, both forks may select the same container. This results in a race condition, where more than one fork tries to use the same Elasticsearch instance. For example, one fork may be reinstating a snapshot while the other is attempting to do the same, leading to errors like the one above. The solution To resolve this, we need to introduce differentiation between containers and ensure that forks select containers deterministically based on these differences. Step 1: Update pom.xml Modify the Surefire configuration in your pom.xml to include the following: This adds a unique identifier ( fork_${surefire.forkNumber} ) for each fork as an environment variable. Step 2: Modify container declaration Adjust the Elasticsearch container declaration in your code to include a label based on the fork identifier: The effect These changes ensure that each fork creates and uses its own container. The containers are slightly different due to the unique labels, allowing Testcontainers to assign them deterministically to specific forks. This approach eliminates the race condition, as no two forks will attempt to reuse the same container. Importantly, the functionality of Elasticsearch within the containers remains identical, and tests can be distributed between the forks dynamically without affecting the outcome. Was it really worth it? As warned at the beginning of this post, the improvements introduced here should be applied with caution, as they make the setup code of our tests less intuitive. What are the benefits? We started this post with three integration tests taking around 25 seconds on my machine. After applying all the improvements together and doubling the number of actual tests to six, the execution time on my laptop dropped to 8 seconds. Doubled the tests; shortened the build by two-thirds. It's up to you to decide if it makes sense for your case. ;-) It doesn't stop here This miniseries on testing with real Elasticsearch ends here. In part one we discussed when it makes sense to mock Elasticsearch index and when it's a better idea to go for integration tests. 
In part two , we have addressed the most common mistakes that make your integration tests slow. This part three goes the extra mile to make integration tests run even faster, in seconds instead of minutes. There are more ways to optimize your experience and reduce costs associated with integration tests of systems using Elasticsearch. Don’t hesitate to explore these possibilities and experiment with your tech stack. If your case involves any of the techniques mentioned above, or if you have any questions, feel free to reach out on our Discuss forums or community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Here be dragons: The trade-offs Snapshots Prepare snapshot location Change the setup RAM the data Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Advanced integration tests with real Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-improve-performance-integration-tests", + "meta_description": "Here's how to master advanced Elasticsearch integration testing: We'll explain how to make integration tests run faster, in seconds instead of minutes." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Ingest geospatial data into Elasticsearch with Kibana for use in ES|QL How to use Kibana and the csv ingest processor to ingest geospatial data into Elasticsearch for use with search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. 
But to use these features, we need Geospatial data. How To CT By: Craig Taverner On October 25, 2024 Part of Series Elasticsearch geospatial search Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We recently published a blog describing how to use the new geospatial search features in ES|QL , Elasticsearch's new, powerful piped query language . To use these features, you need to have geospatial data in Elasticsearch. So in this blog, we'll show you how to ingest geospatial data, and how to use it in ES|QL queries. Importing geospatial data using Kibana The data we used for the examples in the previous blog were based on data we use internally for integration tests. For your convenience we've, included it here in the form of a few CSV files that can easily be imported using Kibana. The data is a mix of airports, cities, and city boundaries. You can download the data from: airports.csv This contains a merger of three datasets: Airports (names, locations and related data) from Natural Earth City locations from SimpleMaps Airport elevations from The global airport database airport_city_boundaries.csv This contains a merger of airport and city names from above with one new source: City boundaries from OpenStreetMap As you can guess, we spent some time combining these data sources into the two files above, with the goal of being able to test the geospatial features of ES|QL. This might not be quite the same as your specific data needs, but hopefully this gives you an idea of what is possible. In particular we want to demonstrate a few interesting things: Importing data with geospatial fields together with other indexable data Importing both geo_point and geo_shape data and using them together in queries Importing data into two indexes that can be joined using a spatial relationship Creating an ingest pipeline for facilitating future imports (beyond Kibana) Some examples of ingest processors, like csv , convert and split While we'll be discussing working with CSV data in this blog, it is important to understand there are several ways to add geo data using Kibana . Within the Map application you can upload delimited data like CSV, GeoJSON and ESRI ShapeFiles and you can also draw shapes directly in the Map. For this blog we'll focus on importing CSV files from the Kibana home page. Importing the airports The first file, airports.csv , has some interesting quirks we need to deal with. Firstly the columns have additional whitespace separating them, not typical of CSV files. Secondly, the type field is a multi-value field, which we need to split into separate fields. Finally, some fields are not strings, and need to be converted to the right type. All this can be done using Kibana's CSV import facility. Start at the Kibana home-page. There is a section called \"Get started by adding integrations\", which has a link called \"Upload a file\": Click on this link, and you will be taken to the \"Upload file\" page. Here you can drag and drop the airports.csv file, and Kibana will analyse the file and present you with a preview of the data. It should have automatically detected the delimiter as a comma, and the first row as the header row. 
However, it probably did not trim the extra whitespace between the columns, nor determined the types of the fields, assuming all fields are either text or keyword . We need to fix this. Click Override settings and check the checkbox for Should trim fields , and Apply to close the settings. Now we need to fix the types of the fields. This is available on the next page, so go ahead and click Import . First choose an index name, and then select Advanced to get to the field mappings and ingest processor page. Here we need to make changes to both the field mappings for the index, as well as the ingest pipeline for importing the data. Firstly, while Kibana likely auto-detected the scalerank field as long , it mistakenly perceived the location and city_location fields as keyword . Edit them to geo_point , ending up with mappings that look something like: You have some flexibility here, but note that what type you choose will affect how the field is indexed and what kind of queries are possible. For example, if you leave location as keyword you cannot perform any geospatial search queries on it. Similarly, if you leave elevation as text you cannot perform numerical range queries on it. Now it's time to fix the ingest pipeline. If Kibana auto-detected scalerank as long above, it will also have added a processor to convert the field to a long . We need to add a similar processor for the elevation field, this time converting it to double . Edit the pipeline to ensure you have this conversion in place. Before saving this, we want one more conversion, to split the type field into multiple fields. Add a split processor to the pipeline, with the following configuration: The final ingest pipeline should look like: Note that we did not add a convert processor for the location and city_location fields. This is because the geo_point type in the field mapping already understands the WKT format of the data in these fields. The geo_point type can understand a range of formats, including WKT, GeoJSON, and more . If we had, for example, two columns in the CSV file for latitude and longitude , we would have needed to add either a script or a set processor to combine these into a single geo_point field (eg. \"set\": {\"field\": \"location\", \"value\": \"{{lat}},{{lon}}\"} ). We are now ready to import the file. Click Import and the data will be imported into the index with the mappings and ingest pipeline we just defined. If there are any errors ingesting the data, Kibana will report them here, so you can either edit the source data or the ingest pipeline and try again. Notice that a new ingest-pipeline has been created. This can be viewed by going to the Stack Management section of Kibana, and selecting Ingest pipelines . Here you can see the pipeline we just created, and edit it if necessary. In fact the Ingest pipelines section can be used for creating and testing ingest pipelines, a very useful feature if you plan to do even more complex ingests. If you want to explore this data immediately skip down to the later sections, but if you want to import the city boundaries as well, continue reading. Importing the city boundaries The city boundaries file available at airport_city_boundaries.csv is a bit simpler to import than the previous example. It contains a city_boundary field that is a WKT representation of the city boundary as a POLYGON , and a city_location field that is a geo_point representation of the city location. 
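(Returning to the airports import for a moment.) The mapping and pipeline snippets referenced above were lost in extraction; the sketch below shows roughly what they amount to as explicit REST payloads, sent here with Python's requests library. Field names and types follow the article; the index creation is what Kibana does for you in the import wizard, and the pipeline id and the split separator are assumptions.

```python
import requests

ES = "http://localhost:9200"  # adjust host and auth for your deployment

# Mapping edits described above: scalerank as long, elevation as double,
# location and city_location as geo_point (WKT values are parsed natively).
requests.put(f"{ES}/airports", json={
    "mappings": {
        "properties": {
            "abbrev": {"type": "keyword"},
            "name": {"type": "text"},
            "scalerank": {"type": "long"},
            "type": {"type": "keyword"},
            "location": {"type": "geo_point"},
            "country": {"type": "keyword"},
            "city": {"type": "keyword"},
            "city_location": {"type": "geo_point"},
            "elevation": {"type": "double"},
        }
    }
}).raise_for_status()

# Ingest pipeline described above: convert the numeric fields and split the
# multi-valued "type" field. The separator is an assumption.
requests.put(f"{ES}/_ingest/pipeline/airports-csv", json={
    "description": "Illustrative pipeline for the airports.csv import",
    "processors": [
        {"convert": {"field": "scalerank", "type": "long", "ignore_missing": True}},
        {"convert": {"field": "elevation", "type": "double", "ignore_missing": True}},
        {"split": {"field": "type", "separator": ":", "ignore_missing": True}},
    ],
}).raise_for_status()
```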
We can import this data in a similar way to the airports data, but with a few differences: We needed to select the override setting Has header row since that was not autodetected We did not need to trim fields, as the data was already clean of extra whitespace We did not need to edit the ingest pipeline because all types were either string or spatial types We did, however, have to edit the field mappings to set the city_boundary field to geo_shape and the city_location field to geo_point Our final field mappings looked like: As with the airports.csv import before, simply click Import to import the data into the index. The data will be imported with the mappings we edited and ingest pipeline that Kibana defined. Exploring geospatial data with dev-tools In Kibana it is usual to explore the indexed data with \"Discover\". However, if your intention is to write your own app using ES|QL queries, it might be more interesting to try access the raw Elasticsearch API. Kibana has a convenient console for experimenting with writing queries. This is called the Dev Tools console, and can be found in the Kibana side-bar. This console talks directly to the Elasticsearch cluster, and can be used to run queries, create indexes, and more. Try the following: This should provide the following results: distance abbrev name location country city elevation 273418.05776847183 HAM Hamburg POINT (10.005647830925 53.6320011640866) Germany Norderstedt 17.0 337534.653466062 TXL Berlin-Tegel Int'l POINT (13.2903090925074 52.5544287044101) Germany Hohen Neuendorf 38.0 483713.15032266214 OSL Oslo Gardermoen POINT (11.0991032762581 60.1935783171386) Norway Oslo 208.0 522538.03148094116 BMA Bromma POINT (17.9456175406145 59.3555902065112) Sweden Stockholm 15.0 522538.03148094116 ARN Arlanda POINT (17.9307299016916 59.6511203397372) Sweden Stockholm 38.0 624274.8274399083 DUS Düsseldorf Int'l POINT (6.76494446612174 51.2781820420774) Germany Düsseldorf 45.0 633388.6966435644 PRG Ruzyn POINT (14.2674849854076 50.1076511703671) Czechia Prague 381.0 635911.1873311149 AMS Schiphol POINT (4.76437693232812 52.3089323889822) Netherlands Hoofddorp -3.0 670864.137958866 FRA Frankfurt Int'l POINT (8.57182286907608 50.0506770895207) Germany Frankfurt 111.0 683239.2529970079 WAW Okecie Int'l POINT (20.9727263383587 52.171026749259) Poland Piaseczno 111.0 Visualizing geospatial data with Kibana Maps Kibana Maps is a powerful tool for visualizing geospatial data. It can be used to create maps with multiple layers, each layer representing a different dataset. The data can be filtered, aggregated, and styled in various ways. In this section, we will show you how to create a map in Kibana Maps using the data we imported in the previous section. In the Kibana menu, navigate to Analytics -> Maps to open a new map view. Click on Add Layer and select Documents , choosing the data view airports and then editing the layer style to color the markers using the elevation field, so we can easily see how high each airport is. Click 'Keep changes' to save the Map: Now add a second layer, this time selecting the airport_city_boundaries data view. This time, we will use the city_boundary field to style the layer, and set the fill color to a light blue. This will show the city boundaries on the map. Make sure to reorder the layers to ensure that the airport markers are on top. Spatial joins ES|QL does not support JOIN commands, but you can achieve a special case of a join using the ENRICH command . 
This command operates akin to a 'left join' in SQL, allowing you to enrich results from one index with data from another index based on a spatial relationship between the two datasets. For example, let's enrich the results from a table of airports with additional information about the city they serve by finding the city boundary that contains the airport location, and then perform some statistics on the results: If you run this query without first preparing the enrich index, you will get an error message like: This is because, as we mentioned before, ES|QL does not support true JOIN commands. One important reason for this is that Elasticsearch is a distributed system, and joins are expensive operations that can be difficult to scale. However, the ENRICH command can be quite efficient, because it makes use of specially prepared enrich indexes that are duplicated across the cluster, enabling local joins to be performed on each node. To better understand this, let's focus on the ENRICH command in the query above: This command instructs Elasticsearch to enrich the results retrieved from the airports index, and perform an intersects join between the city_location field of the original index, and the city_boundary field of the airport_city_boundaries index, which we used in a few examples earlier. But some of this information is not clearly visible in this query. What we do see is the name of an enrich policy, city_boundaries , and the missing information is encapsulated within that policy definition. Here we can see that it will perform a geo_match query ( intersects is the default), the field to match against is city_boundary , and the enrich_fields are the fields we want to add to the original document. One of those fields, region , was actually used as the grouping key for the STATS command, something we could not have done without this 'left join' capability. For more information on enrich policies, see the enrich documentation . The enrich indexes and policies in Elasticsearch were originally designed for enriching data at index time, using data from another prepared enrich index. In ES|QL, however, the ENRICH command works at query time, and does not require the use of ingest pipelines. This effectively makes it quite similar to an SQL LEFT JOIN , except you cannot join any two indexes, only a normal index on the left with a specially prepared enrich index on the right. In either case, whether for ingest pipelines or use in ES|QL, it is necessary to perform a few preparatory steps to set up the enrich index and policy. We already imported the airport_city_boundaries index above, but this is not directly usable as an enrich index in the ENRICH command. We first need to perform two steps: Create the enrich policy described above to define the source index, the field in the source index to match against, and the fields to return once matched. Execute this policy to create the enrich index. This will build a special internal index by reading the original source index into a more efficient data structure, which is copied across the cluster. The enrich policy is created with one command and then executed with a second; both are shown in the sketch below. Note that if you ever change the contents of the airport_city_boundaries index, you will need to re-execute this policy to see the changes reflected in the enrich index. 
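The two commands themselves were dropped in extraction. Here is roughly what they look like, again as plain REST calls via Python's requests library; the policy name, source index and match field come from the article, while the exact enrich_fields list beyond region and city_boundary is an assumption.

```python
import requests

ES = "http://localhost:9200"  # adjust host and auth for your deployment

# Create the geo_match enrich policy described above.
requests.put(f"{ES}/_enrich/policy/city_boundaries", json={
    "geo_match": {
        "indices": "airport_city_boundaries",
        "match_field": "city_boundary",
        # assumed set of fields to copy onto matching documents
        "enrich_fields": ["city", "airport", "region", "city_boundary"],
    }
}).raise_for_status()

# Execute the policy to build the internal enrich index used by ENRICH.
requests.post(f"{ES}/_enrich/policy/city_boundaries/_execute").raise_for_status()
```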
Now let's run the original ES|QL query again: This returns the top 5 regions with the most airports, along with the centroid of all the airports that have matching regions, and the range in length of the WKT representation of the city boundaries within those regions: centroid count region POINT (-12.139086859300733 31.024386116624648) 126 null POINT (-83.10398317873478 42.300230911932886) 3 Detroit POINT (39.74537850357592 47.21613017376512) 3 городской округ Батайск POINT (-156.80986787192523 20.476673701778054) 3 Hawaii POINT (-73.94515332765877 40.70366442203522) 3 City of New York POINT (-83.10398317873478 42.300230911932886) 3 Detroit POINT (-76.66873019188643 24.306286952923983) 2 New Providence POINT (-3.0252167768776417 51.39245774131268) 2 Cardiff POINT (-115.40993484668434 32.73126147687435) 2 Municipio de Mexicali POINT (41.790108773857355 50.302146775648) 2 Центральный район POINT (-73.88902732171118 45.57078813901171) 2 Montréal You may also notice that the most commonly found region was null . What could this imply? Recall that I likened this command to a 'left join' in SQL, meaning if no matching city boundary is found for an airport, the airport is still returned but with null values for the fields from the airport_city_boundaries index. It turns out there were 125 airports that found no matching city_boundary , and one airport with a match where the region field was null . This lead to a count of 126 airports with no region in the results. If your use case requires that all airports can be matched to a city boundary, that would require sourcing additional data to fill in the gaps. It would be necessary to determine two things: which records in the airport_city_boundaries index do not have city_boundary fields which records in the airports index do not match using the ENRICH command (ie. do not intersect) Using ES|QL for geospatial data in Kibana Maps Kibana has added support for Spatial ES|QL in the Maps application. This means that you can now use ES|QL to search for geospatial data in Elasticsearch, and visualize the results on a map. There is a new layer option in the add layers menu, called \"ES|QL\". Like all of the geospatial features described so far, this is in \"technical preview\". Selecting this option allows you to add a layer to the map based on the results of an ES|QL query. For example, you could add a layer to the map that shows all the airports in the world. Or you could add a layer that shows the polygons from the airport_city_boundaries index, or even better, how about that complex ENRICH query above that generates statistics for how many airports are in each region? What's next The previous Geospatial search blog focused on the use of functions like ST_INTERSECTS to perform searching, available in Elasticsearch since 8.14. And this blog shows you how to import the data we used for those searches. However, Elasticsearch 8.15 came with a particularly interesting function: ST_DISTANCE which can be used to perform efficient spatial distance searches, and this will be the topic of the next blog! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Importing geospatial data using Kibana Importing the airports Importing the city boundaries Exploring geospatial data with dev-tools Visualizing geospatial data with Kibana Maps Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Ingest geospatial data into Elasticsearch with Kibana for use in ES|QL - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/geospatial-data-ingest-for-esql", + "meta_description": "Here's how to ingest geospatial data into Elasticsearch using Kibana. This blog also covers how to use ES|QL to search and visualize geospatial data. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Build RAG quickly with minimal code in Elastic 8.15 Learn how to build an end-to-end RAG pipeline with the S3 Connector, semantic_text datatype, and Elastic Playground. How To HC By: Han Xiang Choong On September 4, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elastic 8.15 is out, and Semantic Search is easier than ever to pull off. We're going to cover how to accomplish all of these tasks in 15 minutes: Store your documents in some data storage service like an AWS S3 Bucket Set up an Elastic S3 Connector Upload an embedding model using the eland library , set-up an inference API in Elastic Connect that to an index that uses the semantic_text datatype Add your inference API to that index Configure and sync content with the S3 Connector Use the Elastic Playground immediately You will need: An Elastic Cloud Deployment updated to Elastic 8.15 An S3 bucket An LLM API service (Anthropic, Azure, OpenAI, Gemini) And that's it! 
Let's get this done. Collecting data To follow along with this specific demo, I've uploaded a zip file containing the data used here . It's the first 60 or so pages of the Silmarillion , each as a separate pdf file. I'm going through a Lord of the Rings kick at the moment. Feel free to download it and upload it to your S3 bucket! Splitting the document into individual pages is sometimes necessary for large documents, as the native Elastic S3 Connector will not ingest content from files over 10MB in size. I use this Python script for splitting a PDF into individual pages: Setting up the S3 connector The connector can ingest a huge variety of data types . Here, we're sticking to an S3 bucket loaded with pdf pages. My S3 Bucket I'll just hop on my Elastic Cloud deployment, go to Search->Content->Connectors, and make a new connector called aws-connector, with all the default settings. Then I'll open up the configuration and add the name of my bucket, and the secret key and access key tagged to my AWS user. Elastic Cloud S3 Connector Configuration Run a quick sync to verify that everything is working okay. Synchronization will ingest every uningested file in your data source, extract its content, and store it as a unique document within your index. Each document will contain its original filename. Data source documents with the same filenames as existing indexed documents won't be reingested, so have no fear! Synchronization can also be regularly scheduled. The method is described in the documentation. If everything is working fine, assuming my AWS credentials and permissions are all in order, the data's going to go into an index called aws-connector. First successful sync of our S3 connector Looks like it's all good. Let's grab our embedding model! Uploading an embedding model Eland is a Python Elasticsearch client which makes it easy to convert numpy, pandas, and scikit-learn functions to Elasticsearch powered equivalents. For our purposes, it will be our method of uploading models from HuggingFace, for deployment in our Elasticsearch cluster. You can install eland like so: Now get to a bash editor and make this little .sh script, filling out each parameter appropriately: MODEL_ID refers to a model taken from huggingface. I'm choosing all-MiniLM-L6-v2 mainly because it is very good, but also very small, and easily runnable on a CPU. Run the bash script, and once done, your model should appear in your Elastic deployment under Machine Learning -> Model Management -> Trained Models. Deploy the model you just uploaded with eland Just click the circled play button to deploy the model, and you're done. Setting up your semantic_text index Time to set up semantic search. Navigate to Management -> Dev Tools, and delete your index because it does not have the semantic_text datatype enabled. Check the model_id of your uploaded model with: Now create an inference endpoint called minilm-l6, and pass it the correct model_id. Let's not worry about num_allocations and num_threads, because this isn't production and minilm-l6 is not a big-boy. Now recreate the aws-connector index. Set the \"body\" property as type \"semantic_text\", and add the id of your new inference endpoint. Get back to your connector and run another full-content sync (For real this time!). The incoming documents are going to be automatically chunked into blocks of 250 words, with an overlap of 100 words. You don't have to do anything explicitly. Now that's convenient! Sync your S3 connector for real this time! And it's done. 
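The Dev Tools snippets for this section (creating the minilm-l6 inference endpoint and recreating the aws-connector index with a semantic_text field) were stripped during extraction. Below is a rough equivalent using Python's requests library; the credentials are placeholders, and the model_id shown is eland's usual naming convention for this model, so confirm it against your own deployment as described above.

```python
import requests

ES = "https://localhost:9200"        # adjust for your Elastic Cloud endpoint
AUTH = ("elastic", "<password>")      # placeholder credentials

# Inference endpoint backed by the model uploaded with eland. Verify the
# model_id with GET _ml/trained_models before relying on this exact value.
requests.put(f"{ES}/_inference/text_embedding/minilm-l6", auth=AUTH, verify=False, json={
    "service": "elasticsearch",
    "service_settings": {
        "model_id": "sentence-transformers__all-minilm-l6-v2",
        "num_allocations": 1,
        "num_threads": 1,
    },
}).raise_for_status()

# Recreate the connector index with a semantic_text "body" field wired to it.
requests.put(f"{ES}/aws-connector", auth=AUTH, verify=False, json={
    "mappings": {
        "properties": {
            "body": {"type": "semantic_text", "inference_id": "minilm-l6"}
        }
    }
}).raise_for_status()
```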
Check out your aws-connector index, there'll be 140 documents in there, each of which is now an embedded chunk: Index full of chunked documents Do RAG with the Elastic Playground Scurry over to Search -> Build -> Playground and add an LLM connector of your choice. I'm using Azure OpenAI: Set your endpoint and API key Now let's set up a chat experience. Click Add Data Sources and select aws-connector: Set up your chat experience Check out the query tab of your new chat experience. Assuming everything was properly set up, it will automatically be set to this hybrid search query, with the model_id minilm-l6. Default hybrid search query Let's ask a question! We'll take three documents for the context, and add my special RAG prompt: Add a prompt and select the number of search results for context Query: Describe the fall from Grace of Melkor We'll use a relatively open-ended RAG query. To be answered satisfactorily, it will need to draw information from multiple parts of the text. This will be a good indicator of whether RAG is working as expected. Well I'm convinced. It even has citations! One more for good luck: Query: Who were the greatest students of Aule the Smith? This particular query is nothing too difficult, I'm simply looking for a reference to a very specific quote from the text. Let's see how it does! Well, that's correct. Looks like RAG is working just fine. Conclusion That was incredibly convenient and painless — hot damn! We're truly living in the future. I can definitely work with this. I hope you're as excited to try it as I am to show it off. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Collecting data Setting up the S3 connector Uploading an embedding model Setting up your semantic_text index Do RAG with the Elastic Playground Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Build RAG quickly with minimal code in Elastic 8.15 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/build-rag-in-elastic-815", + "meta_description": "Learn how to easily build a RAG pipeline with the S3 Connector, semantic_text datatype and Elastic Playground." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Serverless semantic search with ELSER in Python: Exploring Summer Olympic games history This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will load previous olympic games data set and then use the ELSER model to perform semantic searches. How To EK By: Essodjolo Kahanam On July 26, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will create a serverless Elasticsearch project, load previous olympic games data set into an index, generate inferred data (in a sparse vector field) using the inference processor along with ELSER model, and finally search for historical olympic competition information in a natural language expression, thanks to text expansion query . The tools and the data set For this project we will use an Elasticsearch serverless project, and the serverless Python client (elasticsearch_serverless) for interactions with Elasticsearch. To create a serverless project, simply follow the get started with serverless guide. More information on serverless including pricing can be found here . When setting up a serverless project, be sure to select the option for Elasticsearch and the general purpose option for working this tutorial. The data set used is that of summer olympic games competitors from 1896 to 2020, obtained from Kaggle ( Athletes_summer_games.csv ). It contains information about the competition year, the type of competition, the name of the participant, whether they won a medal or not and which medal eventually, along with other information. For the data set manipulation, we will use Eland , a Python client and toolkit for DataFrames and machine learning in Elasticsearch. Finally the natural language processing (NLP) model used is Elastic Learned Sparse EncodeR ( ELSER ), a retrieval model trained by Elastic that allows to retrieve more relevant search results through semantic search. Before following the steps below, please make sure you have installed the serverless Python client and Eland. Please note the versions I used below. If you are not using the same versions, you might need to adjust the code to any eventual syntax change in the versions you are using. Download and deploy ELSER model We will use the Python client to download and deploy the ELSER model. Before doing that, let's first confirm that we can connect to our serverless project. 
The URL and API key below are read from environment variables; you need to use the appropriate values in your case, or use whichever method you prefer for reading credentials. If everything is properly configured, you should get an output like the one below: Now that we've confirmed that the Python client is successfully connecting to the serverless Elasticsearch project, let's download and deploy the ELSER model. We will check if the model was previously deployed and delete it in order to perform a fresh install. Also, as the deploy phase could take a few minutes, we will continuously check the model configuration information to make sure that the model definition is present before moving to the next phase. For more information, check the Get trained models API. Once we get the confirmation that the model is downloaded and ready to be deployed, we can go ahead and start ELSER. It can take a little while for the deployment to be fully ready. Load the data set into Elasticsearch using Eland eland.csv_to_eland allows reading a comma-separated values (CSV) file into a data frame stored in an Elasticsearch index. We will use it to load the Olympics data ( Athletes_summer_games.csv ) into Elasticsearch. The es_type_overrides parameter allows overriding the default mappings. After executing the lines above, the data will be written to the index elser-olympic-games . You can also capture the resulting dataframe ( eland.DataFrame ) in a variable for further manipulation. Create an ingest pipeline for inference based on ELSER The next step in our journey to explore past Olympic competition data using semantic search is to create an ingest pipeline containing an inference processor that runs the ELSER model. A set of fields has been selected and concatenated into a single field on which the inference processor will work. Depending on your use case, you might want to use another strategy. The concatenation is done using the script processor. The inference processor uses the previously deployed ELSER model, taking as input the concatenated field, and storing the output in a sparse vector type field (see the following point). Preparing the index This is the last stage before being able to query past Olympic competition data using natural language expressions. We will update the previously created index's mapping, adding a sparse vector type field. Update the mapping: add a sparse vector field We will update the index mapping by adding a field that will hold the concatenated data, and a sparse vector field that will hold the inferred information computed by the inference processor using the ELSER model. Populate the sparse vector field We will run an update by query to call the previously created ingest pipeline in order to populate the sparse vector field in each document. The request will take a few moments depending on the number of documents, and the number of allocations and threads per allocation used for deploying ELSER. Once this step is completed, we can start exploring the past Olympic data set using semantic search. Let's explore the Olympic data set using semantic search Now we will use text expansion queries to retrieve information about past Olympic game competitions using natural language expressions. Before getting to the demonstration, let's create a function to retrieve and format the search results (a sketch follows below). The function will receive a question about past Olympic games competition winners, performing a semantic search using Elastic's text expansion query. The retrieved results are formatted and printed. 
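The helper function itself was lost in extraction. A sketch of what it could look like with the serverless Python client; the index name and ELSER model id follow the article, while the sparse vector field name ( ml.tokens ) and the Medal / Name / Year column names from the Kaggle file are assumptions.

```python
from elasticsearch_serverless import Elasticsearch

client = Elasticsearch("<project-url>", api_key="<api-key>")  # placeholder credentials

def search_winners(question: str) -> None:
    """Semantic search over past Olympic results, printing the podium."""
    response = client.search(
        index="elser-olympic-games",
        size=3,  # gold, silver, bronze
        query={
            "bool": {
                "must": [
                    {"text_expansion": {
                        # "ml.tokens" is an assumed name for the sparse vector field
                        "ml.tokens": {
                            "model_id": ".elser_model_2",
                            "model_text": question,
                        }
                    }},
                    {"exists": {"field": "Medal"}},  # winners only
                ]
            }
        },
    )
    for hit in response["hits"]["hits"]:
        src = hit["_source"]
        print(f"{src.get('Medal')}: {src.get('Name')} ({src.get('Year')})")

search_winners("Who won the Golf competition in 1900?")
```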
Notice that we force the existence of medals in the query, as we are only interested in the winners. We also limited the size of the result to 3 as we expect three winners (gold, silver, bronze). Again, based on your use case, you might not necessarily do the same thing. 🏌️‍♂️ “Who won the Golf competition in 1900?” Request: Output: 🏃‍♀️ “2004 Women's Marathon winners” Request: Output: 🏹 “Women archery winners of 1908” Request: Output: 🚴‍♂️ “Who won the individual cycling competition in 1972?” Request: Output: Conclusion This blog showed how you can perform semantic search with the Elastic Learned Sparse EncodeR (ELSER) NLP model, in Python programming language using Serverless. You will want to make sure you turn off severless after running this tutorial to avoid any extra charges. To go further, feel free to check out our Elasticsearch Relevance Engine (ESRE) Engineer course where you can learn how to leverage the Elasticsearch Relevance Engine (ESRE) and large language models (LLMs) to build advanced RAG (Retrieval-Augmented Generation) applications that combine the storage, processing, and search features of Elasticsearch with the generative power of an LLM. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to The tools and the data set Download and deploy ELSER model Load the data set into Elasticsearch using Eland Create an ingest pipeline for inference based on ELSER Preparing the index Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. 
Elasticsearch B.V. All Rights Reserved.", + "title": "Serverless semantic search with ELSER in Python: Exploring Summer Olympic games history - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/serverless-semantic-search-with-elser-in-python", + "meta_description": "This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will load previous olympic games data set and then use the ELSER model to perform semantic searches." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Text similarity search with vector fields This post explores how text embeddings and Elasticsearch’s new dense_vector type could be used to support similarity search. Vector Database JT By: Julie Tibshirani On October 6, 2022 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. From its beginnings as a recipe search engine , Elasticsearch was designed to provide fast and powerful full-text search. Given these roots, improving text search has been an important motivation for our ongoing work with vectors. In Elasticsearch 7.0, we introduced experimental field types for high-dimensional vectors, and now the 7.3 release brings support for using these vectors in document scoring. This post focuses on a particular technique called text similarity search. In this type of search, a user enters a short free-text query, and documents are ranked based on their similarity to the query. Text similarity can be useful in a variety of use cases: Question-answering: Given a collection of frequently asked questions, find questions that are similar to the one the user has entered. Article search: In a collection of research articles, return articles with a title that’s closely related to the user’s query. Image search: In a dataset of captioned images, find images whose caption is similar to the user’s description. A straightforward approach to similarity search would be to rank documents based on how many words they share with the query. But a document may be similar to the query even if they have very few words in common — a more robust notion of similarity would take into account its syntactic and semantic content as well. The natural language processing (NLP) community has developed a technique called text embedding that encodes words and sentences as numeric vectors. These vector representations are designed to capture the linguistic content of the text, and can be used to assess similarity between a query and a document. This post explores how text embeddings and Elasticsearch’s dense_vector type could be used to support similarity search. We’ll first give an overview of embedding techniques, then step through a simple prototype of similarity search using Elasticsearch. Note: Using text embeddings in search is a complex and evolving area. This blog is not a recommendation for a particular architecture or implementation. Start here to learn how you can enhance your search experience with the power of vector search . What are text embeddings? Let's take a closer look at different types of text embeddings, and how they compare to traditional search approaches. Word embeddings A word embedding model represents a word as a dense numeric vector. These vectors aim to capture semantic properties of the word — words whose vectors are close together should be similar in terms of semantic meaning. 
In a good embedding, directions in the vector space are tied to different aspects of the word’s meaning. As an example, the vector for \"Canada\" might be close to \"France\" in one direction, and close to \"Toronto\" in another. The NLP and search communities have been interested in vector representations of words for quite some time. There was a resurgence of interest in word embeddings in the past few years, when many traditional tasks were being revisited using neural networks. Some successful word embedding algorithms were developed, including word2vec and GloVe . These approaches make use of large text collections, and examine the context each word appears in to determine its vector representation: The word2vec Skip-gram model trains a neural network to predict the context words around a word in a sentence. The internal weights of the network give the word embeddings. In GloVe, the similarity of words depends on how frequently they appear with other context words. The algorithm trains a simple linear model on word co-occurrence counts. Many research groups distribute models that have been pre-trained on large text corpora like Wikipedia or Common Crawl, making them convenient to download and plug into downstream tasks. Although pre-trained versions are sometimes used directly, it can be helpful to adjust the model to fit the specific target dataset and task. This is often accomplished by running a 'fine-tuning' step on the pre-trained model. Word embeddings have proven quite robust and effective, and it is now common practice to use embeddings in place of individual tokens in NLP tasks like machine translation and sentiment classification. Sentence embeddings More recently, researchers have started to focus on embedding techniques that represent not only words, but longer sections of text. Most current approaches are based on complex neural network architectures, and sometimes incorporate labelled data during training to aid in capturing semantic information. Once trained, the models are able to take a sentence and produce a vector for each word in context, as well as a vector for the entire sentence. Similarly to word embedding, pre-trained versions of many models are available, allowing users to skip the expensive training process. While the training process can be very resource-intensive, invoking the model is much more lightweight — sentence embedding models are typically fast enough to be used as part of real-time applications. Some common sentence embedding techniques include InferSent , Universal Sentence Encoder , ELMo , and BERT . Improving word and sentence embeddings is an active area of research, and it’s likely that additional strong models will be introduced. Comparison to traditional search approaches In traditional information retrieval, a common way to represent text as a numeric vector is to assign one dimension for each word in the vocabulary. The vector for a piece of text is then based on the number of times each term in the vocabulary appears. This way of representing text is often referred to as \"bag of words,\" because we simply count word occurrences without regard to sentence structure. Text embeddings differ from traditional vector representations in some important ways: The encoded vectors are dense and relatively low-dimensional, often ranging from 100 to 1,000 dimensions. In contrast, bag of words vectors are sparse and can comprise 50,000+ dimensions. 
Embedding algorithms encode the text into a lower-dimensional space as part of modeling its semantic meaning. Ideally, synonymous words and phrases end up with a similar representation in the new vector space. Sentence embeddings can take the order of words into account when determining the vector representation. For example the phrase \"tune in\" may be mapped as a very different vector than \"in tune\". In practice, sentence embeddings often don’t generalize well to large sections of text. They are not commonly used to represent text longer than a short paragraph. Using embeddings for similarity search Let’s suppose we had a large collection of questions and answers. A user can ask a question, and we want to retrieve the most similar question in our collection to help them find an answer. We could use text embeddings to allow for retrieving similar questions: During indexing, each question is run through a sentence embedding model to produce a numeric vector. When a user enters a query, it is run through the same sentence embedding model to produce a vector. To rank the responses, we calculate the vector similarity between each question and the query vector. When comparing embedding vectors, it is common to use cosine similarity . This repository gives a simple example of how this could be accomplished in Elasticsearch. The main script indexes ~20,000 questions from the StackOverflow dataset , then allows the user to enter free-text queries against the dataset. We’ll soon walk through each part of the script in detail, but first let’s look at some example results. In many cases, the method is able to capture similarity even when there was not strong word overlap between the query and indexed question: \"zipping up files\" returns \"Compressing / Decompressing Folders & Files\" \"determine if something is an IP\" returns \"How do you tell whether a string is an IP or a hostname\" \"translate bytes to doubles\" returns \"Convert Bytes to Floating Point Numbers in Python\" Implementation details The script begins by downloading and creating the embedding model in TensorFlow. We chose Google’s Universal Sentence Encoder, but it’s possible to use many other embedding methods. The script uses the embedding model as-is, without any additional training or fine-tuning. Next, we create the Elasticsearch index, which includes mappings for the question title, tags, and also the question title encoded as a vector: In the mapping for dense_vector, we’re required to specify the number of dimensions the vectors will contain. When indexing a title_vector field, Elasticsearch will check that it has the same number of dimensions as specified in the mapping. To index documents, we run the question title through the embedding model to obtain a numeric array. This array is added to the document in the title_vector field. When a user enters a query, the text is first run through the same embedding model and stored in the parameter query_vector. As of 7.3, Elasticsearch provides a cosineSimilarity function in its native scripting language. So to rank questions based on their similarity to the user’s query, we use a script_score query: We make sure to pass the query vector as a script parameter to avoid recompiling the script () on every new query. Since Elasticsearch does not allow negative scores, it's necessary to add one to the cosine similarity. | Note: this blog post originally used a different syntax for vector functions that was available in Elasticsearch 7.3, but was deprecated in 7.6. 
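The mapping and query snippets this section refers to did not survive extraction. Here is a sketch of the pattern described, using Python's requests library and the post-7.6 syntax mentioned in the note; the index name and the 512-dimension size (the Universal Sentence Encoder's output size) are assumptions, while the title_vector field and the script itself follow the text.

```python
import requests

ES = "http://localhost:9200"

# Index with the question title, tags, and the title encoded as a dense vector.
requests.put(f"{ES}/questions", json={
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "tags": {"type": "keyword"},
            "title_vector": {"type": "dense_vector", "dims": 512},  # assumed dims
        }
    }
}).raise_for_status()

# Rank questions by cosine similarity to the encoded user query. query_vec
# would come from running the same embedding model over the query text.
query_vec = [0.0] * 512  # placeholder embedding
resp = requests.post(f"{ES}/questions/_search", json={
    "query": {
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
                "params": {"query_vector": query_vec},
            },
        }
    }
})
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```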
| Important limitations The script_score query is designed to wrap a restrictive query, and modify the scores of the documents it returns. However, we’ve provided a match_all query, which means the script will be run over all documents in the index. This is a current limitation of vector similarity in Elasticsearch — vectors can be used for scoring documents, but not in the initial retrieval step. Support for retrieval based on vector similarity is an important area of ongoing work . To avoid scanning over all documents and to maintain fast performance, the match_all query can be replaced with a more selective query. The right query to use for retrieval is likely to depend on the specific use case. While we saw some encouraging examples above, it’s important to note that the results can also be noisy and unintuitive. For example, \"zipping up files\" also assigns high scores to \"Partial .csproj Files\" and \"How to avoid .pyc files?\". And when the method returns surprising results, it is not always clear how to debug the issue — the meaning of each vector component is often opaque and doesn’t correspond to an interpretable concept. With traditional scoring techniques based on word overlap, it is often easier to answer the question \"why is this document ranked highly?\" As mentioned earlier, this prototype is meant as an example of how embedding models could be used with vector fields, and not as a production-ready solution. When developing a new search strategy, it is critical to test how the approach performs on your own data, making sure to compare against a strong baseline like a match query. It may be necessary to make major changes to the strategy before it achieves solid results, including fine-tuning the embedding model for the target dataset, or trying different ways of incorporating embeddings such as word-level query expansion. Conclusions Embedding techniques provide a powerful way to capture the linguistic content of a piece of text. By indexing embeddings and scoring based on vector distance, we can compare documents using a notion of similarity that goes beyond their word-level overlap. We’re looking forward to introducing more functionality based around the vector field type. Using vectors for search is a nuanced and developing area — as always, we would love to hear about your use cases and experiences on Github and the Discuss forums ! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What are text embeddings? Word embeddings Sentence embeddings Comparison to traditional search approaches Using embeddings for similarity search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Text similarity search with vector fields - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/text-similarity-search-with-vectors-in-elasticsearch", + "meta_description": "This post explores how text embeddings and Elasticsearch’s new dense_vector type could be used to support similarity search." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Optimizing vector distance computations with the Foreign Function & Memory (FFM) API Learn how to optimize vector distance computations using the Foreign Function & Memory (FFM) API to achieve faster performance. Lucene Vector Database CH By: Chris Hegarty On February 23, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. At the heart of any vector database are the distance functions that determine how close two vectors are. These distance functions are executed many times, both during indexing and searching. When merging segments or navigating the graph for nearest neighbors, much of the execution time is spent comparing vectors for similarity. Micro optimizing these distance functions is time well spent, we're already benefiting from similar previous optimizations, e.g. see SIMD , FMA . With the recent support for scalar quantization in both Lucene and Elasticsearch, we're now more than ever leaning on the byte variants of these distance functions. We know from previous experience that there's still the potential for significant performance improvements in these variants. Current state of play: The Panama Vector API When we leveraged the Panama Vector API to accelerate the distance functions in Lucene , much of the focus was on the float (32-bit) variants. We were quite happy with the performance improvements we managed to achieve for these. However, the improvements for the byte (8-bit) variants was a little disappointing - and believe me, we tried! The fundamental problem with the byte variants is that they do not take full advantage of the most optimal SIMD instructions available on the CPU. 
When doing arithmetic operations in Java, the narrowest type is int (32-bit). The JVM automatically sign-extends byte values to values of type int . Consider this simple scalar dot product implementation: The multiplication of elements from a and b is performed as if a and b are of type int , whose value is the byte value loaded from the appropriate array index sign-extended to int . Our SIMD-ized implementation must be equivalent, so we need to be careful to ensure that overflows when multiplying large byte values are not lost. We do this by explicitly widening the loaded byte values to short (16-bit), since we know that all signed byte values when multiplied will fit without loss into signed short . We then need a further widen to int (32-bit) when accumulating. Here's an excerpt from the inner loop body of Lucene's 128-bit dot product code: Visualizing this we can see that we're only processing 4 elements at a time. e.g. This is all fine, even with these explicit widening conversations, we get some nice speed up through the extra data parallelism of the arithmetic operations, just not as much as we know is possible. The reason we know that there is potential left is that each widening halves the number of lanes, which effectively halves the number of arithmetic operations. The explicit widening conversations are not being optimized by the JVM's C2 JIT compiler. Additionally, we're only accessing the lower half of the data - accessing anything other than the lower half just does not result in good machine code. This is where we're leaving potential performance \"on the table\". For now, this is as good as we can do in Java. Longer term, the Panama Vector API and/or C2 JIT compiler should provide better support for such operations, but for now, at least, this is as good as we can do. Or is it? Introducing the Foreign Function & Memory (FFM) API OpenJDK's project Panama has several different strands, we've already seen the Panama Vector API in action, but the flagship of the project is the Foreign Function & Memory API (FFM). The FFM API offers a low overhead for interacting with code and memory outside the Java runtime. The JVM is an amazing piece of engineering, abstracting away much of the differences between architectures and platforms, but sometimes it's not always possible for it to make the best tradeoffs, which is understandable. FFM can rescue us when the JVM cannot easily do so, by allowing the programmer to take things into her own hands if she doesn't like the tradeoff that's been made. This is one such area, where the tradeoff of the Panama Vector API is not the right one for byte sized vectors. FFM usage example We're already leveraging the foreign memory support in Lucene to mediate safer access to mapped off-heap index data. Why not use the foreign invocation support to call already optimized distance computation functions? Since our distance computation functions are tiny, and for some set of deployments and architectures for which we already know the optimal set of CPU instructions, why not just write the small block of native code that we want. Then invoke it through the foreign invocation API. Going foreign Elastic Cloud has a profile that is optimized for vector search. This profile targets the ARM architecture, so let's take a look at how we might optimize for this. Let's write our distance function, say dot product, in C with some ARM Neon intrinsics. Again, we'll focus on the inner body of the loop. 
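Before looking at the native version, here is a small NumPy sketch — an illustration of the widening issue described above, not the Lucene or Panama Vector code — showing why the byte products must be widened to 16 bits and accumulated in 32 bits. It is exactly this explicit widening that halves the number of usable lanes in the Java implementation.

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.integers(-128, 128, size=1024, dtype=np.int8)
b = rng.integers(-128, 128, size=1024, dtype=np.int8)

# Reference: accumulate the exact products in a wide integer type.
expected = int(np.dot(a.astype(np.int64), b.astype(np.int64)))

# Keeping the products in int8 silently wraps around and loses information.
wrapped = int(np.sum(a * b, dtype=np.int64))

# Widen each product to int16 (any int8*int8 product fits: |x| <= 16384),
# then accumulate into int32 -- the same shape as the widening Java code path.
widened = int(np.sum(a.astype(np.int16) * b.astype(np.int16), dtype=np.int32))

print(expected == widened)  # True
print(expected == wrapped)  # almost certainly False: overflow corrupted the result
```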
Here's what that looks like: We load 16 8-bit values from our a and b vectors into va8 and vb8 , respectively. We then multiply the lower half and store the result in va16 - this result holds 8 16-bit values and the operation implicitly handles the widening. Similar with the higher half. Finally, since we operated on the full original 16 values, it's faster to use to two accumulators to store the results. The vpadalq_s16 add and accumulate intrinsic knows how to widen implicitly as it accumulates into 4 32-bit values. In summary, we've operated on all 16 byte values per loop iteration. Nice! The disassembly for this is very clean and mirrors the above instrinsics. Neon SIMD on ARM has arithmetic instructions that offer the semantics we want without having to do the extra explicit widening. The C instrinsics expose these instructions for use in a way that we can leverage. The operations on registers densely packed with values is much cleaner than what we can do with the Panama Vector API. Back in Java-land The last piece of the puzzle is a small \"shim\" layer in Java that uses the FFM API to link to our foreign code. Our vector data is off-heap, we map it with a MemorySegment , and determine offsets and memory addresses based on the vector dimensions. The dot product method looks like this: We have a little more work to do here since this is now platform-specific Java code, so we only execute it on aarch64 platforms, falling back to an alternative implementation on other platforms. So is it actually faster than the Panama Vector code? Performance improvements with FFM API Micro benchmarks of the above dot product for signed byte values show a performance improvement of approximately 6 times, than that of the Panama Vector code. And this includes the overhead of the foreign call. The primary reason for the speedup is that we're able to pack the full 128-bit register with values and operate on all of them without explicitly moving or widening the data. Macro benchmarks, SO_Dense_Vector with scalar quantization enabled, shows significant improvements in merge times, approximately 3 times faster - the experiment only plugged in the optimized dot product for segment merges. We expect search benchmarks to show improvement too. Summary Recent advancements in Java, namely the FFM API, allows to interoperate with native code in a much more performant and straightforward way than was previously possible. Significant performance benefits can be had by providing micro-optimized platform-specific vector distance functions that are called through FFM. We're looking forward to a future version of Elasticsearch where scalar quantized vectors can take advantage of this performance improvement. And of course, we're giving a lot of thought to how this relates to Lucene and even the Panama Vector API, to determine how these can be improved too. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. 
OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Current state of play: The Panama Vector API Introducing the Foreign Function & Memory (FFM) API FFM usage example Going foreign Back in Java-land Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Optimizing vector distance computations with the Foreign Function & Memory (FFM) API - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/vector-similarity-computations-ludicrous-speed", + "meta_description": "Learn how to optimize vector distance computations using the Foreign Function & Memory (FFM) API to achieve faster performance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Open Crawler now in beta The Open Crawler is now in beta. This latest version 0.2 update also comes with several new features. Ingestion NF By: Navarone Feekery On September 17, 2024 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. We have released version 0.2 of Open Crawler, which has also been promoted to beta ! Open Crawler was initially released (version 0.1 ) in June 2024 as a tech-preview . Since then, we've been iterating on the product and have added several new features. To get access to these changes you can use the latest Docker artifact , or download the source code directly from the GitHub repository . Follow the setup instructions in our documentation to get started. What's new in the Open Crawler? A list of every change can be found in our changelog in the Open Crawler repository. In this blog we will go over only new features, and configuration format changes Open Crawler features Feature Description Extraction rules Allows for the extraction of HTML content using CSS and XPath selectors, and URL content using regex. Binary content extraction Allows for the extraction of binary content from file types supported by the Apache Tika project. 
Crawl rules Used to enable or disable certain URL patterns from being crawled and ingested. Purge crawls Deletes outdated documents from the index at the end of a crawl job. Scheduling Recurrent crawls can be scheduled based on a cron expression. Configuration changes in the Open Crawler Among the new features, the config.yml file format has changed for a few fields, so existing configuration files will not work between 0.1 and 0.2 . Notably, the configuration field domain_allowlist has been changed to domains , and seed_urls is now a subset of domains instead of a top-level field. This change was made so new features like extraction rules and crawl rules could be applied to specific domains, while allowing a single crawler to configure multiple domains for a crawl job. Make sure to reference the updated config.yml.example file to fix your configuration. Here is an example for migrating a 0.1 configuration to 0.2 : Showcase 1: crawl rules We're very excited to bring the crawl rules feature to Open Crawler. This is an existing feature in the Elastic Crawler . The biggest difference for crawl rules between Open Crawler and Elastic Crawler is the way these rules are configured. Elastic Crawler is configured using Kibana, while Open Crawler has crawl rules defined for a domain in the crawler.yml config file. Crawling only specific endpoints When determining if a URL is crawlable, Open Crawler will execute crawl rules in order from top to bottom. In this example below, we want to crawl only the content of https://www.elastic.co/search-labs . Because this has links to other URLs within the https://www.elastic.co domain, it's not enough to limit just the seed_urls to this entry point. Using crawl rules, we need two more rules: An allow rule for everything under (and including) the /search-labs URL pattern A deny everything rule to catch all other URLs In this example we are using a regex pattern for the deny rule. If I want to add another URL to ingest to this configuration (for example, /security-labs ), I need to: Add it as a seed_url Add it to the crawl_rules above the deny all rule Through this manner of configuration, you can be very specific about what webpages Crawler will ingest. If you have debug logs enabled , each denied URL will show up in the logs like this: Here's an actual example from my crawl results: Crawling everything except a specific endpoint This pattern is much easier to implement, as Crawler will crawl everything by default. All that is needed is to add a deny rule for URL pattern that you want to exclude. In this example, I want to crawl the entire https://www.elastic.co website, except for anything under /search-labs . Because I want to crawl everything , seed_urls is not needed for this configuration. Now if I run a crawl, Crawler will not ingest webpages with URLs that begins with /search-labs . Showcase 2: extraction rules Extraction rules are another much-asked-for feature for Open Crawler. Like crawl rules, extraction rules function almost the same as they do for Elastic Crawler , except for how they are configured. Extraction rules are configured under extraction_rulesets , which belong to a single item from domains . Getting the CSS selector For this example, I want to extract the authors' names for each blog article in /search-labs and assign it to the field authors . Without extraction rules, each blog's Elasticsearch document will have the author names buried in the body field. 
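Before moving on to the CSS selector, here is a rough sketch of the 0.2-style configuration shape described above, generated from Python for illustration. The exact key names (policy, type, begins, and so on) are assumptions on my part — treat config.yml.example in the Open Crawler repository as the source of truth.

```python
# Illustrative only: 'allow /search-labs, deny everything else' crawl rules,
# with seed_urls nested under a domains entry as in the 0.2 config layout.
import yaml  # PyYAML

config = {
    'domains': [
        {
            'url': 'https://www.elastic.co',
            'seed_urls': ['https://www.elastic.co/search-labs'],
            'crawl_rules': [
                # Allow everything under (and including) /search-labs ...
                {'policy': 'allow', 'type': 'begins', 'pattern': '/search-labs'},
                # ... and deny every other URL on the domain via a regex.
                {'policy': 'deny', 'type': 'regex', 'pattern': '.*'},
            ],
        }
    ],
}

with open('config.yml', 'w') as f:
    yaml.safe_dump(config, f, sort_keys=False)
```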
Using my browser developer tools (in my case, the Firefox dev tools), I can visit the webpage and use the selector tool to find which CSS selectors an HTML element has. I can now see that the authors are stored in an
element with a few different classes, but most eye-catching is the class .author-name . Now, to test that using the selector .author-name is enough to fetch only the author name from this field, I can use the dev tools HTML search feature . Unfortunately, I can see that using only this class name returns 11 results for this blog post. After some investigation, I found that this is because the \"Recommended articles\" section at the bottom of a page also uses the .author-name class. To remedy this, we need a more restrictive selector. Examining the HTML code directly, I can see that the side-bar containing the author name that I want to extract is nested a few levels under a class called .sticky . This class refers to the sidebar that contains the author name I want to extract. We can combine these selectors into a single selector .sticky .author-name that will only search for .author-name classes that are nested within .sticky classes. We can then test this in the same HTML search bar as before, and ta-da ! Only one hit -- we've found our CSS selector! Configuring the extraction rules Now we can add the CSS selector from the previous step. We also need to define the url_filters for this rule. This will determine which endpoints the extraction rule is executed against. All articles for search labs fall under the format https://www.elastic.co/search-labs/blog/ , so this can be achieved with a simple regex pattern: /search-labs/blog/.+$ . /search-labs/blog/ asserts the start of the URL .+ matches any character except line breaks $ marks the end of the string This stops sub-URLs like https://www.elastic.co/search-labs/blog// from having this extraction rule In this example we will also utilize crawl rules, to avoid crawling the entire https://www.elastic.co website. After completing a crawl with the above configuration, I can check for the new author field in the ingested documents. I can do this using a _search query to find articles written by the author Sebastien Guilloux . And we have a single hit! Showcase 3: combining it all with Semantic Text Jeff Vestal wrote a fantastic article combining Open Crawler with Semantic Text search, among other cool RAG things. Read up on that here . Comparing with Elastic Crawler We now maintain a feature comparison table on the Open Crawler repository to compare the features available for Open Crawler vs Elastic Crawler. Open Crawler next steps The next release will bring Open Crawler to version 1.0 , and will also promote it to GA (generally available). We don't have a release date planned for this version yet. We do have a general idea of some features we want to include: Extraction using data attributes and meta tags Full HTML extraction Send event logs to Elasticsearch This list is not exhaustive, and depending on user feedback we will include other features in the 1.0 GA release. If there are other features you would like to see included, feel free to create an enhancement issue directly on the Open Crawler repository. Feedback like this will help us prioritize what to include in the next release. Report an issue Related content Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. 
AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Integrations Ingestion +1 February 3, 2025 Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. JR TM By: Jeffrey Rengifo and Tomás Murúa Jump to What's new in the Open Crawler? Open Crawler features Configuration changes in the Open Crawler Showcase 1: crawl rules Crawling only specific endpoints Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Open Crawler now in beta - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-open-crawler-beta-release", + "meta_description": "Elastic's Open Crawler version 0.2 is in beta. Explore the Open Crawler features, configuration changes and its crawl & extraction rules." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Scalar quantization 101 Understand what scalar quantization is, how it works and its benefits. This guide also covers the math behind quantization and examples. Lucene ML Research BT By: Benjamin Trent On October 25, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to scalar quantization Most embedding models output f l o a t 32 float32 f l o a t 32 vector values. While this provides the highest fidelity, it is wasteful given the information that is actually important in the vector. Within a given data set, embeddings never require all 2 billion options for each individual dimension. This is especially true on higher dimensional vectors (e.g. 386 dimensions and higher). Quantization allows for vectors to be encoded in a lossy manner, thus reducing fidelity slightly with huge space savings. Understanding buckets in scalar quantization Scalar quantization takes each vector dimension and buckets them into some smaller data type. For the rest of the blog, we will assume quantizing f l o a t 32 float32 f l o a t 32 values into i n t 8 int8 in t 8 . To bucket values accurately, it isn't as simple as rounding the floating point values to the nearest integer. 
Many models output vectors that have dimensions continuously on the range $[-1.0, 1.0]$. So, two different vector values 0.123 and 0.321 could both be rounded down to 0. Ultimately, a vector would only use 2 of its 255 available buckets in int8, losing too much information. Figure 1: Illustration of quantization goals, bucketing continuous values from $-1.0$ to $1.0$ into discrete int8 values. The math behind the numerical transformation isn't too complicated. Since we can calculate the minimum and maximum values for the floating point range, we can use min-max normalization and then linearly shift the values: $$int8 \\approx \\frac{127}{max - min} \\times (float32 - min)$$ $$float32 \\approx \\frac{max - min}{127} \\times int8 + min$$ Figure 2: Equations for transforming between int8 and float32. Note, these are lossy transformations and not exact. In the following examples, we are only using positive values within int8. This aligns with the Lucene implementation. The role of statistics in scalar quantization A quantile is a slice of a distribution that contains a certain percentage of the values. So, for example, it may be that $99\\%$ of our floating point values are between $[-0.75, 0.86]$ instead of the true minimum and maximum values of $[-1.0, 1.0]$. Any values less than -0.75 and greater than 0.86 are considered outliers. If you include outliers when attempting to quantize results, you will have fewer available buckets for your most common values. And fewer buckets can mean less accuracy and thus greater loss of information. Figure 3: Illustration of the $99\\%$ confidence interval and the individual quantile values. $99\\%$ of all values fall within the range $[-0.75, 0.86]$. This is all well and good, but now that we know how to quantize values, how can we actually calculate distances between two quantized vectors? Is it as simple as a regular dot_product? The role of algebra in scalar quantization We are still missing one vital piece: how do we calculate the distance between two quantized vectors? While we haven't shied away from math yet in this blog, we are about to do a bunch more. Time to break out your pencils and try to remember polynomials and basic algebra. The basic requirement for dot_product and cosine similarity is being able to multiply floating point values together and sum up their results. We already know how to transform between float32 and int8 values, so what does multiplication look like with our transformations? $$float32_i \\times float32'_i \\approx (\\frac{max - min}{127} \\times int8_i + min) \\times (\\frac{max - min}{127} \\times int8'_i + min)$$ We can then expand this multiplication and, to simplify, we will substitute $\\alpha$ for $\\frac{max - min}{127}$: $$\\alpha^2 \\times int8_i \\times int8'_i + \\alpha \\times int8_i \\times min + \\alpha \\times int8'_i \\times min + min^2$$ What makes this even more interesting is that only one part of this equation requires both values at the same time. However, dot_product isn't just two floats being multiplied, but all the floats for each dimension of the vector. With vector dimension count $dim$ in hand, all the following can be pre-calculated at query time and storage time: $dim \\times \\alpha^2$ is just $dim \\times (\\frac{max - min}{127})^2$ and can be stored as a single float value. $\\sum_{i=0}^{dim-1} min \\times \\alpha \\times int8_i$ and $\\sum_{i=0}^{dim-1} min \\times \\alpha \\times int8'_i$ can be pre-calculated and stored as a single float value or calculated once at query time. $dim \\times min^2$ can be pre-calculated and stored as a single float value. Putting it all together: $$dim \\times \\alpha^2 \\times dotProduct(int8, int8') + \\sum_{i=0}^{dim-1} min \\times \\alpha \\times int8_i + \\sum_{i=0}^{dim-1} min \\times \\alpha \\times int8'_i + dim \\times min^2$$ The only calculation required between the two vectors at query time is $dotProduct(int8, int8')$, with the pre-calculated values combined with the result. Ensuring accuracy in quantization So, how is this accurate at all? Aren't we losing information by quantizing? Yes, we are, but quantization takes advantage of the fact that we don't need all the information. For learned embedding models, the distributions of the various dimensions usually don't have fat tails. This means they are localized and fairly consistent. Additionally, the error introduced per dimension via quantization is independent. Meaning, the error cancels out for our typical vector operations like dot_product. Conclusion Whew, that was a ton to cover. But now you have a good grasp of the technical benefits of quantization, the math behind it, and how you can calculate the distances between vectors while accounting for the linear transformation. Look next at how we implemented this in Lucene and some of the unique challenges and benefits available there. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we've been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs.
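Returning to the scalar quantization math above, here is a short NumPy sketch that checks the derivation numerically. It is a toy verification, not the Lucene code, and it assumes a fixed [-1, 1] quantization range rather than learned quantiles.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
v, w = rng.uniform(-1.0, 1.0, dim), rng.uniform(-1.0, 1.0, dim)

lo, hi = -1.0, 1.0              # quantization range (quantiles in practice)
alpha = (hi - lo) / 127.0

# int8 ~= (float32 - min) / alpha, using only positive buckets as in Lucene.
qv = np.round((v - lo) / alpha).astype(np.int32)
qw = np.round((w - lo) / alpha).astype(np.int32)

# Corrective terms from the expansion above; all of these can be precomputed.
corr_v = lo * alpha * qv.sum()
corr_w = lo * alpha * qw.sum()
const = dim * lo * lo

# Only dotProduct(int8, int8') has to be computed between the two vectors.
approx = alpha * alpha * float(np.dot(qv, qw)) + corr_v + corr_w + const
exact = float(np.dot(v, w))
print(round(exact, 4), round(approx, 4))  # close, but not identical (lossy)
```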
TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Jump to Introduction to scalar quantization Understanding buckets in scalar quantization The role of statistics in scalar quantization The role of algebra in scalar quantization Ensuring accuracy in quantization Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Scalar quantization 101 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/scalar-quantization-101", + "meta_description": "Understand what scalar quantization is, how it works and its benefits. This guide also covers the math behind quantization and examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Apache Lucene 10 is out! Improvements to Lucene's hardware efficiency & more Apache Lucene 10 has been released, with a focus on hardware efficiency! Check out the main release highlights. Lucene AG By: Adrien Grand On October 14, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Apache Lucene 10 is finally out! With more than 2,000 commits from 185 unique contributors since Lucene 9.0 - which was released in December 2021 , almost 3 years ago - a lot has been going on. To be fair, a majority of these changes have been delivered in 9.x minor releases. However, the most ambitious changes usually need a major version, such as the introduction of multi-dimensional points in Lucene 6.0, dynamic pruning in 8.0, or vector search in 9.0. In 10.0, the area of focus for Lucene has been hardware efficiency, ie. making Lucene better at taking advantage of modern hardware. Let me guide you through the main release highlights. 
Lucene 10 release highlights More search parallelism For many years now, Lucene has had the ability to parallelize search execution, by creating groups of segments, searching each group in a different thread, and combining results in the end. One downside of this approach is that it couples the index geometry - how the index is organized into segments - and search parallelism. For instance, an index that has been force-merged down to a single segment can no longer take advantage of multiple execution threads to be searched. Quite disappointing when modern CPUs commonly have tens of cores! To overcome this limitation, Lucene's query evaluation logic now allows splitting the index into logical partitions, that no longer need to be aligned with segments. For instance, an index that has been force-merged down to a single segment could still be sliced into 10 logical partitions that have one tenth of the documents of this segment each. This change will help increase search parallelism, especially on machines that have many cores and/or on indexes that have few segments on their highest tier. This change doesn't work nicely with queries that have a high cost for creating a Scorer yet - such as range queries and prefix queries, but we're hoping to lift this limitation in upcoming minor releases. Better I/O parallelism Until now, Lucene would use synchronous I/O and perform at most one I/O operation at a time per search thread. For indexes that significantly exceed the size of the page cache, this could lead to queries being bound on I/O latency, while the host is still very far from maxing out IOPS. Frustrating! To overcome this, Lucene's Directory abstraction introduced a new IndexInput#prefetch API, to let the OS know about regions of files that it is about to read. The OS can then parallelize retrieving pages that intersect with these regions, within a single OS thread. For instance, a BooleanQuery with TermQuery clauses would now perform the I/O of terms dictionary lookups in parallel and then retrieve the first couple pages of each postings list in parallel, within a single execution thread. MMapDirectory , Lucene's default Directory implementation, implements this prefetch API using madvise 's MADV_WILLNEED advice on Linux and Mac OS. We are very excited about this change, which has already proved to help on fast local NVMe disks, and will further help on storage systems that have worse latencies while retaining good parallelism such as network-attached disks (GCP persistent storage, Amazon EBS, Azure managed disks) or even object storage (GCP Cloud storage, Amazon S3, Azure blob storage). Better CPU efficiency and storage efficiency with sparse indexing Lucene 10 introduces support for sparse indexing , sometimes called primary-key indexing or zone indexing in other data stores. The idea is simple: if your data is stored on disk in sorted order , then you can organize it into blocks, record the minimum and maximum values per block, and your queries will be able to take advantage of this information to skip blocks that don't intersect with the query, or to fully match blocks that are contained by the query. Only blocks that partially intersect with the query will need further inspection, and the challenge consists of picking the best index sort that will minimize the number of such blocks. Lucene's sparse indexes are currently implemented via 4 levels of blocks that have 4k, 32k, 256k and 2M docs each respectively. 
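As a rough illustration of the block-skipping idea described above — a toy Python sketch with made-up numbers, not Lucene's sparse index format — consider:

```python
import numpy as np

# Values are stored in sorted order and cut into fixed-size blocks; the
# 'sparse index' is just the per-block min/max, a few bytes per block.
values = np.sort(np.random.default_rng(1).integers(0, 1_000_000, size=100_000))
BLOCK = 4_096  # Lucene layers several block sizes (4k, 32k, 256k, 2M docs)

blocks = [values[i:i + BLOCK] for i in range(0, len(values), BLOCK)]
zone_index = [(int(b[0]), int(b[-1])) for b in blocks]  # min/max of each sorted block

def count_in_range(lo: int, hi: int) -> int:
    total = 0
    for (bmin, bmax), block in zip(zone_index, blocks):
        if bmax < lo or bmin > hi:
            continue                    # block cannot intersect the query: skip it
        if lo <= bmin and bmax <= hi:
            total += len(block)         # block fully contained: no per-doc work
        else:                           # partial overlap: inspect the block
            total += int(np.count_nonzero((block >= lo) & (block <= hi)))
    return total

print(count_in_range(250_000, 260_000))
```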
When done right, this form of indexing is extremely space-efficient (only a few bytes per block) and CPU-efficient (can make a decision about whether thousands of documents match or not with only a few CPU instructions). The downside is that the index can only be stored in a single order on disk, so not all fields can benefit from it. Typically, the index would be sorted on the main dimensions of the data. For instance, for an e-commerce catalog containing products, these dimensions could be the category and the brand of the products. Conclusion Note that some hardware-efficiency-related changes have also been released in 9.x minor releases. In particular, it's worth highlighting that: Lucene now takes advantage of explicit vectorization when comparing vectors and decoding postings , Lucene's concurent search execution logic performs work stealing in order to reduce the overhead of forking tasks, Lucene's postings format has been updated to have a more sequential access pattern, Lucene now passes a MADV_RANDOM advice when opening files that have a random-access pattern. We are pretty excited about this new Lucene release and the hardware-efficiency focus. In case you are curious to learn more about these improvements, we will be writing more detailed blogs about them in the coming weeks. Stay tuned. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Lucene 10 release highlights More search parallelism Better I/O parallelism Better CPU efficiency and storage efficiency with sparse indexing Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Apache Lucene 10 is out! Improvements to Lucene's hardware efficiency & more - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/apache-lucene-10-release-highlights", + "meta_description": "Apache Lucene 10 is here! Discover Lucene 10 release highlights: search parallelism, better I/O performance, and sparse indexing for better CPU and storage efficiency. " + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch .NET client evolution: From NEST to Elastic.Clients.Elasticsearch Learn about the evolution of the Elasticsearch .NET client and the transition from NEST to Elastic.Clients.Elasticsearch. .NET How To FB By: Florian Bernd On April 16, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to .NET client and NEST In the .NET world, integration with Elasticsearch has long been facilitated by the NEST library, which serves as a robust interface for developers to interact with Elasticsearch's powerful search and analytics capabilities. NEST , born out of a need for a native .NET client for Elasticsearch, quickly gained popularity among developers for its rich feature set and seamless integration capabilities. For nearly 14 years and only 8 months after Elasticsearch's first commit NEST has been faithfully tracking Elasticsearch releases. Transitioning from NEST to Elastic.Clients.Elasticsearch As Elasticsearch evolved, maintaining NEST 's complex codebase became increasingly difficult. We recognized the need for a more sustainable approach to client development and went on a journey to redesign the .NET client from the ground up. It took us almost a year to release a first beta version and another year to get close to supporting every single server endpoint. One of the most difficult decisions was to reduce the scope of the library in order to prioritize maintainability instead. Given the size of the Elasticsearch API surface today, it is no longer practical to maintain over 450 endpoints and nearly 3000 types (requests, responses, queries, aggregations, etc.) by hand. To ensure consistent, accurate, and timely alignment between language clients and Elasticsearch, the 8.x clients, and many of the associated types are now automatically code-generated from a shared specification . This is a common solution to maintaining alignment between client and server among SDKs and libraries, such as those for Azure, AWS and the Google Cloud Platform. The Elasticsearch specification was created over 8 years ago by exporting the type mappings from NEST and through the hard work of the clients team we can now use the same specification to create a new .NET client (and clients for multiple other languages like Java, Go, etc.). With the release of version 8.13, the deprecation of NEST was officially announced. As Elasticsearch transitions to Elastic.Clients.Elasticsearch , NEST will gradually phase out, reaching its end-of-life at the close of the year. Developers are strongly encouraged to commence migration efforts early to ensure a smooth transition and mitigate any potential disruptions. 
Embracing Elastic.Clients.Elasticsearch not only ensures compatibility with the latest server features but also future-proofs applications against deprecated functionality. Elastic.Clients.Elasticsearch: features and changes overview Switching to the v8 client Elastic.Clients.Elasticsearch enables access to all the new features of Elasticsearch 8 and also brings numerous modernizations to the library itself but also implies a reduction in convenience features compared to its predecessor. Some of the new core features include the query language ES|QL , modern machine learning (ML) capabilities and improved diagnostics in the form of OpenTelemetry-compatible activities. Starting with version 8.13, Elastic.Clients.Elasticsearch supports almost all server features of Elasticsearch 8. An important breaking change, for example, is related to aggregations. In NEST , the fluent API usage looks like this: while the v8 client requires the following syntax: Migrating from NEST v7 to .NET client v8 A comprehensive migration guide is available here: Migration guide: From NEST v7 to .NET Client v8 . Additional resources Elastic.Clients.Elasticsearch v8 Client on GitHub Elastic.Clients.Elasticsearch v8 Client on NuGet Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction to .NET client and NEST Transitioning from NEST to Elastic.Clients.Elasticsearch Elastic.Clients.Elasticsearch: features and changes overview Migrating from NEST v7 to .NET client v8 Additional resources Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Elasticsearch .NET client evolution: From NEST to Elastic.Clients.Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/net-client-evolution", + "meta_description": "Learn about the evolution of the Elasticsearch .NET client and the transition from NEST to Elastic.Clients.Elasticsearch." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog RaBitQ binary quantization 101 Understand the most critical components of RaBitQ binary quantization, how it works and its benefits. This guide also covers the math behind the quantization and examples. Vector Database Lucene ML Research JW By: John Wagster On October 22, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction As we have discussed previously in Scalar Quantization 101 most embedding models output f l o a t 32 float32 f l o a t 32 vector values, which is often excessive to represent the vector space. Scalar quantization techniques greatly reduce the space needed to represent these vectors. We've also previously talked about Bit Vectors In Elasticsearch and how binary quantization can often be unacceptably lossy. With binary quantization techniques such as those presented in the RaBitQ paper we can address the problems associated with naively quantizing into a bit vector and maintain the quality associated with scalar quantization by more thoughtfully subdividing the space and retaining residuals of the transformation. These newer techniques allow for better optimizations and generally better results over other similar techniques like product quantization (PQ) in distance calculations and a 32x level of compression that typically is not possible with scalar quantization. Here we'll walk through some of the core aspects of binary quantization and leave the mathematical details to the RaBitQ paper . Building the Bit Vectors Because we can more efficiently pre-compute some aspects of the distance computation, we treat the indexing and query construction separately. To start with let's walk through indexing three very simple 2 dimensional vectors, v 1 v_1 v 1 ​ , v 2 v_2 v 2 ​ , and v 3 v_3 v 3 ​ , to see how they are transformed and stored for efficient distance computations at query time. v 1 = [ 0.56 , 0.82 ] v_1 = [0.56, 0.82] v 1 ​ = [ 0.56 , 0.82 ] v 2 = [ 1.23 , 0.71 ] v_2 = [1.23, 0.71] v 2 ​ = [ 1.23 , 0.71 ] v 3 = [ - 3.28 , 2.13 ] v_3 = [\\text{-}3.28, 2.13] v 3 ​ = [ - 3.28 , 2.13 ] Our objective is to transform these vectors into much smaller representations that allow: A reasonable proxy for estimating distance rapidly Some guarantees about how vectors are distributed in the space for better control over the total number of data vectors needed to recall the true nearest neighbors We can achieve this by: Shifting each vector to within a hyper-sphere, in our case a 2d circle, the unit circle Snapping each vector to a single representative point within each region of the circle Retaining corrective factors to better approximate the distance between each vector and the query vector Let's unpack that step by step. Find a Representative Centroid In order to partition each dimension we need to pick a pivot point. For simplicity, we'll select one point to use to transform all of our data vectors. 
Let's continue with our vectors, v 1 v_1 v 1 ​ , v 2 v_2 v 2 ​ , and v 3 v_3 v 3 ​ . We pick their centroid as the pivot point. v 1 = [ 0.56 , 0.82 ] v 2 = [ 1.23 , 0.71 ] v 3 = [ - 3.28 , 2.13 ] c = [ ( 0.56 + 1.23 + - 3.28 ) / 3 , ( 0.82 + 0.71 + 2.13 ) / 3 ] c = [ - 0.49 , 1.22 ] v_1 = [0.56, 0.82] \\newline v_2 = [1.23, 0.71] \\newline v_3 = [\\text{-}3.28, 2.13] \\newline ~\\\\ c = [(0.56 + 1.23 + \\text{-}3.28) / 3, (0.82 + 0.71 + 2.13) / 3] \\newline ~\\\\ c = [\\text{-}0.49, 1.22] v 1 ​ = [ 0.56 , 0.82 ] v 2 ​ = [ 1.23 , 0.71 ] v 3 ​ = [ - 3.28 , 2.13 ] c = [( 0.56 + 1.23 + - 3.28 ) /3 , ( 0.82 + 0.71 + 2.13 ) /3 ] c = [ - 0.49 , 1.22 ] Here's all of those points graphed together: Figure 1: graph of the example vectors and the derived centroid of those three vectors. Each residual vector is then normalized . We'll call these v c 1 v_{c1} v c 1 ​ , v c 2 v_{c2} v c 2 ​ , and v c 3 v_{c3} v c 3 ​ . v c 1 = ( v 1 − c ) / ∥ v 1 − c ∥ v c 2 = ( v 2 − c ) / ∥ v 2 − c ∥ v c 3 = ( v 3 − c ) / ∥ v 3 − c ∥ v_{c1} = (v_1 - c) / \\|v_1 - c\\| \\newline v_{c2} = (v_2 - c) / \\|v_2 - c\\| \\newline v_{c3} = (v_3 - c) / \\|v_3 - c\\| \\newline v c 1 ​ = ( v 1 ​ − c ) /∥ v 1 ​ − c ∥ v c 2 ​ = ( v 2 ​ − c ) /∥ v 2 ​ − c ∥ v c 3 ​ = ( v 3 ​ − c ) /∥ v 3 ​ − c ∥ Let's do the math for one of the vectors together: v c 1 = ( v 1 − c ) / ∥ v 1 − c ∥ v 1 − c = [ 0.56 , 0.82 ] − [ - 0.49 , 1.22 ] = [ 1.05 , - 0.39 ] ∥ v 1 − c ∥ = 1.13 v c 1 = ( v 1 − c ) / ∥ v 1 − c ∥ = ( [ 1.05 , - 0.39 ] ) / ∥ [ 1.05 , - 0.39 ] ∥ = ( [ 1.05 , - 0.39 ] ) / 1.13 = [ 0.94 , - 0.35 ] v_{c1} = (v_1 - c) / \\|v_1 - c\\| \\newline ~\\\\ \\begin{align*} v_1 - c &= [0.56, 0.82] - [\\text{-}0.49, 1.22] \\newline &= [1.05, \\text{-}0.39] \\end{align*} \\newline ~\\\\ \\|v_1 - c\\| = 1.13 \\newline ~\\\\ \\begin{align*} v_{c1} &= (v_1 - c) / \\|v_1 - c\\| \\newline &= ([1.05, \\text{-}0.39]) / \\|[1.05, \\text{-}0.39]\\| \\newline &= ([1.05, \\text{-}0.39]) / 1.13 \\newline &= [0.94, \\text{-}0.35] \\end{align*} \\newline v c 1 ​ = ( v 1 ​ − c ) /∥ v 1 ​ − c ∥ v 1 ​ − c ​ = [ 0.56 , 0.82 ] − [ - 0.49 , 1.22 ] = [ 1.05 , - 0.39 ] ​ ∥ v 1 ​ − c ∥ = 1.13 v c 1 ​ ​ = ( v 1 ​ − c ) /∥ v 1 ​ − c ∥ = ([ 1.05 , - 0.39 ]) /∥ [ 1.05 , - 0.39 ] ∥ = ([ 1.05 , - 0.39 ]) /1.13 = [ 0.94 , - 0.35 ] ​ And for each of the remaining vectors: v c 2 = [ 0.96 , - 0.28 ] v c 3 = [ - 0.95 , 0.31 ] v_{c2} = [0.96, \\text{-}0.28] \\newline v_{c3} = [\\text{-}0.95, 0.31] v c 2 ​ = [ 0.96 , - 0.28 ] v c 3 ​ = [ - 0.95 , 0.31 ] Let's see that transformation and normalization all together: Figure 2: Animation of the example vectors and the derived centroid transformed within the unit circle. As you may be able to see our points now sit on the unit circle around the centroid as if we had placed the centroid at 0,0. 1 Bit, 1 Bit Only Please With our data vectors centered and normalized, we can apply standard binary quantization encoding each transformed vector component with a 0 if it is negative and 1 if it is positive. In our 2 dimensional example this splits our unit circle into four quadrants and the binary vectors corresponding to v 1 c v_{1c} v 1 c ​ , v c 2 v_{c2} v c 2 ​ , and v c 3 v_{c3} v c 3 ​ become r 1 = [ 1 , 0 ] , r 2 = [ 1 , 0 ] , r 3 = [ 0 , 1 ] r_1 = [1, 0], r_2 = [1, 0], r_3 = [0, 1] r 1 ​ = [ 1 , 0 ] , r 2 ​ = [ 1 , 0 ] , r 3 ​ = [ 0 , 1 ] , respectively. 
We finish quantizing each data vector by snapping it to a representative point within each region specifically picking a point equidistant from each axis on the unit circle: ± 1 d \\pm \\frac{1}{\\sqrt{d}} ± d ​ 1 ​ We'll denote each quantized vector as v ‾ 1 \\overline{v}_1 v 1 ​ , v ‾ 2 \\overline{v}_2 v 2 ​ , and v ‾ 3 \\overline{v}_3 v 3 ​ . So for instance if we snap v c 1 v_{c1} v c 1 ​ to its representative point within its region r 1 r_1 r 1 ​ we get: v ‾ 1 = 1 d ( 2 r 1 − 1 ) = 1 2 [ 1 , − 1 ] = [ 1 2 , − 1 2 ] \\begin{align*} \\overline{v}_1 &= \\frac{1}{\\sqrt{d}} (2 r_1 - 1) \\newline &= \\frac{1}{\\sqrt{2}} [1, -1] \\newline &= [\\frac{1}{\\sqrt{2}}, -\\frac{1}{\\sqrt{2}}] \\end{align*} v 1 ​ ​ = d ​ 1 ​ ( 2 r 1 ​ − 1 ) = 2 ​ 1 ​ [ 1 , − 1 ] = [ 2 ​ 1 ​ , − 2 ​ 1 ​ ] ​ And here are the quantized forms for the other data vectors, v 2 v_2 v 2 ​ and v 3 v_3 v 3 ​ : v ‾ 2 = [ 1 2 , − 1 2 ] v ‾ 3 = [ − 1 2 , 1 2 ] \\overline{v}_2 = [\\frac{1}{\\sqrt{2}}, -\\frac{1}{\\sqrt{2}}] \\newline \\overline{v}_3 = [-\\frac{1}{\\sqrt{2}}, \\frac{1}{\\sqrt{2}}] v 2 ​ = [ 2 ​ 1 ​ , − 2 ​ 1 ​ ] v 3 ​ = [ − 2 ​ 1 ​ , 2 ​ 1 ​ ] Picking these representative points has some nice mathematical properties as outlined in the RaBitQ paper . And are not unlike the codebooks seen in product quantization (PQ) . Figure 3: Binary quantized vectors within a region snapped to representative points. At this point we now have a 1 bit approximation; albeit a somewhat fuzzy one that we can use to do distance comparisons. Clearly, v ‾ 1 \\overline{v}_1 v 1 ​ and v ‾ 2 \\overline{v}_2 v 2 ​ are now identical in this quantized state, which is not ideal and is a similar problem experienced when we discussed encoding float vectors as Bit Vectors In Elasticsearch. The elegance of this is that at query time we can use something akin to a dot product to compare each data vector and each query vector rapidly for an approximation of distance. We'll see that in more detail when we discuss handling the query. The Catch As we saw above, a lot of information is lost when converting to bit vectors. We'll need some additional information to help compensate for the loss and correct our distance estimations. In order to recover fidelity we'll store the distance from each vector to the centroid and the projection (dot product) of the vector (e.g v c 1 v_{c1} v c 1 ​ ) with its quantized form (e.g. v 1 ‾ \\overline{v_1} v 1 ​ ​ ) as two f l o a t 32 float32 f l o a t 32 values. The Euclidean distance to the centroid is straight-forward and we already computed it when quantizing each vector: ∥ v 1 − c ∥ = 1.13 ∥ v 2 − c ∥ = 1.79 ∥ v 3 − c ∥ = 2.92 \\|v_1 - c\\| = 1.13 \\newline \\|v_2 - c\\| = 1.79 \\newline \\|v_3 - c\\| = 2.92 ∥ v 1 ​ − c ∥ = 1.13 ∥ v 2 ​ − c ∥ = 1.79 ∥ v 3 ​ − c ∥ = 2.92 Precomputing distances from each data vector to the centroid restores the transformation of centering the vectors. Similarly, we'll compute the distance from the query to the centroid. Intuitively the centroid acts as a go-between instead of directly computing the distance between the query and data vector. 
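Continuing the sketch above under the same assumptions, snapping each binary code back to its representative point on the unit circle is a one-liner, and it makes the collision between the first two vectors easy to see:

```python
import numpy as np

d = 2                                        # dimensionality
bits = np.array([[1, 0], [1, 0], [0, 1]])    # r_1, r_2, r_3 from the previous step

# Map each bit to +/- 1/sqrt(d): a 1 becomes +1/sqrt(d), a 0 becomes -1/sqrt(d).
quantized = (2 * bits - 1) / np.sqrt(d)      # quantized v_1, v_2, v_3

print(quantized)
# [[ 0.7071 -0.7071]
#  [ 0.7071 -0.7071]    <- v_1 and v_2 collide in the quantized space, as noted above
#  [-0.7071  0.7071]]
```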
The dot product of the vector and the quantized vector is then: v c 1 ⋅ v ‾ 1 = v c 1 ⋅ 1 2 ( 2 r 1 − 1 ) = [ 0.94 , - 0.35 ] ⋅ [ 1 2 , − 1 2 ] = 0.90 \\begin{align*} v_{c1} \\cdot \\overline{v}_1 &= v_{c1} \\cdot \\frac{1}{\\sqrt{2}} (2 r_1 - 1) \\\\ &= [0.94, \\text{-}0.35] \\cdot \\left[\\frac{1}{\\sqrt{2}}, -\\frac{1}{\\sqrt{2}}\\right] \\\\ &= 0.90 \\end{align*} v c 1 ​ ⋅ v 1 ​ ​ = v c 1 ​ ⋅ 2 ​ 1 ​ ( 2 r 1 ​ − 1 ) = [ 0.94 , - 0.35 ] ⋅ [ 2 ​ 1 ​ , − 2 ​ 1 ​ ] = 0.90 ​ And for our other two vectors: v c 2 ⋅ v ‾ 2 = 0.95 v c 3 ⋅ v ‾ 3 = 0.89 v_{c2} \\cdot \\overline{v}_2 = 0.95 \\newline v_{c3} \\cdot \\overline{v}_3 = 0.89 v c 2 ​ ⋅ v 2 ​ = 0.95 v c 3 ​ ⋅ v 3 ​ = 0.89 The dot product between the quantized vector and the original vector, being the second corrective factor, captures for how far away the quantized vector is from its original position. In section 3.2 the RaBitQ paper shows there is a bias correcting the dot product between the quantized data and query vectors in a naive fashion. This factor exactly compensates it. Keep in mind we are doing this transformation to reduce the total size of the data vectors and reduce the cost of vector comparisons. These corrective factors while seemingly large in our 2d example become insignificant as vector dimensionality increases. For example, a 1024 dimensional vector if stored as f l o a t 32 float32 f l o a t 32 requires 4096 bytes. If stored with this bit compression and corrective factors, only 136 bytes are required. To better understand why we use these factors refer to the RaBitQ paper . It gives an in-depth treatment of the math involved. The Query q = [ 0.68 , - 1.72 ] q = [0.68, \\text{-}1.72] q = [ 0.68 , - 1.72 ] To be able to compare our quantized data vectors to a query vector we must first get the query vector into a quantized form and shift it relative to the unit circle. We'll refer to the query vector as q q q , the transformed vector as q c q_c q c ​ , and the scalar quantized vector as q ‾ \\overline{q} q ​ . q − c = ( 0.68 − - 0.49 ) , ( - 1.72 − 1.22 ) = [ 1.17 , − 2.95 ] q c = ( q − c ) / ∥ q − c ∥ = [ 1.17 , − 2.95 ] / 3.17 = [ 0.37 , − 0.92 ] \\begin{align*} q - c &= (0.68 - \\text{-}0.49), (\\text{-}1.72 - 1.22) \\newline &= [1.17, −2.95] \\end{align*} \\newline ~\\\\ \\begin{align*} q_c &= (q - c) / \\|q - c\\| \\newline &= [1.17, −2.95] / 3.17 \\newline &= [0.37, −0.92] \\end{align*} q − c ​ = ( 0.68 − - 0.49 ) , ( - 1.72 − 1.22 ) = [ 1.17 , − 2.95 ] ​ q c ​ ​ = ( q − c ) /∥ q − c ∥ = [ 1.17 , − 2.95 ] /3.17 = [ 0.37 , − 0.92 ] ​ Next we perform Scalar Quantization on the query vector down to 4 bits; we'll call this vector q ‾ \\overline{q} q ​ . It's worth noting that we do not quantize down to a bit representation but instead maintain an i n t 4 int4 in t 4 scalar quantization, q ‾ \\overline{q} q ​ as an int4 byte array, for estimating the distance. We can take advantage of this asymmetric quantization to retain more information without additional storage. 
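The second corrective factor and the query-side centering follow the same pattern. The snippet below is again just a sketch of the worked numbers above (rounded inputs, so the printed values differ slightly from the article's):

```python
import numpy as np

d = 2
centroid = np.array([-0.49, 1.22])

# Normalized data vectors and their quantized forms from the earlier steps.
normalized = np.array([[0.94, -0.35], [0.96, -0.28], [-0.95, 0.31]])
quantized = np.array([[1, -1], [1, -1], [-1, 1]]) / np.sqrt(d)

# Corrective factor 2: projection of each centered vector onto its quantized form
# (one float per data vector, stored alongside the bits at index time).
corrections = np.sum(normalized * quantized, axis=1)

# Center and normalize the query exactly like the data vectors.
q = np.array([0.68, -1.72])
q_centered = q - centroid                    # ~[1.17, -2.94]
q_dist = np.linalg.norm(q_centered)          # ~3.17, the query's distance to the centroid
q_c = q_centered / q_dist                    # ~[0.37, -0.92]

print(corrections, q_dist, q_c)
```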
l o w e r = - 0.92 u p p e r = 0.37 w i d t h = ( u p p e r − l o w e r ) / ( 2 4 − 1 ) ; = ( 0.37 − - 0.92 ) / 15 ; = 0.08 q ‾ = ⌊ ( q c − l o w e r ) / w i d t h ⌉ = ⌊ ( [ 0.37 , − 0.92 ] − [ - 0.92 , - 0.92 ] ) / 0.08 ⌉ = [ 15 , 0 ] lower = \\text{-}0.92 \\newline upper = 0.37 \\newline ~\\\\ \\begin{align*} width &= (upper - lower) / (2^4 - 1); \\newline &= (0.37 - \\text{-}0.92) / 15; \\newline &= 0.08 \\end{align*} \\newline ~\\\\ \\newline \\begin{align*} \\overline{q} &= \\lfloor{(q_c - lower) / width}\\rceil \\newline &= \\lfloor{([0.37, −0.92] - [\\text{-}0.92, \\text{-}0.92]) / 0.08}\\rceil \\newline &= [15, 0] \\end{align*} l o w er = - 0.92 u pp er = 0.37 w i d t h ​ = ( u pp er − l o w er ) / ( 2 4 − 1 ) ; = ( 0.37 − - 0.92 ) /15 ; = 0.08 ​ q ​ ​ = ⌊ ( q c ​ − l o w er ) / w i d t h ⌉ = ⌊ ([ 0.37 , − 0.92 ] − [ - 0.92 , - 0.92 ]) /0.08 ⌉ = [ 15 , 0 ] ​ Figure 4: Query with centroid transformation applied. As you can see because we have only 2 dimensions our quantized query vector now consists of two values at the ceiling and floor of the i n t 4 int4 in t 4 range. With longer vectors you would see a variety of int4 values with one of them being the ceiling and one of them being the floor. Now we are ready to perform a distance calculation comparing each indexed data vector with this query vector. We do this by summing up each dimension in our quantized query that's shared with any given data vector. Basically, a plain old dot-product, but with bits and bytes. q ‾ ⋅ r 1 = [ 15 , 0 ] ⋅ [ 1 , 0 ] = 15 \\begin{align*} \\overline{q} \\cdot r_1 &= [15, 0] \\cdot [1, 0] \\newline &= 15 \\end{align*} q ​ ⋅ r 1 ​ ​ = [ 15 , 0 ] ⋅ [ 1 , 0 ] = 15 ​ We can now apply corrective factors to unroll the quantization and get a more accurate reflection of the estimated distance. To achieve this we'll collect the upper and lower bound from the quantized query, which we derived when doing the scalar quantization of the query. Additionally we need the distance from the query to the centroid. Since we computed the distance between a vector and a centroid previously we'll just include that distance here for reference: ∥ q − c ∥ = 3.17 \\|q - c\\| = 3.17 ∥ q − c ∥ = 3.17 Estimated Distance Alright! We have quantized our vectors and collected corrective factors. Now we are ready to compute the estimated distance between v 1 v_1 v 1 ​ and q q q . Let's transform our Euclidean distance into an equation that has much more computationally friendly terms: d i s t ( v 1 , q ) = ∥ v 1 − q ∥ = ∥ ( v 1 − c ) − ( q − c ) ∥ 2 = ∥ v 1 − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 − c ∥ × ∥ q − c ∥ × ( q c ⋅ v c 1 ) \\begin{align*} dist(v_1, q) &= \\|v_1 - q\\| \\newline &= \\sqrt{\\|(v_1 - c) - (q - c)\\|^2} \\newline &= \\sqrt{\\|v_1 - c\\|^2 + \\|q - c\\|^2 - 2 \\times \\|v_1 - c\\| \\times \\|q - c\\| \\times (q_c \\cdot v_{c1})} \\end{align*} d i s t ( v 1 ​ , q ) ​ = ∥ v 1 ​ − q ∥ = ∥ ( v 1 ​ − c ) − ( q − c ) ∥ 2 ​ = ∥ v 1 ​ − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 ​ − c ∥ × ∥ q − c ∥ × ( q c ​ ⋅ v c 1 ​ ) ​ ​ In this form, most of these factors we derived previously, such as ∥ v 1 − c ∥ \\|v_1-c\\| ∥ v 1 ​ − c ∥ , and notably can be pre-computed prior to query or are not direct comparisons between the query vector and any given data vector such as v 1 v_1 v 1 ​ . We however still need to compute q c ⋅ v c 1 q_c \\cdot v_{c1} q c ​ ⋅ v c 1 ​ . We can utilize our corrective factors and our quantized binary distance metric q ‾ ⋅ r 1 \\overline{q} \\cdot r_1 q ​ ⋅ r 1 ​ to estimate this value reasonably and quickly. 
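The int4 scalar quantization of the query and the cheap quantized dot product described above look roughly like this (same toy values; in practice the dot product is done with packed bits and bytes, this sketch only shows the arithmetic):

```python
import numpy as np

q_c = np.array([0.37, -0.92])                # centered, normalized query from above
bits = np.array([[1, 0], [1, 0], [0, 1]])    # binary codes r_1, r_2, r_3

# Scalar-quantize the query to 4 bits (integer values 0..15).
lower, upper = q_c.min(), q_c.max()
width = (upper - lower) / (2 ** 4 - 1)                   # ~0.08
q_int4 = np.round((q_c - lower) / width).astype(int)     # [15, 0]

# The fast distance proxy: an integer dot product between the int4 query
# and each data vector's bit code (15 for r_1 and r_2, 0 for r_3 here).
proxies = bits @ q_int4
print(width, q_int4, proxies)
```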
Let's walk through that. q c ⋅ v c 1 ≈ ( q c ⋅ v ‾ 1 ) / ( v c 1 ⋅ v ‾ 1 ) q_c \\cdot v_{c1} \\approx (q_c \\cdot \\overline{v}_1) / (v_{c1} \\cdot \\overline{v}_1) q c ​ ⋅ v c 1 ​ ≈ ( q c ​ ⋅ v 1 ​ ) / ( v c 1 ​ ⋅ v 1 ​ ) Let's start by estimating q c ⋅ v ‾ 1 q_c \\cdot \\overline{v}_1 q c ​ ⋅ v 1 ​ which requires this equation which essentially unrolls our transformations using the representative points we defined earlier: q c ⋅ v ‾ 1 ≈ ( l o w e r + w i d t h ⋅ q ‾ ) ⋅ ( 1 d ( 2 r 1 − 1 ) ) q_c \\cdot \\overline{v}_1 \\approx (lower + width \\cdot \\overline{q}) \\cdot (\\frac{1}{\\sqrt{d}}(2r_1 - 1)) q c ​ ⋅ v 1 ​ ≈ ( l o w er + w i d t h ⋅ q ​ ) ⋅ ( d ​ 1 ​ ( 2 r 1 ​ − 1 )) Specifically, 1 d ( 2 r 1 − 1 ) \\frac{1}{\\sqrt{d}}(2r_1-1) d ​ 1 ​ ( 2 r 1 ​ − 1 ) maps the binary values back to our representative point and l o w e r + w i d t h ⋅ q ‾ lower + width \\cdot \\overline{q} l o w er + w i d t h ⋅ q ​ undoes the shift and scale used to compute the scalar quantized query components. We can rewrite this to make it more computationally friendly like this. First, though, let's define a couple of helper variables: total number of 1's bits in r 1 r_1 r 1 ​ as v ‾ b 1 \\overline{v}_{b1} v b 1 ​ , 1 1 1 in this case total number of all quantized values in q ‾ b \\overline{q}_b q ​ b ​ as q ‾ b \\overline{q}_b q ​ b ​ , 15 15 15 in this case q c ⋅ v ‾ 1 ≈ 2 × w i d t h d × ( q ‾ ⋅ r 1 ) + 2 × l o w e r d × v ‾ b 1 − w i d t h d × q ‾ b − d × l o w e r ≈ 2 × 0.08 2 × 15 + 2 × - 0.92 2 × 1 − 0.08 2 × 15 − 2 × - 0.92 ≈ 0.92 \\begin{align*} q_c \\cdot \\overline{v}_1 &\\approx \\frac{2 \\times width}{\\sqrt{d}} \\times (\\overline{q} \\cdot r_1) + \\frac{2 \\times lower}{\\sqrt{d}} \\times \\overline{v}_{b1} - \\frac{width}{\\sqrt{d}} \\times \\overline{q}_b - \\sqrt{d} \\times lower \\newline &\\approx \\frac{2 \\times 0.08}{\\sqrt{2}} \\times 15 + \\frac{2 \\times \\text{-}0.92}{\\sqrt{2}} \\times 1 - \\frac{0.08}{\\sqrt{2}} \\times 15 - \\sqrt{2} \\times \\text{-}0.92 \\newline &\\approx 0.92 \\end{align*} \\newline q c ​ ⋅ v 1 ​ ​ ≈ d ​ 2 × w i d t h ​ × ( q ​ ⋅ r 1 ​ ) + d ​ 2 × l o w er ​ × v b 1 ​ − d ​ w i d t h ​ × q ​ b ​ − d ​ × l o w er ≈ 2 ​ 2 × 0.08 ​ × 15 + 2 ​ 2 × - 0.92 ​ × 1 − 2 ​ 0.08 ​ × 15 − 2 ​ × - 0.92 ≈ 0.92 ​ With this value and v c 1 ⋅ v ‾ 1 v_{c1} \\cdot \\overline{v}_1 v c 1 ​ ⋅ v 1 ​ , which we precomputed when indexing our data vector, we can then plug those values in to compute an approximation of q c ⋅ v c 1 q_c \\cdot v_{c1} q c ​ ⋅ v c 1 ​ : q c ⋅ v c 1 ≈ ( q c ⋅ v ‾ 1 ) / ( v c 1 ⋅ v ‾ 1 ) ≈ 0.92 / 0.90 ≈ 1.01 \\begin{align*} q_c \\cdot v_{c1} &\\approx (q_c \\cdot \\overline{v}_1) / (v_{c1} \\cdot \\overline{v}_1) \\newline &\\approx 0.92 / 0.90 \\newline &\\approx 1.01 \\end{align*} q c ​ ⋅ v c 1 ​ ​ ≈ ( q c ​ ⋅ v 1 ​ ) / ( v c 1 ​ ⋅ v 1 ​ ) ≈ 0.92/0.90 ≈ 1.01 ​ Finally, let's plug this into our larger distance equation noting that we are using an estimate for q c ⋅ v c 1 q_c \\cdot v_{c1} q c ​ ⋅ v c 1 ​ : d i s t ( v 1 , q ) = ∥ v 1 − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 − c ∥ × ∥ q − c ∥ × ( q c ⋅ v c 1 ) e s t _ d i s t ( v 1 , q ) = 1.1 3 2 + 3.1 7 2 − 2 × 1.13 × 3.17 × 1.01 dist(v_1, q) = \\sqrt{\\|v_1-c\\|^2 + \\|q-c\\|^2 - 2 \\times \\|v_1-c\\| \\times \\|q-c\\| \\times (q_c \\cdot v_{c1})} \\newline est\\_dist(v_1, q) = \\sqrt{1.13^2 + 3.17^2 − 2 \\times 1.13 \\times 3.17 \\times 1.01} d i s t ( v 1 ​ , q ) = ∥ v 1 ​ − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 ​ − c ∥ × ∥ q − c ∥ × ( q c ​ ⋅ v c 1 ​ ) ​ es t _ d i s t ( v 1 ​ , q ) = 1.1 3 2 + 3.1 7 2 − 2 × 1.13 × 
3.17 × 1.01 ​ With all of the corrections applied we are left with a reasonable estimate of the distance between two vectors. For instance in this case our estimated distances between each of our original data vectors, v 1 v_1 v 1 ​ , v 2 v_2 v 2 ​ , v 3 v_3 v 3 ​ , and q q q compared to the true distances are: e s t _ d i s t ( v 1 , q ) = 2.02 e s t _ d i s t ( v 2 , q ) = 1.15 e s t _ d i s t ( v 3 , q ) = 6.15 est\\_dist(v_1, q) = 2.02 \\newline est\\_dist(v_2, q) = 1.15 \\newline est\\_dist(v_3, q) = 6.15 \\newline ~\\\\ es t _ d i s t ( v 1 ​ , q ) = 2.02 es t _ d i s t ( v 2 ​ , q ) = 1.15 es t _ d i s t ( v 3 ​ , q ) = 6.15 e u c l _ d i s t ( v 1 , q ) = 2.55 e u c l _ d i s t ( v 2 , q ) = 2.50 e u c l _ d i s t ( v 3 , q ) = 5.52 eucl\\_dist(v_1, q) = 2.55 \\newline eucl\\_dist(v_2, q) = 2.50 \\newline eucl\\_dist(v_3, q) = 5.52 e u c l _ d i s t ( v 1 ​ , q ) = 2.55 e u c l _ d i s t ( v 2 ​ , q ) = 2.50 e u c l _ d i s t ( v 3 ​ , q ) = 5.52 For details on how the linear algebra is derived or simplified when applied refer to the RaBitQ paper . Re-ranking As you can see from the results in the prior section, these estimated distances are indeed estimates. Binary quantization produces vectors whose distance calculations are, even with the extra corrective factors, only an approximation of the distance between vectors. In our experiments we were able to achieve high recall by involving a multi-stage process. This confirms the findings in the RaBitQ paper . Therefore to achieve high quality results, a reasonable sample of vectors returned from binary quantization then must be re-ranked with a more exact distance computation. In practice this subset of candidates can be small achieving typically high > 95% recall with 100 or less candidates for large datasets (>1m). With RaBitQ results are re-ranked continually as part of the search operation. In our experiments to achieve a more scalable binary quantization we decoupled the re-ranking step. While RaBitQ is able to maintain a better list of top N N N candidates by re-ranking while searching, it is at the cost of constantly paging in full f l o a t 32 float32 f l o a t 32 vectors, which is untenable for some larger production-like datasets. Conclusion Whew! You made it! This blog is indeed a big one. We are extremely excited about this new algorithm as it can alleviate many of the pain points of Product Quantization (e.g. code-book building cost, distance estimation slowness, etc.) and provide excellent recall and speed. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. 
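Putting the worked example above together end to end, the estimated distance between v_1 and q can be reproduced from the precomputed corrective factors alone. This is a sketch of the formulas in this post, not the Lucene/Elasticsearch code:

```python
import math

d = 2

# Precomputed at index time for v_1.
dist_v1_centroid = 1.13        # ||v_1 - c||
correction_v1 = 0.90           # v_c1 . quantized v_1
bit_count_v1 = 1               # number of 1 bits in r_1

# Computed once per query.
dist_q_centroid = 3.17         # ||q - c||
lower, upper = -0.92, 0.37     # int4 quantization bounds of the query
width = (upper - lower) / 15
q_sum = 15                     # sum of the int4 query components

# Computed per (query, data vector) pair: the integer dot product.
q_dot_r1 = 15

# Estimate q_c . quantized v_1 by undoing the shift/scale of both quantizations.
qc_dot_vbar = (
    (2 * width / math.sqrt(d)) * q_dot_r1
    + (2 * lower / math.sqrt(d)) * bit_count_v1
    - (width / math.sqrt(d)) * q_sum
    - math.sqrt(d) * lower
)

# Correct for how far the quantized point sits from v_c1, then plug into the
# expanded Euclidean distance formula.
qc_dot_vc1 = qc_dot_vbar / correction_v1
est = math.sqrt(dist_v1_centroid ** 2 + dist_q_centroid ** 2
                - 2 * dist_v1_centroid * dist_q_centroid * qc_dot_vc1)
print(qc_dot_vbar, qc_dot_vc1, est)   # ~0.92, ~1.01, ~2.0
```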
AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to Introduction Building the Bit Vectors Find a Representative Centroid 1 Bit, 1 Bit Only Please The Catch Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "RaBitQ binary quantization 101 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/rabitq-explainer-101", + "meta_description": "Understand the most critical components of RaBitQ binary quantization, how it works and its benefits, including the math behind quantization and examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Getting Started with the Elastic Chatbot RAG app using Vertex AI running on Google Kubernetes Engine Learn how to configure the Elastic Chatbot RAG app using Vertex AI and run it on Google Kubernetes Engine (GKE). Integrations How To JS By: Jonathan Simon On April 4, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. I’ve been having fun playing around with the Elastic Chatbot RAG example app . It’s open source and it’s a great way to get started with understanding Retrieval Augmented Generation (RAG) and trying your hand at running a RAG app. The app supports integration with a variety of GenAI Large Language Models (LLMs) like OpenAI, AWS Bedrock, Azure OpenAI, Google Vertex AI, Mistral AI, and Cohere. You can run the app on your local computer using Python or using Docker . You can also run the app in Kubernetes. I recently deployed the app to Google Kubernetes Engine (GKE) configured to use Google Vertex AI as the backing LLM. I was able to do it all just using a browser, Google Cloud, and Elastic Cloud. This blog post will walk you through the step-by-step process that I followed to configure the Chatbot RAG app to use Vertex AI and how to run it on GKE. Enable Vertex API Since this blog post is focused on running the Elastic Chatbot RAG app with Vertex AI as the backing LLM, the very first step is to go to Google Cloud and enable the Vertex AI API . If this is your first time using Vertex AI, you'll see an Enable all recommended APIs button. Click that button to enable the necessary Google Cloud APIs to use Vertex AI. Once you've done that, you should see that the Vertex AI API is now enabled. 
Use Google Cloud Shell Editor to clone the Chatbot RAG app Now that you’ve got the Vertex AI API enabled, the next step is to clone the code for the Chatbot RAG app. Google Cloud has the perfect tool for doing this right in your browser: Cloud Shell Editor. 1. Open Google Cloud Shell Editor . 2. Open your terminal in Cloud Shell Editor. Click the Terminal menu and select New Terminal. 3. Clone the Chatbot RAG app by running the following command in the terminal. 4. Change directory to the Chatbot RAG app’s directory using the following command. Use Google Cloud Shell Editor to create an app configuration file The app needs to access Elastic Cloud and Vertex AI and it does so using configuration values that are stored in a configuration file. The configuration file for the app should have the filename .env and you will create it now. The example app includes an example configuration file named env.example that you can copy to create a new file. 1. Create a .env file that will contain the app’s configuration values using the following command: 2. Click the View menu and select Toggle Hidden Files . Files like .env are hidden in Cloud Shell Editor by default. 3. Open the .env file for editing. Find the line that sets the ELASTICSEARCH_URL value. That’s where you’ll make your first edit. Elastic Cloud - Create Deployment The Chatbot RAG app needs an Elasticsearch backend that will power the retrieval augmentation part of the RAG app. So the next step is to create an Elastic Cloud deployment with Elasticsearch and ML enabled. Once the deployment is ready, copy the Elasticsearch Endpoint URL to add it to the app’s .env configuration file. Create an Elastic Cloud deployment. Copy the Elasticsearch Endpoint URL. Use Google Cloud Shell Editor to update the .env configuration file with Elasticsearch URL Add Elasticsearch Endpoint URL to the .env file. Comment out unused configuration lines. Uncomment the line where the ELASTICSEARCH_API_KEY is set. Elastic Cloud - Create API Key and add its value to .env configuration file Jumping back into the Elastic Cloud deployment, click the Create API Key button to create a new API Key that will be used by the app to access Elasticsearch running in your deployment. Paste the copied API Key into your .env configuration file using Google Cloud Shell Editor. Create an Elastic Cloud API Key. Copy the Key’s encoded value and add it to the app’s .env configuration file in Google Cloud Shell Editor. Use Google Cloud Shell Editor to update the .env configuration file to use Vertex AI Moving down in the .env configuration file, find the lines to configure a connection to Vertex AI and uncomment them. The first custom values that you'll need to set are GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_REGION . Set GOOGLE_CLOUD_PROJECT to your Google Cloud Project ID, which you can find right on the welcome page for your Google Cloud project . Set GOOGLE_CLOUD_REGION to one of the available regions supported by Vertex AI that you’d like to use. For this blog post, we used us-central1 . Uncomment the Vertex AI lines in the .env configuration file. Set GOOGLE_CLOUD_PROJECT to your Google Cloud Project ID in the .env configuration file. Set GOOGLE_CLOUD_REGION in the .env configuration file to one of the available regions supported by Vertex AI . Save the changes to the .env configuration file. Google Cloud IAM - Create Service Account and download its Key file Now it’s time to set up the app’s access to Vertex AI and GKE. 
You can do this by creating a Google IAM Service Account and assigning it the Roles to grant the necessary permissions. 1. Create a Service Account with the IAM Roles necessary to access Vertex AI and GKE. Add the following IAM Roles: Vertex AI Custom Code Service Agent Kubernetes Engine Default Node Service Account 2. Create a Service Account Key and download it to your local computer. Google Kubernetes Engine - Create cluster Google Kubernetes Engine (GKE) is where you’re going to deploy and run the Chatbot RAG app. GKE is the gold standard of managed Kubernetes, providing a super scalable infrastructure for running your applications. While creating a new GKE cluster, in the “create cluster” dialog, edit the Advanced settings > Security setting to use the Service Account you created in the previous step. Create a new GKE cluster. Use the Service Account created previously, within Advanced settings > Security , when creating the cluster. Google Cloud Shell Editor - Upload Google Service Account Key file Back in Google Cloud Shell Editor you can now complete the configuration of the app's settings by adding the Google Service Account key you previously downloaded to your local computer. Click the Cloud Shell Editor’s More button to upload and add the Google Cloud Service Account key file to the app. Upload the Google Cloud Service Account key file using Cloud Shell Editor. Save the file to the top level directory of the Chatbot RAG app. Google Cloud Shell Editor - Deploy app to Google Kubernetes Engine Everything for the app’s configuration is in place, so you can now deploy the app to GKE. Connect the Cloud Shell terminal to your GKE cluster using the gcloud command line tool. Once you’re connected to the cluster, you can use the kubectl command line tool to add the configuration values from your .env configuration file to your cluster. Next, use kubectl to add the Google Cloud Service Account key file to your cluster. Then, use kubectl to deploy the app to GKE. 1. Connect the Cloud Shell terminal to your GKE cluster using gcloud. Replace example-project in the command with your Google Cloud Project ID found on the welcome page for your Google Cloud project . 2. Add your .env configuration file values to your cluster using kubectl. 3. Add the Google Cloud Service Account key file to your cluster using kubectl . 4. Deploy the app to your cluster using kubectl . This command will create a new Elasticsearch index in Elastic Cloud with sample data, initialize the frontend and backend of the app with the values that you provided in the .env file and then deploy the app to the GKE cluster. It will take a few minutes for the app to be deployed. You can use the GKE cluster’s details page to watch its status. Google Kubernetes Engine - Expose deployed app The final required step is to expose the app in GKE so it's viewable on the Internet and in your browser. You can do this in Google Cloud’s GKE Workloads , which is where your deployed app will appear as chatbot-rag-app in the list of running GKE workloads. Select your workload by clicking on its workload Name link. In the details page of your app’s workload, use the Actions menu to select the Expose action. In the Expose dialog, set the Target port 1 to 4000 which is the port that the Chatbot RAG app is configured to run on in the k8s-manifest.yml file that was used for its deployment to GKE. Select the chatbot-rag-app in GKE Workloads. Use the Expose action from the Actions menu to expose the app. 
In the Expose dialog, set the Target port 1 to 4000 . Try out the app After clicking the Expose button for the workload, you’ll be taken to the workload’s Service Details page in GKE. Once the exposed app is ready, you'll see External Endpoints displayed along with a linked IP address. Click the IP address to try out the Chatbot RAG app. Elastic Cloud is your starting point for GenAI RAG apps Thanks for reading. Check out a guided tour of all the steps included in this blog post. Get started with building GenAI RAG apps today and give Elastic Cloud a try. Read to explore more? Try a hands-on tutorial where you can build a RAG app in a sandbox environment. To learn more about using RAG for real world applications, see our recent blog post series GenAI for customer support . Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Enable Vertex API Use Google Cloud Shell Editor to clone the Chatbot RAG app Use Google Cloud Shell Editor to create an app configuration file Elastic Cloud - Create Deployment Use Google Cloud Shell Editor to update the Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Getting Started with the Elastic Chatbot RAG app using Vertex AI running on Google Kubernetes Engine - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-rag-chatbot-vertex-ai-gke", + "meta_description": "Learn how to configure the Elastic Chatbot RAG app using Vertex AI and run it on Google Kubernetes Engine (GKE)." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Building Elastic Cloud Serverless Explore the architecture of Elastic Cloud Serverless and key design and scalability decisions we made along the way of building it. Elastic Cloud Serverless JT By: Jason Tedor On May 15, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. This blog explores the architectural decisions we made along the journey of building Elastic Cloud Serverless, including key design and scalability decisions. Architecture of Elastic Cloud Serverless In October 2022 we introduced the Stateless architecture of Elasticsearch . Our primary goal with that initiative was to evolve Elasticsearch to take advantage of operational, performance, and cost efficiencies offered by cloud-native services. That initiative became part of a larger endeavor that we recently announced called the Search AI Lake , and serves as the foundation for our new Elastic Cloud Serverless offering. In this endeavor, we aimed not only to make Elastic Stack products such as Elasticsearch and Kibana more cloud-native, but also their orchestration too. We designed and built a new backend platform powered by Kubernetes for orchestrating Elastic Cloud Serverless projects, and evolved the Elastic Stack products to be easier for us to orchestrate in Kubernetes. In this article, we'd like to detail a few of the architectural decisions we made along the way. In future articles, we will dive deeper into some of these aspects. One of the main reasons that we settled on Kubernetes to power the backend is due to the wealth of resources in Kubernetes for solving container lifecycle management, scaling, resiliency, and resource management issues. We established an early principle of \"doing things the native Kubernetes way\", even if that meant non-trivial evolutions of Elastic Stack products. We have built a variety of Kubernetes-native services for managing, observing, and orchestrating physical instances of Elastic Stack products such as Elasticsearch and Kibana. This includes custom controllers/operators for provisioning, managing software updates, and autoscaling; and services for authentication, managing operational backups, and metering. For a little bit of background, our backend architecture has two high-level components. Control Plane: This is the user-facing management layer. We provide UIs and APIs for users to manage their Elastic Cloud Serverless projects. This is where users can create new projects, control who has access to their projects, and get an overview of their projects. Data Plane: This is the infrastructure layer that powers the Elastic Cloud Serverless projects, and the layer that users interact with when they want to use their projects. The Control Plane is a global component, and the Data Plane consists of multiple \"regional components\". These are individual Kubernetes clusters in individual Cloud Service Provider (CSP) regions. Key design decisions in building Elastic Cloud Serverless Scale Kubernetes horizontally Our Data Plane will be deployed across AWS, Azure, and Google Cloud. Within each major CSP, we will operate in several CSP regions. Rather than vertically scaling up massive Kubernetes clusters, we have designed for horizontally scaling independent Kubernetes clusters using a cell-based architecture . Within each CSP region, we will be running many Kubernetes clusters. 
This design choice enables us to avoid Kubernetes scaling limits, and also serves as smaller fault domains in case a Kubernetes cluster fails. Push vs. pull One interesting debate we had was \"push vs. pull\". In particular, how should the global Control Plane communicate with individual Kubernetes clusters in the Data Plane? For example, when a new Elastic Cloud Serverless project is created and needs to be scheduled in a Kubernetes cluster in the Data Plane, should the global Control Plane push the configuration of that project down to a selected Kubernetes cluster, or should a Kubernetes cluster in the Data Plane watch for and pull that configuration from the global Control Plane? As always, there are tradeoffs in both approaches. We settled on the push model because: The scheduling logic is simpler as the global Control Plane solely chooses an appropriate Kubernetes cluster Dataflow will be uni-directional vs. dataflow must be bi-directional in the pull model Kubernetes clusters in the Data Plane can operate independently from the global Control Plane services Simplified operations and handling of failure scenarios there is no need to manage two Data Plane clusters in the same region competing for scheduling or rescheduling an application if the global Control Plane fails, we can manually interact with the Kubernetes API server in the target cluster to simulate the blessed path in the push model; however, simulating the blessed path in the pull model is not easily achievable when the global Control Plane being watched is unavailable Managed Kubernetes infrastructure Within each major CSP, we have elected to use their managed Kubernetes offerings (AWS EKS, Azure AKS, and Google GKE). This was an early decision for us to reduce the management burden of the clusters themselves. While the managed Kubernetes offerings meet that goal, they are otherwise barebones. We wanted to be more opinionated about the Kubernetes clusters that our engineers build on, reducing the burden on our engineering teams, and providing certain things for free. The (non-exhaustive) types of things that we provide out-of-the-box to our engineering teams are: guarantees around the configuration of and the services available on the clusters a managed network substrate a secure baseline, and compliance guarantees around internal policies and security standards managed observability—the logs and metrics of every component are automatically collected and shipped to centralized observability infrastructure, which is also based on the Elastic Stack capacity management Internally we call this wrapped infrastructure \"Managed Kubernetes Infrastructure\". It is a foundational building block for us, and enables our engineering teams to focus on building and operating the services that they create. Clusters are disposable An important architectural principle we took here is that our Kubernetes clusters are considered disposable. They are not the source of truth for any important data so we will never experience data loss on a Kubernetes disaster and they can be recreated at any time. This level of resiliency is important to safeguard our customer's data. This architectural principle will simplify the operability of our platform at our scale. Key scalability decisions in building Elastic Cloud Serverless Object store API calls As the saying goes, that's how they get you. 
We previously outlined the stateless architecture of Elasticsearch where we are using object stores (AWS S3, Azure Blob Storage, Google Cloud Storage) as a primary data store. At a high-level, the two primary cost dimensions when using a major CSP object store are storage, and API calls. The storage dimension is fairly obvious and easy to estimate. But left unchecked, the cost of object store API calls can quickly explode. With the object store serving as the primary data store, and the per-shard data structures such as the translog, this meant that every write to Elasticsearch would go to the object store, and therefore every write to a shard would incur at least one object store API call. For an Elasticsearch node holding many shards frequently receiving writes, the costs would add up very quickly. To address this, we evolved the translog writes to be performed per-node, where we coalesce the writes across per-shard translogs on a node, and flush them to the object store every 200ms. A related aspect is refreshes . In Elasticsearch, refreshes translate to writes to its backing data store, and in the stateless architecture, this means writes to the object store and therefore object store API calls. As some use cases expect a high refresh rate, for example, every second, these object store API calls would amplify quickly when an Elasticsearch node is receiving writes across many shards. This means we have to trade off between suboptimal UX and high costs. What is more, these refresh object store API calls are independent of the amount of data ingested in that one second period which means they're difficult to tie to perceived user value. We considered several ways to address this: an intermediate data store that doesn't have per-operation costs, that would sit between Elasticsearch and the object store decoupling refreshes from writes to the object store compounding into a single object refreshes across all shards on node We ultimately settled on decoupling refreshes from writes to the object store. Instead of a refresh triggering a write to the object store that the search nodes would read so they had access to the recently performed operations, the primary shard will push the refreshed data (segments) directly to the search nodes, and defer writing to the object store until a later time. There's no risk of data loss with this deferment because we still persist operations to the translog in the object store. While this deferment does increase recovery times, it comes with a two order of magnitude reduction in the number of refresh-triggered object store API calls. Autoscaling One major UX goal we had with Elastic Cloud Serverless was to remove the need for users to manage the size/capacity of their projects. While this level of control is a powerful knob for some users, we envisioned a simpler experience where Elastic Cloud Serverless would automatically respond to the demand of increased ingestion rates or querying over larger amounts of data. With the separation of storage and compute in the stateless Elasticsearch architecture, this is a much easier problem to solve than before as we can now manage the indexing and search resources independently. One early problem that we encountered was the need to have an autoscaler that can support both vertical and horizontal autoscaling, so that as more demand is placed on a project, we can both scale up to larger nodes, and scale out to more nodes. Additionally, we ran into scalability issues with the Kubernetes Horizontal Pod Autoscaler. 
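The translog coalescing described earlier in this section is, conceptually, a buffer that batches per-shard writes into a single per-node object-store upload on a fixed interval. The sketch below is only a toy illustration of that batching idea; it is not Elasticsearch's actual translog code, and every name in it is invented.

```python
import time
from collections import defaultdict

class CoalescingWriteBuffer:
    """Toy illustration: batch per-shard operations into one upload per interval."""

    def __init__(self, upload, interval=0.2):
        self.upload = upload                # callable that writes one object to the store
        self.interval = interval            # e.g. 200 ms
        self.pending = defaultdict(list)    # shard_id -> buffered operations
        self.last_flush = time.monotonic()

    def append(self, shard_id, operation):
        self.pending[shard_id].append(operation)
        if time.monotonic() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        if self.pending:
            # One object-store API call covering every shard's buffered writes,
            # instead of one call per shard per write.
            self.upload(dict(self.pending))
            self.pending.clear()
        self.last_flush = time.monotonic()

# Usage: "uploads" just print instead of hitting a real object store.
buf = CoalescingWriteBuffer(upload=lambda batch: print("upload", {k: len(v) for k, v in batch.items()}))
for i in range(1000):
    buf.append(shard_id=i % 8, operation={"op": "index", "doc": i})
buf.flush()
```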
To address this, we have built custom autoscaling controllers . These custom controllers obtain application-level metrics (specific to the workload being scaled, e.g., indexing vs. search), make autoscaling decisions, and push these decisions to the resource definitions in Kubernetes. These decisions are then acted upon to actually scale the application to the desired resource level. With this framework in place, we can independently add more tailored metrics (e.g., search query load metrics) and therefore intelligence to the autoscaling decisions. This will enable Elastic Cloud Serverless projects to iteratively respond more dynamically to user workloads over time. Conclusion These are only a few of the interesting architectural decisions we made along the journey of building Elastic Cloud Serverless. We believe this new platform gives us a foundation to rapidly deliver more functionality to our users over time, while being easier to operate, performant, scalable, and cost efficient. Stay tuned to several future articles where we will dive deeper into some of the above concepts. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Architecture of Elastic Cloud Serverless Key design decisions in building Elastic Cloud Serverless Scale Kubernetes horizontally Push vs. pull Managed Kubernetes infrastructure Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Building Elastic Cloud Serverless - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/building-elastic-cloud-serverless", + "meta_description": "Explore the architecture of Elastic Cloud Serverless and key design and scalability decisions we made along the way of building it." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elastic Jira connector tutorial part II: Optimization tips After connecting Jira to Elasticsearch, we'll now review best practices to escalate this deployment. Integrations Ingestion How To GL By: Gustavo Llermaly On January 16, 2025 Part of Series Jira connector tutorials Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In part I of this series , we configured the Elastic Jira connector and indexed objects into Elasticsearch. In this second part, we'll review some best practices and advanced configurations to escalate the connector. These practices complement the current documentation and are to be used during the indexing phase. Having a connector running was just the first step. When you want to index large amounts of data, every detail counts and there are many optimization points you can use when you index documents from Jira. Jira connector optimization points Index only the documents you'll need by applying advanced sync filters Index only the fields you'll use Refine mappings based on your needs Automate Document Level security Offload attachment extraction Monitor the connector's logs 1. Index only the documents you'll need by applying advanced sync filters By default, Jira sends all projects, issues, and attachments. If you're only interested in some of these or, for example, just issues \"In Progress\", we recommend not to index everything. There are three instances to filter documents before we put them in Elasticsearch: Remote : We can use a native Jira filter to get only what we need. This is the best option and you should try to use this option any time you can since with this, the documents don't even come out of the source before getting into Elasticsearch. We'll use advanced sync rules for this. Integration: If the source does not have a native filter to provide what we need, we can still filter at an integration level before ingesting into Elasticearch by using basic sync rules . Ingest Pipelines: The last option to handle data before indexing it is using Elasticsearch ingest pipelines . By using Painless scripts, we get great flexibility to filter or manipulate documents. The downside to this is that the data has already left the source and been through the connector, thus potentially putting a heavy load on the system and creating security issues. Let's do a quick review of the Jira issues: Note: We use \"exists\" query to only return the documents with the field we are filtering. You can see there are many issues in \"To Do\" that we don't need: To only get the issues \"In Progress\", we'll create an advanced sync rule using a JQL query ( Jira query language ): Go to the connector and click on the sync rules tab and then on Draft Rules . 
Once inside, go to Advanced Sync Rules and add this: Once the rule has been applied, run a Full Content Sync . This rule will exclude all issues that are not \"In Progress\". You can check by running the query again: Here's the new response: 2. Index only the fields you'll use Now that we have only the documents we want, you can see that we're still getting a lot of fields that we don't need. We can hide them when we run the query by using _source , but the best option is to simply not index them. To do so, we'll use the ingest pipelines . We can create a pipeline that drops all the fields we won't use. Let's say we only want this info from an issue: Assignee Title Status We can create a new ingest pipeline that only gets those fields by using the ingest pipelines' Content UI : Click on Copy and customize and then modify the pipeline called index-name@custom that should have just been created and empty. We can do it using Kibana DevTools console , running this command: Let's remove the fields that we don't need and also move the ones that we need to the root of the document. The remove processor with the keep parameter, will delete all the fields but the ones within the keep array from the document. We can check this is working by running a simulation. Add the content of one of the documents from the index: The response will be: This looks much better! Now, let's run a full content sync to apply the changes. 3. Refine mappings based on your needs The document is clean. However, we can optimize things more. We can go into “it depends” territory. Some mappings can work for your use case while others will not. The best way to find out is by experimenting. Let's say we tested and got to this mappings design: assignee : full text search and filters summary : full text search status : filters and sorting By default, the connector will create mappings using dynamic_templates that will configure all text fields for full-text search, filtering and sorting, which is a solid baseline but it can be optimized if we know what we want to do with our fields. This is the rule: Let's create different subfields for different purposes for all text fields. You can find additional information about the analyzers in the documentation . To use these mappings you must: Create the index before you create the connector When you create the connector, select that index instead of creating a new one Create the ingest pipeline to get the fields you want Run a Full Content Sync* *A Full Content Sync will send all documents to Elasticsearch. Incremental Sync will only send to Elasticsearch documents that changed after the last Incremental, or Full Content Sync. Both methods will fetch all the data from the data source. Our optimized mappings are below: For assignee, we kept the mappings as they are because we want this field to be optimized for both search and filters. For summary, we removed the “enum” keyword field because we don’t plan to filter on summaries. We mapped status as a keyword because we only plan to filter on that field. Note: If you're not sure how you will use your fields, the baselines analyzers should be fine. 4. Automate Document Level security In the first section, we learned to manually create API keys for a user and limit access based on it using Document Level Security (DLS) . 
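The `index-name@custom` pipeline described above can also be created programmatically instead of through the Content UI. Below is a minimal sketch using the Python Elasticsearch client; the pipeline id (`search-jira@custom`), the endpoint, and the nested Jira field paths are assumptions you would adjust to your own connector index:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://your-deployment.es.io:443", api_key="...")

# Keep only the fields the search experience actually uses and drop the rest.
es.ingest.put_pipeline(
    id="search-jira@custom",   # "<index-name>@custom", created by "Copy and customize"
    processors=[
        # Copy the nested Jira fields to the root of the document (paths are illustrative).
        {"set": {"field": "assignee", "copy_from": "issue.fields.assignee.displayName", "ignore_empty_value": True}},
        {"set": {"field": "summary", "copy_from": "issue.fields.summary", "ignore_empty_value": True}},
        {"set": {"field": "status", "copy_from": "issue.fields.status.name", "ignore_empty_value": True}},
        # Drop everything except the fields we kept.
        {"remove": {"keep": ["assignee", "summary", "status"]}},
    ],
)
```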
However, if you want to automatically create an API Key with permissions every time a user visits our site, you need to create a script that takes the request, generates an API Key using the user ID and then uses it to search in Elasticsearch. Here's a reference file in Python: You can call this create_api_key function on each API request to generate an API Key the user can use to query Elasticsearch in the subsequent requests. You can set expiration, and also arbitrary metadata in case you want to register some info about the user or the API that generated the key. 5. Offload attachment extraction For content extraction, like extracting text from PDF and Powerpoint files, Elastic provides an out of the box service that works fine but has a size limitation. By default, the extraction service of the native connectors supports 10MB max per attachment. If you have bigger attachments like a PDF with big images inside or you want to host the extraction service, Elastic offers a tool that lets you deploy your own extraction service. This option is only compatible with Connector Clients, so if you're using a Native connector you will need to convert it to a connector client and host it in your own infrastructure. Follow these steps to do it: a. Configure custom extraction service and run it with Docker EXTRACTION_SERVICE_VERSION you should use 0.3.x for Elasticsearch 8.15 b. Configure yaml con extraction service custom and run Go to the connector client and add the following to the config.yml file to use the extraction service: c. Follow steps to run connector client After configuring you can run the connector client with the connector you want to use. You can refer to the full process in the docs . 6. Monitor Connector's logs It's important to have visibility of the connector's logs in case there's an issue and Elastic offers this out of the box. The first step is to activate logging in the cluster. The recommendation is to send logs to an additional cluster (Monitoring deployment), but in a development environment, you can send the logs to the same cluster where you're indexing documents too. By default, the connector will send the logs to the elastic-cloud-logs-8 index. If you're using Cloud, you can check the logs in the new Logs Explorer : Conclusion In this article, we learned different strategies to consider when we take the next step in using a connector in a production environment. Optimizing resources, automating security, and cluster monitoring are key mechanisms to properly run a large-scale system. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
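The per-user API key flow in section 4 can be sketched with the Python client as shown below. The index name, role name, and the `_allow_access_control` field used for the document-level security filter are assumptions here; adapt them to how access control documents are stored in your deployment.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://your-deployment.es.io:443", api_key="admin-api-key")

def create_api_key(user_email: str, expiration: str = "1h") -> dict:
    """Create a short-lived, DLS-restricted API key for one user (sketch only)."""
    return es.security.create_api_key(
        name=f"jira-search-{user_email}",
        expiration=expiration,
        role_descriptors={
            "jira-dls": {
                "indices": [
                    {
                        "names": ["search-jira"],
                        "privileges": ["read"],
                        # Document-level security: only documents whose access control
                        # list contains this user are visible through this key.
                        "query": {"terms": {"_allow_access_control": [user_email]}},
                    }
                ]
            }
        },
        # Arbitrary metadata, e.g. who the key was minted for and by which service.
        metadata={"user": user_email, "created_by": "search-frontend"},
    )

# The frontend then queries Elasticsearch with key["encoded"] on behalf of the user.
key = create_api_key("jane.doe@example.com")
```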
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Jira connector optimization points 1. Index only the documents you'll need by applying advanced sync filters 2. Index only the fields you'll use 3. Refine mappings based on your needs 4. Automate Document Level security Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elastic Jira connector tutorial part II: Optimization tips - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elastic-jira-connector-optimization", + "meta_description": "Discover best practices and advanced configuration tips to escalate the Elastic Jira connector." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Simplifying data lifecycle management for data streams Elasticsearch data streams can now be managed by a data stream property called lifecycle. Learn to set up and update a data stream lifecycle here. How To AD By: Andrei Dan On June 13, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Today, we’ll explore Elasticsearch’s new data management system for data streams: data stream lifecycle , available from version 8.14. With its straightforward and robust execution model, the data stream lifecycle lets you concentrate on the business-related aspects of your data's lifecycle, such as downsampling and retention. Behind the scenes, it automatically ensures that the Elasticsearch structures storing your data are efficiently managed. This blog explains the evolution of data lifecycle management in Elasticsearch, how to configure a data stream lifecycle, update the configured lifecycle, and migrate from ILM to the data stream lifecycle. Data lifecycle management evolution in Elasticsearch Since the 6.x Elasticsearch series, Index Lifecycle Management (ILM) has empowered users to maintain healthy indices and save costs by automatically migrating data between tiers . 
ILM takes care of indices based on their unique performance, resilience, and retention needs, whilst offering significant control over cost and defining an index's lifecycle in great detail. ILM is a very general solution that caters to a broad range of use cases, from time series indices and data streams to indices that store text content. The lifecycle definitions will be very different for all these use cases, and it gets even more divergent when we factor each individual deployment’s available hardware and data tiering resources. For this reason, ILM allows fully customisable lifecycle definitions, at the cost of complexity (precise rollover definitions; when to force merge, shrink, and (partially) mount indices). As we started working on our Serverless solution we got a chance to look at the lifecycle management through a new lens where our users could (and will) be shielded from Elasticsearch internal concepts like shards, allocations, or cluster topology. Even more, in Serverless we want to be able to change the internal Elasticsearch configuration as much as needed to maintain the best experience for our users. In this new context, we looked at the existing ILM solution which offers the users the internal Elasticsearch concepts as building blocks and decided we need a new solution to manage the lifecycle of data. We took the lessons learned from building and maintaining ILM at scale and created a simpler lifecycle management system for the future. This system is more specific and only applies to data streams . It's configured as a property directly on the data stream (similar to how an index setting belongs to an index), and we call it data stream lifecycle . It’s a built-in mechanism (continuing with the index settings analogy) that is always on and always reactive to the lifecycle needs of a data stream. By scoping the applicability to only data streams (i.e. data with a timestamp that’s rarely updated in place) we were able to eschew customizations in favor of ease-of-use and automatic defaults. Data stream lifecycles will automatically execute the data structure maintenance operations like rollover and force merge, and allow you to only deal with the business-related lifecycle functionality you should care about, like downsampling and data retention . A data stream lifecycle is not as feature-rich as ILM; most notably it doesn’t currently support data tiering , shrinking, or searchable snapshots . However, the use cases that do not need these particular features will be better served by data stream lifecycles. Though data stream lifecycles were originally designed for the needs of the Serverless environment, they are also available in regular on-premise and ESS Elasticsearch deployments. Configuring data stream lifecycle Let’s create an Elasticsearch Serverless project and get started with creating a data stream managed by data stream lifecycle. Once the project is created, go to Index Management and create an index template for the my-data-* index pattern and configure a retention of 30 days: Let’s navigate through the steps and finalize this index template (I’ve configured one text field in the mapping section, but that’s optional): We’ll now ingest some data that’ll target the my-data-stream namespace. I’ll use the Dev Tools section on the left hand side, but you can your preferred way of ingesting data : my-data-stream has now been created and it contains 2 documents. Let’s go to Index Management/Data Streams and check it out: And that’s it! 
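For readers who prefer the API over the Index Management UI, the template and ingestion steps above could look roughly like the sketch below. This assumes Elasticsearch 8.14+ and the Python client; the template name and the message field are illustrative, while the my-data-* pattern and the 30-day retention come from the walkthrough.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Index template for the my-data-* pattern: data stream enabled, one text field,
# and a data stream lifecycle with 30 days of retention.
client.indices.put_index_template(
    name="my-data-template",
    index_patterns=["my-data-*"],
    data_stream={},
    template={
        "mappings": {"properties": {"message": {"type": "text"}}},
        "lifecycle": {"data_retention": "30d"},
    },
)

# Ingest a couple of documents; the data stream is created on the first write.
client.index(index="my-data-stream", document={"@timestamp": "2024-05-09T10:00:00Z", "message": "hello"})
client.index(index="my-data-stream", document={"@timestamp": "2024-05-09T11:00:00Z", "message": "world"})
```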
🎉 Our data stream is managed by the data stream lifecycle, and retention for the data is configured to 30 days. All new data streams that match the my-data-* pattern will be managed by the data stream lifecycle and receive a 30-day data retention. Updating the configured lifecycle The data stream lifecycle property belongs to the data stream, so updating the lifecycle for an existing data stream is done by navigating to the data stream directly. Let’s go to Index Management/Data Streams and edit the retention for my-data-stream to 7 days: We now see our data stream has a data retention of 7 days: Now that the existing data stream in the system has the desired 7-day retention period configured, let’s also update the index template retention so that new data streams that get created also receive the 7-day retention period: Implementation details The master node periodically (every 5 minutes by default, according to the data_streams.lifecycle.poll_interval setting) iterates over the data streams in the system that are configured to be managed by the lifecycle. On every iteration, each backing index state in the system is evaluated and one operation is executed towards achieving the target state described by the configured lifecycle. For each managed data stream we first attempt to roll over the data stream according to the cluster.lifecycle.default.rollover conditions. This is the only operation attempted for the write index of a data stream. After rolling over, the former write index becomes eligible for merging. As we wanted the shard-merging maintenance task to be executed automatically, we implemented a lighter merging operation, an alternative to force merging to 1 segment, that only merges the long tail of small segments instead of the entire shard. The main benefit of this approach is that it can be applied automatically and early after rollover. Once a backing index has been merged, on the next lifecycle execution run, the index will be downsampled. After completing all the scheduled downsample rounds, each time the lifecycle runs, the backing index will be examined for eligibility for data retention. When the specified data retention period lapses (counted since rollover time), the backing index will be deleted. Both downsampling and data retention are time-based operations (e.g. data_retention: 7d ) and are calculated from the time the index was rolled over. The time since an index has been rolled over is visible in the explain lifecycle API; we call it generation_time and it represents the time since a backing index became a generational index (as opposed to being the write index of a data stream). I’ve run the explain lifecycle API for my-data-stream (which has 2 backing indices, as it was rolled over) to get some insights into its current state. We can see the lifecycle definition for both indices includes the updated data retention of 7 days. The older index, .ds-my-data-stream-2024.05.09-000001, is no longer the write index of the data stream, and we can see the explain API reports its generation_time as 49 minutes. Once the generation time reaches 7 days, the .ds-my-data-stream-2024.05.09-000001 backing index will be deleted to conform with the configured data retention. Index .ds-my-data-stream-2024.05.09-000002 is the write index of the data stream and is waiting to be rolled over once it meets the rollover criteria .
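The retention update and the explain call discussed above can also be made through the API. The sketch below uses the Python client method names as they exist in the 8.14-era client (worth double-checking against your client version); the data stream name follows the walkthrough.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Reduce retention on the existing data stream to 7 days
# (PUT _data_stream/my-data-stream/_lifecycle under the hood).
client.indices.put_data_lifecycle(name="my-data-stream", data_retention="7d")

# Inspect how the lifecycle sees each backing index
# (GET <index>/_lifecycle/explain under the hood).
explanation = client.indices.explain_data_lifecycle(index=".ds-my-data-stream-*")
for index_name, details in explanation["indices"].items():
    # generation_time is only present for indices that are no longer the write index.
    print(index_name, details.get("generation_time"), details.get("lifecycle"))
```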
The time_since_index_creation field is meant to help calculating when to rollover the data stream according to an automatic max_age criteria when the data stream is not receiving a lot of data anymore. Migrating from ILM to data stream lifecycle Facilitating a smooth transition to data stream lifecycle for testing, experimenting, and eventually production migration of data streams was always a goal for this feature. For this reason, we decided to allow ILM and data stream lifecycle to co-exist on a data stream in cloud environments and on premise deployments. The ILM configuration continues to exist directly on the backing indices whilst the data stream lifecycle is configured on the data stream itself. A backing index is managed by only one management system at a time. If both ILM and data stream lifecycle are applicable for a backing index, ILM takes precedence (by default, but the precedence can be changed to data stream lifecycle using the index.lifecycle.prefer_ilm index setting). The migration path for a data stream will allow the existing ILM-managed backing indices to age out and eventually get deleted by ILM, whilst the new backing indices will start being managed by data stream lifecycle. We’ve enhanced the GET _data_stream API to include rollover information for each backing index (a managed_by field with Index Lifecycle Management , Data stream lifecycle , or Unmanaged as possible values, and the value of the prefer_ilm setting) and at the data stream level a next_generation_managed_by field to indicate the system that’ll manage the next generation backing index. To configure the future backing indices (created after data stream rollover) to be managed by data stream lifecycle two steps need to be executed: Update the index template that’s backing the data stream to set prefer_ilm to false (note that prefer_ilm is an index setting so configuring it in the index template means it’ll only be configured on the new backing indices) and configure the desired data stream lifecycle (this will make sure the new data streams will start being managed by data stream lifecycle). Configure the data stream lifecycle for the existing data streams using the lifecycle API . For a complete tutorial on migrating to data stream lifecycle check out our documentation . Conclusion We’ve built a lifecycle functionality for data streams that handles the underlying data structures maintenance automatically and lets you focus on the business lifecycle needs like downsampling and data retention. Try out our new Serverless offering and learn more about the possibilities of data stream lifecycle. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
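As a rough sketch of the two migration steps listed above, again with the Python client: the prefer_ilm setting, the lifecycle configuration, and the two-step order come from the article, while the template name, pattern, and retention value are illustrative.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Step 1: make future backing indices prefer the data stream lifecycle over ILM
# and carry the desired lifecycle, by updating the index template.
client.indices.put_index_template(
    name="my-data-template",
    index_patterns=["my-data-*"],
    data_stream={},
    template={
        "settings": {"index.lifecycle.prefer_ilm": False},
        "lifecycle": {"data_retention": "7d"},
    },
)

# Step 2: configure the lifecycle on the data streams that already exist,
# so they are picked up once their ILM-managed backing indices age out.
client.indices.put_data_lifecycle(name="my-data-stream", data_retention="7d")
```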
KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Data lifecycle management evolution in Elasticsearch Configuring data stream lifecycle Updating the configured lifecycle Implementation details Migrating from ILM to data stream lifecycle Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Simplifying data lifecycle management for data streams - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/data-lifecycle-simplified-for-data-streams", + "meta_description": "Elasticsearch data streams can now be managed by a data stream property called lifecycle. Learn to set up and update a data stream lifecycle here." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Indexing OneLake data into Elasticsearch - Part 1 Learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. Integrations Ingestion How To GL By: Gustavo Llermaly On January 23, 2025 Part of Series Indexing OneLake data into Elasticsearch Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. OneLake is a tool that allows you to connect to different Microsoft data sources like Power BI , Data Activator, and Data factory, among others. It enables centralization of data in DataLakes, large-volume repositories that support comprehensive data storage, analysis, and processing. In this article, we’ll learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. Sometimes you would want to run searches across unstructured data, and structured from different sources and software providers, and create visualizations with Kibana. For this kind of task indexing the documents in Elasticsearch as a central repository becomes extremely helpful. For this example, we’ll use a fake company called Shoestic, an online shoe store. We have the list of products in a structured file (CSV) while some of the products’ datasheets are in an unstructured format (DOCX). The files are stored in OneLake. You can find a Notebook with the complete example (including test documents) here . 
Steps OneLake initial configuration Connect to OneLake using Python Indexing documents Queries OneLake initial configuration OneLake architecture can be summarized like this: To use OneLake and Microsoft Fabric, we’ll need an Office 365 account. If you don’t have one, you can create a trial account here . Log into Microsoft Fabric using your account. Then, create a workspace called \"ShoesticWorkspace\". Once inside the newly created workspace, create a Lakehouse and name it \"ShoesticDatalake\". The last step will be creating a new folder inside “Files”. Click on “new subfolder” and name it \"ProductsData\". Done! We're ready to begin ingesting our data. Connect to OneLake using Python With our OneLake configured, we can now prepare the Python scripts. Azure has libraries to handle credentials and communicate with OneLake. Installing dependencies Run the following in the terminal to install dependencies The \"azure-identity azure-storage-file-datalake\" library lets us interact with OneLake while \"azure-cli\" access credentials and grant permissions. To read the files’ content to later index it to Elasticsearch, we use python-docx. Saving Microsoft Credentials in our local environment We’ll use \"az login\" to enter our Microsoft account and run: The flag \" --allow-no-subscriptions\" allows us to authenticate to Microsoft Azure without an active subscription. This command will open a browser window in which you’ll have to access your account and then select your account’s subscription number. We’re now ready to start writing the code! Create a file called onelake.py and add the following: _onelake.py_ Uploading files to OneLake In this example, we’ll use a CSV file and some .docx files with info about our shoe store products. Though you can upload them using the UI, we’ll do it with Python. Download the files here . We’ll place the files in a folder /data next to a new python script called upload_files.py : Run the upload script The result should be: Now that we have the files ready, let’s start analyzing and searching our data with Elasticsearch! Indexing documents We’ll be using ELSER as the embedding provider for our vector database so we can run semantic queries. We choose ELSER because it is optimized for Elasticsearch, outperforming most of the competition in out-of-domain retrieval , which means using the model as it is, without fine tuning it for your own data. Configuring ELSER Start by creating the inference endpoint: While loading the model in the background, you can get a 502 Bad Gateway error if you haven’t used ELSER before. In Kibana, you can check the model status at Machine Learning > Trained Models . Wait until the model is deployed before proceeding to the next steps. Index data Now, since we have both structured and unstructured data, we’ll use two different indices with different mappings as well in the Kibana DevTools Console . For our structured sales let’s create the following index: And to index our unstructured data (product datasheets) we'll use: Note: It’s important to use a field with copy_to to also allow running full-text and not just semantic searches on the body field. Reading OneLake files Before we begin, we need to initialize our Elasticsearch client using these commands (with your own Cloud ID and API-key ). Create a python script called indexing.py and add the following lines: Now, run the script: Queries Once the documents have been indexed in Elasticsearch, we can test the semantic queries. 
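Stepping back for a moment to the "Connect to OneLake using Python" step above: the onelake.py helper itself is not reproduced in this text, but a minimal sketch of the connection might look like the following. It assumes OneLake's ADLS Gen2-compatible endpoint and the azure-identity and azure-storage-file-datalake packages; the workspace, Lakehouse, and folder names come from the article, while the exact path layout and variable names are assumptions.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# OneLake exposes an ADLS Gen2-style endpoint; the workspace acts as the file
# system and the Lakehouse shows up as a top-level directory inside it.
ONELAKE_URL = "https://onelake.dfs.fabric.microsoft.com"
WORKSPACE = "ShoesticWorkspace"
LAKEHOUSE_PATH = "ShoesticDatalake.Lakehouse/Files/ProductsData"

credential = DefaultAzureCredential()  # picks up the session created by `az login`
service = DataLakeServiceClient(account_url=ONELAKE_URL, credential=credential)
file_system = service.get_file_system_client(WORKSPACE)

# List and download the product files uploaded earlier.
for path in file_system.get_paths(path=LAKEHOUSE_PATH):
    if path.is_directory:
        continue
    file_client = file_system.get_file_client(path.name)
    content = file_client.download_file().readall()
    print(path.name, len(content), "bytes")
```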
In this case, we’ll search for a unique term in some of the products (tag). We’ll run a keyword search against the structured data, and a semantic one against the unstructured data. 1. Keyword search Result: 2. Semantic search: *We excluded embeddings and chunks just for readability. Result: As you can see, when using the keyword search we got an exact match to one of the tags; in contrast, when we used semantic search, we got a result that matches the meaning in the description, without needing an exact match. Conclusion OneLake makes it easier to consume data from different Microsoft sources, and indexing those documents into Elasticsearch allows us to use advanced search tools on top of them. In this first part, we learnt how to connect to OneLake and index documents in Elasticsearch. In part two, we’ll build a more robust solution using the Elastic connector framework. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps OneLake initial configuration Connect to OneLake using Python Installing dependencies Saving Microsoft Credentials in our local environment Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Indexing OneLake data into Elasticsearch - Part 1 - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/onelake-ingesting-data-part-1", + "meta_description": "Learn to configure OneLake, consume data using Python and index documents into Elasticsearch to then run semantic searches."
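The keyword and semantic queries compared in the OneLake article above might look roughly like the sketch below with the Python client. The index names, field names, and search term are hypothetical (the article's actual mappings are not reproduced in this text), and the semantic query assumes a semantic_text field backed by the ELSER inference endpoint on Elasticsearch 8.15+.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch(cloud_id="YOUR_CLOUD_ID", api_key="YOUR_API_KEY")

# 1. Keyword search against the structured products index (exact tag match).
keyword_hits = client.search(
    index="products",
    query={"term": {"tags": "waterproof"}},
)

# 2. Semantic search against the unstructured datasheets index, assuming a
# semantic_text field populated through the ELSER endpoint.
semantic_hits = client.search(
    index="products-datasheets",
    query={"semantic": {"field": "semantic_body", "query": "waterproof"}},
)

for hit in semantic_hits["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```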
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Building AI Agents with AI SDK and Elastic Do you keep hearing about AI agents, and aren't quite sure what they are or how to build one in TypeScript (or JavaScript)? Join me as I dive into what AI agents are, the possible use cases they can be used for, along with an example Travel Planner Agent built using AI SDK and Elasticsearch. How To CR By: Carly Richmond On March 25, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Do you keep hearing about AI agents, and aren't quite sure what they are or how they connect to Elastic? Here I dive into AI Agents, specifically covering: What is an AI agent? What problems can be solved using AI Agents? An example agent for travel planning, available here in GitHub, using AI SDK , Typescript and Elasticsearch. What is an AI Agent? An AI agent is a software that is able to perform tasks autonomously and take actions on behalf of a human leveraging Artificial Intelligence. It achieves this by combining one or more LLMs with tools (or functions) that you define to perform particular actions. Example actions in these tools could be: Extracting information from databases, sensors, APIs or search engines such as Elasticsearch. Performing complex calculations whose results can be summarized by the LLM. Making key decisions based on various data inputs quickly. Raising necessary alerts and feedback based on the response. What can be done with them? AI Agents could be leveraged for many different use cases in numerous domains based on the type of agent you build. Possible examples include: A utility-based agent to evaluate actions and make recommendations to maximize the gain, such as to suggest films and series to watch based on a person's prior watching history. Model-based agents that make real-time decisions based on input from sensors, such as self-driving cars or automated vacuum cleaners. Learning agents that combine data and machine learning to identify patterns and exceptions in cases such as fraud detection. Utility agents that recommend investment decisions based on a person's risk market and existing portfolio to maximize their return. With my former finance hat on this could expedite such decisions if accuracy, reputational risk and regulatory factors are carefully weighted. Simple chatbots, as seen today, that can access our account information and answer basic questions using language. Example: Travel Planner To better understand what these agents can do, and how to build one using familiar web technologies, let's walk through a simple example of a travel planner written using AI SDK , Typescript and Elasticsearch. Architecture Our example comprises of 5 distinct elements: A tool, named weatherTool that pulls weather data for the location specified by the questioner from Weather API . A fcdoTool tool that provides the current travel status of the destination from the GOV.UK Content API . The flight information is pulled from Elasticsearch using a simple query in tool flightTool . All of the above information is then passed to LLM GPT-4 Turbo . Model choice When building your first AI agent, it can be difficult to figure out which is the right model to use. 
Resources such as the Hugging Face Open LLM Leaderboard are a good start. But for tool usage guidance you can also check out the Berkeley Function-Calling Leaderboard . In our case, AI SDK specifically recommends using models with strong tool calling capabilities such as gpt-4 or gpt-4-turbo in their Prompt Engineering documentation . Selecting the wrong model, as I found at the start of this project, can lead to the LLM not calling multiple tools in the way you expect, or even compatibility errors as you see below: Prerequisites To run this example, please ensure the prerequisites in the repository README are actioned. Basic Chat Assistant The simplest AI agent that you can create with AI SDK will generate a response from the LLM without any additional grounding context. AI SDK supports numerous JavaScript frameworks as outlined in their documentation. However the AI SDK UI library documentation lists varied support for React, Svelte, Vue.js and SolidJS, with many of the tutorials targeting Next.js. For this reason, our example is written with Next.js. The basic anatomy of any AI SDK chatbot uses the useChat hook to handle requests from the backend route, by default /api/chat/ : The page.tsx file contains our client-side implementation in the Chat component, including the submission, loading and error handling capabilities exposed by the useChat hook . The loading and error handling functionality are optional, but recommended to provide an indication of the state of the request. Agents can take considerable time to respond when compared to simple REST calls, meaning that it's important to keep a user updated on state and prevent key mashing and repeated calls. Because of the client interactivity of this component, I use the use client directive to make sure the component is considered part of the client bundle: The Chat component will maintain the user input via the input property exposed by the hook, and will send the response to the appropriate route on submission. I have used the default handleSubmit method, which will invoke the /ai/chat/ POST route. The handler for this route, located in /ai/chat/route.ts , initializes the connection to the gpt-4-turbo LLM using the OpenAI provider : Note that the above implementation will pull the API key from the environment variable OPENAI_API_KEY by default. If you need to customize the configuration of the openai provider, use the createOpenAI method to override the provider settings. With the above routes, a little help from Showdown to format the GPT Markdown output as HTML, and a bit of CSS magic in the globals.css file, we end up with a simple responsive UI that will generate an itinerary based on the user prompt: Adding tools Adding tools to AI agents is basically creating custom functions that the LLM can use to enhance the response it generates. At this stage I shall add 3 new tools that the LLM can choose to use in generation of an itinerary, as shown in the below diagram: Weather tool While the generated itinerary is a great start, we may want to add additional information that the LLM was not trained on, such as weather. This leads us to write our first tool that can be used not only as input to the LLM, but additional data that allows us to adapt the UI. The created weather tool, for which the full code is shown below, takes a single parameter location that the LLM will pull from the user input. 
The schema attribute accepts the parameter object using the TypeScript schema validation library Zod and ensures that the correct parameter types are passed. The description attribute allows you to define what the tool does to help the LLM decide if it wants to invoke the tool. You may have guessed that the execute attribute is where we define an asynchronous function with our desired tool logic. Specifically, the location to send to the weather API is passed to our tool function. The response is then transformed into a single JSON object that can be shown on the UI, and also used to generate the itinerary. Given we are only running a single tool at this stage, we don't need to consider sequential or parallel flows. It's simply the case of adding the tools property to the streamText method that handles the LLM output in the original api/chat route: The tool output is provided alongside the messages, which allows us to provide a more complete experience for the user. Each message contains a parts attribute that contains type and state properties. Where these properties are of value tool-invocation and result respectively, we can pull the returned results from the toolInvocation attribute and show them as we wish. The page.tsx source is changed to show the weather summary alongside the generated itinerary: The above will provide the following output to the user: FCO tool The power of AI agents is that the LLM can choose to trigger multiple tools to source relevant information when generating the response. Let's say we want to check the travel guidance for the destination country. A new tool fcdoGuidance , as per the below code, can trigger an API call to the GOV.UK Content API : You will notice that the format is very similar to the weather tool discussed previously. Indeed, to include the tool into the LLM output it's just a case of adding to the tools property and amending the prompt in the /api/chat route: Once the components showing the output for the tool are added to the page, the output for a country where travel is not advised should look something like this: LLMs that support tool calling have the choice not to call a tool unless it feels the need. With gpt-4-turbo both of our tools are being called in parallel. However, prior attempts using llama3.1 would result in a single model being called depending on the input. Flight information tool RAG, or Retrieval Augmented Generation, refers to software architectures where documents from a search engine or database is passed as the context to the LLM to ground the response to the provided set of documents. This architecture allows the LLM to generate a more accurate response based on data it has not been trained on previously. While Agentic RAG processes the documents using a defined set of tools, or alongside vector or hybrid search, it's also possible to utilize RAG as part of a complex flow with traditional lexical search as we do here. To pass the flight information alongside the other tools to the LLM, a final tool flightTool pulls outbound and inbound flights using the provided source and destination from Elasticsearch using the Elasticsearch JavaScript client : This example makes use of the Multi search API to pull the outbound and inbound flights in separate searches, before pulling out the documents using the extractFlights utility method. 
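The flightTool in the article is written in TypeScript with the Elasticsearch JavaScript client; as a language-agnostic illustration of the Multi search call it performs, here is a rough Python equivalent. The index name, field names, and result shape are assumptions for illustration, not the repository's actual schema, and it assumes a recent elasticsearch-py client that accepts the searches parameter.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

def find_flights(origin: str, destination: str) -> dict:
    """Fetch outbound and inbound flight candidates in one round-trip using msearch."""
    responses = client.msearch(
        searches=[
            {"index": "flights"},
            {"query": {"bool": {"filter": [
                {"term": {"origin": origin}},
                {"term": {"destination": destination}},
            ]}}, "size": 5},
            {"index": "flights"},
            {"query": {"bool": {"filter": [
                {"term": {"origin": destination}},
                {"term": {"destination": origin}},
            ]}}, "size": 5},
        ]
    )
    outbound, inbound = responses["responses"]

    def extract(result: dict) -> list:
        return [hit["_source"] for hit in result["hits"]["hits"]]

    return {"outbound": extract(outbound), "inbound": extract(inbound)}
```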
To use the tool output, we need to amend our prompt and tool collection once more in /ai/chat/route.ts : With the final prompt, all 3 tools will be called to generate an itinerary including flight options: Summary If you weren't 100% confident about what AI agents are, now you do! We've covered that using a simple travel planner example using AI SDK, Typescript and Elasticsearch. It would be possible to extend our planner to add other sources, allow the user to book the trip along with tours, or even generate image banners based on the location (for which support in AI SDK is currently experimental ). If you haven't dived into the code yet, check it out here ! Resources AI SDK Core Documentation AI SDK Core > Tool Calling Elasticsearch JavaScript Client Travel Planner AI Agent | GitHub Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is an AI Agent? What can be done with them? Example: Travel Planner Architecture Model choice Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Building AI Agents with AI SDK and Elastic - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ai-agents-ai-sdk-elasticsearch", + "meta_description": "Do you keep hearing about AI agents, and aren't quite sure what they are or how to build one in TypeScript (or JavaScript)? Join me as I dive into what AI agents are, the possible use cases they can be used for, along with an example Travel Planner Agent built using AI SDK and Elasticsearch." 
+ }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open Inference API adds support for Cohere’s Rerank 3 model “Learn about Cohere reranking, how to use Cohere's Rerank 3 model with the Elasticsearch open inference API and Elastic's roadmap for semantic reranking.” Integrations Vector Database Generative AI How To SC MH By: Serena Chou and Max Hniebergall On April 11, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Cohere's Rerank 3 model rerank-english-v3.0 is now available in their Rerank endpoint . As the only vector database included in Cohere’s Rerank 3 launch, Elasticsearch has integrated seamless support for this new model into our open Inference API. So briefly, what is reranking? Rerankers take the ‘top n’ search results from existing vector search and keyword search systems, and provide a semantic boost to those results. With good reranking in place, you have better ‘top n’ results without requiring you to change your model or your data indexes – ultimately providing better search results you can send to large language models (LLMs) as context. Recently, we collaborated with the Cohere team to make it easy for Elasticsearch developers to use Cohere’s embeddings (available in Elasticsearch 8.13 and Serverless!). It is a natural evolution to include Cohere’s incredible reranking capabilities to unlock all of the tools necessary for true refinement of results past the first-stage of retrieval. Cohere’s Rerank 3 model can be added to any existing Elasticsearch retrieval flow without requiring any significant code changes. Given Elastic’s vector database and hybrid search capabilities, users can also bring embeddings from any 3rd party model to Elastic, to use with Rerank 3. Elastic’s approach to hybrid search When looking to implement RAG (Retrieval Augmented Generation), the strategy for retrieval and reranking is a key optimization for customers to ground LLMs and achieve accurate results. Customers have trusted Elastic for years with their private data, and are able to leverage several first-stage retrieval algorithms (e.g. for BM25/keyword, dense, and sparse vector retrieval). More importantly, most real-world search use cases benefit from hybrid search which we have supported since Elasticsearch 8.9 . For mid-stage reranking, we also offer native support for Learning To Rank and query rescore . In this walkthrough, we will focus on Cohere’s last stage reranking capabilities, and will cover Elastic’s mid stage reranking capabilities in a subsequent blog post! Cohere’s approach to reranking Cohere has seen phenomenal results with their new Rerank model. In the testing, Cohere is reporting that reranking models in particular benefit from long context. Chunking for model token limits is a necessary constraint when preparing your document for dense vector retrieval. But with Cohere’s approach for reranking, a considerable benefit to reranking can be seen based on context contained in the full document, rather than a specific chunk within the document. Rerank has a 4k token limit to enable the input of more context to unlock the full relevance benefits of incorporating this model into your Elasticsearch based search system. 
(i) General retrieval based on BEIR benchmark; accuracy measured as nDCG@10 (ii) Code retrieval based on 6 common code benchmarks; accuracy measured as nDCG@10 (iii) Long context retrieval based on 7 common benchmarks; accuracy measured as nDCG@10 (iv) Semi-structured (JSON) retrieval based on 4 common benchmarks; accuracy measured as nDCG@10 If you’re interested in how to chunk with LangChain and LlamaIndex , we provide chat application reference code, integrations and more in Search Labs and our open source repository . Alternatively, you can leverage Elastic’s passage retrieval capabilities and chunk with ingest pipelines . Building a RAG implementation with Elasticsearch and Cohere Now that you have a general understanding of how these capabilities can be leveraged, let’s jump into an example on building a RAG implementation with Elasticsearch and Cohere. You'll need a Cohere account and some working knowledge of the Cohere Rerank endpoint . If you’re intending to use Cohere’s newest generative model Command R+ familiarize yourself with the Chat endpoint . In Kibana , you'll have access to a console for you to input these next steps in Elasticsearch even without an IDE set up. If you prefer to use a language client - you can revisit these steps in the provided guide . Elasticsearch vector database In an earlier announcement, we had some steps to get you started with the Elasticsearch vector database. You can review the steps to cover ingesting a sample books catalog, and generate embeddings using Cohere’s Embed capabilities by reading the announcement . Alternatively, if you prefer we also provide a tutorial and Jupyter notebook to get you started on this process. Cohere reranking The following section assumes that you’ve ingested data and have issued your first search. This will give you a baseline as to how the search results are ranked with your first dense vector retrieval. The previous announcement concluded with a query issued against the sample books catalog, and, and generated the following results in response to the query string “Snow”. These results are returned in descending order of relevance. You’ll next want to configure an inference endpoint for Cohere Rerank by specifying the Rerank 3 model and API key. Once this inference endpoint is specified, you’ll now be able to rerank your results by passing in the original query used for retrieval, “Snow” along with the documents we just retrieved with the kNN search. Remember, you can repeat this with any hybrid search query as well! To demonstrate this while still using the dev console, we’ll do a little cleanup on the JSON response above. Take the hits from the JSON response and form the following JSON for the input , and then POST to the cohere_rerank endpoint we just configured. And there you have it, your results have been reranked using Cohere's Rerank 3 model. The books corpus that we used to illustrate these capabilities does not contain large passages, and is a relatively simple example. When instrumenting this for your own search experience, we recommend that you follow Cohere’s approach to populate your input with the context from the full documents returned from the first retrieved result set, not just a retrieved chunk within the documents. Elasticsearch’s accelerated roadmap to semantic reranking and retrievers In upcoming versions of Elasticsearch we will continue to build seamless support for mid and final stage rerankers. 
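Before moving on to the roadmap, here is a rough sketch of the inference endpoint configuration and rerank call described above. It issues the same REST requests shown in the Kibana console via the Python client's low-level perform_request helper; the endpoint name cohere_rerank, the model id, and the "Snow" query follow the article, while the book snippets and top_n value are illustrative.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch(cloud_id="YOUR_CLOUD_ID", api_key="YOUR_ELASTIC_API_KEY")
JSON_HEADERS = {"accept": "application/json", "content-type": "application/json"}

# Create the inference endpoint backed by Cohere's Rerank 3 model.
client.perform_request(
    "PUT",
    "/_inference/rerank/cohere_rerank",
    headers=JSON_HEADERS,
    body={
        "service": "cohere",
        "service_settings": {
            "api_key": "YOUR_COHERE_API_KEY",
            "model_id": "rerank-english-v3.0",
        },
        "task_settings": {"top_n": 10, "return_documents": True},
    },
)

# Rerank the documents retrieved by the first-stage kNN search for "Snow".
reranked = client.perform_request(
    "POST",
    "/_inference/rerank/cohere_rerank",
    headers=JSON_HEADERS,
    body={
        "query": "Snow",
        "input": [
            "Snow Crash: a near-future story set in a virtual metaverse.",
            "The Snow Queen: a fairy tale about a journey across a frozen land.",
            "Moby-Dick: the saga of Captain Ahab and the white whale.",
        ],
    },
)
print(reranked)
```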
Our end goal is to enable developers to have the ability to use semantic reranking to improve the results from any search whether it is BM25, dense or sparse vector retrieval, or a combination with hybrid retrieval. To provide this experience, we are building a concept called retrievers into the query DSL. Retrievers will provide an intuitive way to execute semantic reranking, and will also enable direct execution of what you’ve configured in the open inference API in the Elasticsearch stack without relying on you to execute this in your application logic. When incorporating the use of retrievers in the earlier dense vector example, this is how different the reranking experience can be: (i) Elastic’s roadmap: The indexing step is simplified with the addition of Elastic’s future capabilities to automatically chunk indexed data (ii) Elastic’s roadmap: The kNN retriever specifies the model (in this case Cohere’s Rerank 3) that was configured as an inference endpoint (iii) Cohere’s roadmap: The step between sending the resulting data to Cohere’s Command R+ will benefit from a planned feature named extractive snippets which will enable the user to return a relevant chunk of the reranked document to the Command R+ model This was our original kNN dense vector search executed on the books corpus to return the first set of results for “Snow”. As explained in this blog, there are a few steps to retrieve the documents and pass on the correct response to the inference endpoint. At the time of this publication, this logic should be handled in your application code. In the future, retrievers can be configured to use the Cohere rerank inference endpoint directly within a single API call. In this case, the kNN query is exactly the same as my original, but the cleansing of the response before input to the rerank endpoint will no longer be a necessary step. A retriever will know that a kNN query has been executed and seamlessly rerank using the Cohere rerank inference endpoint specified in the configuration. This same principle can be applied to any search, BM25, dense, sparse and hybrid. Retrievers as an enabler of great semantic reranking is on our active and near term roadmap. Cohere’s generative model capabilities Now you’re ready with a semantically reranked set of documents that can be used to ground the responses for the large language model of your choice! We recommend Cohere’s newest generative model Command R+ . When building the full RAG pipeline, in your application code you can easily issue a command to Cohere’s Chat API with the user query and the reranked documents. An example of how this might be achieved in your Python application code can be seen below: This integration with Cohere is offered in Serverless and soon will be available to try in a versioned Elasticsearch release either on Elastic Cloud or on your laptop or self-managed environment. We recommend you use our Elastic Python client v0.2.0 against your Serverless project to get started! Happy reranking! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
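The Python application code referenced just before the conclusion above is not reproduced in this text; a minimal sketch of grounding Command R+ on the reranked documents, using the Cohere Python SDK's Chat API roughly as it looked around the Command R+ launch, could be the following. The message and document fields are illustrative, and the SDK surface may differ in newer releases.

```python
import cohere

co = cohere.Client(api_key="YOUR_COHERE_API_KEY")

# Ground Command R+ on the documents that Elasticsearch retrieved and the
# Rerank endpoint reordered; title and snippet values are illustrative.
response = co.chat(
    model="command-r-plus",
    message="Recommend a book about snow and explain why.",
    documents=[
        {"title": "Snow Crash", "snippet": "A near-future novel set partly in a virtual metaverse."},
        {"title": "The Snow Queen", "snippet": "A fairy tale about a quest across a frozen land."},
    ],
)
print(response.text)
```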
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Elastic’s approach to hybrid search Cohere’s approach to reranking Building a RAG implementation with Elasticsearch and Cohere Elasticsearch vector database Cohere reranking Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Elasticsearch open Inference API adds support for Cohere’s Rerank 3 model - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank", + "meta_description": "“Learn about Cohere reranking, how to use Cohere's Rerank 3 model with the Elasticsearch open inference API and Elastic's roadmap for semantic reranking.”" + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Faster integration tests with real Elasticsearch Learn how to make your automated integration tests for Elasticsearch faster using various techniques for data initialization and performance improvement. How To PP By: Piotr Przybyl On November 13, 2024 Part of Series Integration tests using Elasticsearch Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In part 1 of this series , we explored how writing integration tests that allow us to test our software against real Elasticsearch isn't rocket science. This post will demonstrate various techniques for data initialization and performance improvements. Integration tests: Different purposes, different characteristics Once the testing infrastructure is set up, and the project is already using an integration test framework for at least one test (like we use Testcontainers in our demo project ), adding more tests becomes easy because it doesn't require mocking. 
For example, if you need to verify that the number of books fetched for the year 1776 is correct, all you have to do is add a test like this: This is all that's needed, provided the dataset used to initialize Elasticsearch already contains the relevant data. The cost of creating such tests is low, and maintaining them is nearly effortless (as it mostly involves updating the Docker image version). No software exists in a vacuum Today, every piece of software we write is connected to other systems. While tests using mocks are excellent for verifying the behavior of the system we're building, integration tests give us confidence that the entire solution works as expected and will continue to do so. This can make it tempting to add more and more integration tests. Integration tests have their costs However, integration tests aren't free. Due to their nature — going beyond in-memory-only setups — they tend to be slower, costing us execution time. It's crucial to balance the benefits (confidence from integration tests) with the costs (test execution time and billing, often translating directly to an invoice from cloud vendors). Instead of limiting the number of tests because they are slow, we can make them run faster. This way, we can maintain the same execution time while adding more tests. The rest of this post will focus on how to achieve that. Let's revisit the example we've been using so far, as it is painfully slow and needs optimization. For this and subsequent experiments, I assume the Elasticsearch Docker image is already pulled, so it won't impact the time. Also, note that this is not a proper benchmark but more of a general guideline. Tests with Elasticsearch can also benefit from performance improvements Elasticsearch is often chosen as a search solution because it performs well. Developers usually take great care to ensure production code is optimized, especially in critical paths. However, tests are often seen as less critical, leading to slow tests that few people want to run. But it doesn't have to be this way. With some simple technical tweaks and a shift in approach, integration tests can run much faster. Let's start with the current integration test suite. It functions as intended, but with only three tests, it takes five and a half minutes — roughly 90 seconds per test — when you run time ./mvnw test '-Dtest=*IntTest*' . Please note your results may vary depending on your hardware, internet speed, etc. Batch me if you can In integration test suites, many performance issues stem from inefficient data initialization. While certain processes may be natural or acceptable in a production flow (such as data arriving as users enter it), these processes may not be optimal for tests, where we need to import data quickly and in bulk. The dataset in our example is about 50 MiB and contains nearly 81,000 valid records. If we process and index each record individually, we end up making 81,000 requests just to prepare data for each test. Instead of a naive loop that indexes documents one by one (as in the main branch): We should use a batch approach , such as with BulkIngester . This allows concurrent indexing requests, with each request sending multiple documents, significantly reducing the number of requests: This simple change reduced the overall testing time to around 3 minutes and 40 seconds, or roughly 73 seconds per test. While this is a nice improvement, we can push further. Stay local We reduced test duration in the previous step by limiting network round-trips. 
Could there be more network calls we can eliminate without altering the tests themselves? Let's review the current situation: Before each test, we fetch test data from a remote location repeatedly. While the data is fetched, we send it to the Elasticsearch container in bulk. We can improve performance by keeping the data as close to the Elasticsearch container as possible. And what's closer than inside the container itself? One way to import data into Elasticsearch en masse is the _bulk REST API, which we can call using curl . This method allows us to send a large payload (e.g., from a file) written in newline-delimited JSON format. The format looks like this: Make sure the last line is empty. In our case, the file might look like this: Ideally, we can store this test data in a file and include it in the repository, for example, in src/test/resources/ . If that's not feasible, we can generate the file from the original data using a simple script or program. For an example, take a look at CSV2JSONConverter.java in the demo repository. Once we have such a file locally (so we have eliminated network calls to obtain the data), we can tackle the other point, which is: copying the file from the machine where the tests are running into the container running Elasticsearch. It's easy, we can do that using a single method call, withCopyToContainer , when defining the container. So after the change it looks like this: The final step is making a request from within the container to send the data to Elasticsearch. We can do this with curl and the _bulk endpoint by running curl inside the container. While this could be done in the CLI with docker exec , in our @BeforeEach , it becomes elasticsearch.execInContainer , as shown here: Reading from the top, we're making this way a POST request to the _bulk endpoint (and wait for the refresh to complete), authenticated as the user elastic with the default password, accepting the auto-generated and self-signed certificate (which means we don't have to disable SSL/TLL), and the payload is the content of the /tmp/books.ndjson file, which was copied to the container when it was starting. This way, we reduce the need for frequent network calls. Assuming the books.ndjson file is already present on the machine running the tests, the overall duration is reduced to 58 seconds. Less (often) is more In the previous step, we reduced network-related delays in our tests. Now, let's address CPU usage. There's nothing wrong with relying on @Testcontainers and @Container annotations. The key, though, is to understand how they work: when you annotate an instance field with @Container , Testcontainers will start a fresh container for each test. Since container startup isn't free (it takes time and resources), we pay this cost for every test. Starting a fresh container for each test is necessary in some scenarios (e.g., when testing system start behavior), but not in our case. Instead of starting a new container for each test, we can keep the same container and Elasticsearch instance for all tests, as long as we properly reset the container's state before each test. First, make the container a static field. Next, before creating the books index (by defining the mapping) and populating it with documents, delete the existing index if it exists from a previous test. For this reason the setupDataInContainer() should start with something like: As you can see, we can use curl to execute almost any command from within the container. 
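As a reminder of the newline-delimited _bulk format described earlier, a books.ndjson payload could look roughly like this; the index name and fields are illustrative, each document is preceded by an action line, and the file must end with a trailing newline.

```ndjson
{ "index": { "_index": "books" } }
{ "title": "Pride and Prejudice", "author": "Jane Austen", "year": 1813 }
{ "index": { "_index": "books" } }
{ "title": "Frankenstein", "author": "Mary Shelley", "year": 1818 }
```

This is the file that gets copied into the container with withCopyToContainer and then posted to the _bulk endpoint with curl from inside the container, as described above.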
This approach offers two significant advantages: Speed : If the payload (like books.ndjson ) is already inside the container, we eliminate the need to repeatedly copy the same data, drastically improving execution time. Language Independence : Since curl commands aren't tied to the programming language of the tests, they are easier to understand and maintain, even for those who may be more familiar with other tech stacks. While using raw curl calls isn't ideal for production code, it's an effective solution for test setup. Especially when combined with a single container startup, this method reduced my tests execution time to around 30 seconds. It's also worth noting that in the demo project (branch data-init ), which currently has only three integration tests, roughly half of the total duration is spent on container startup. After the initial warm-up, individual tests take about 3 seconds each. Consequently, adding three more tests won't double the overall time to another 30 seconds, but will only increase it by roughly 9-10 seconds. The test execution times, including data initialization, can be observed in the IDE: Summary In this post, I demonstrated several improvements for integration tests using Elasticsearch: Integration tests can run faster without changing the tests themselves — just by rethinking data initialization and container lifecycle management. Elasticsearch should be started only once, rather than for every test. Data initialization is most efficient when the data is as close to Elasticsearch as possible and transmitted efficiently. Although reducing the test dataset size is an obvious optimization (and not covered here), it's sometimes impractical. Therefore, we focused on demonstrating technical methods instead. Overall, we significantly reduced the test suite's duration — from 5.5 minutes to around 30 seconds — lowering costs and speeding up the feedback loop. In the next post , we'll explore more advanced techniques to further reduce execution time in Elasticsearch integration tests. Let us know if your case is using one of the techniques described above or if you have questions on our Discuss forums and the community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. 
KB By: Kofi Bartlett Jump to Integration tests: Different purposes, different characteristics No software exists in a vacuum Integration tests have their costs Tests with Elasticsearch can also benefit from performance improvements Batch me if you can Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Faster integration tests with real Elasticsearch - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-integration-tests-faster", + "meta_description": "Learn how to make your automated Elasticsearch integration tests faster using various techniques for data initialization and performance improvement." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard This blog compares ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard. ML Research AP SC By: Aris Papadopoulos and Serena Chou On October 7, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard ELSER (Elastic Learned Sparse EncodeR) is Elastic’s transformer language model for semantic search and is a popular model for anyone interested in utilizing machine learning to elevate the relevance of a traditional search experience or to power a newly designed Retrieval Augmented Generation (RAG) application. ELSER v2 remains in the top-10 models on MTEB for Retrieval when grouping together the multiple flavors of the same competitor family. It is also one of the very few models in the top-10 that was released in 2023, with the majority of the competition having been released in 2024. ELSER timeline First introduced in June of 2023 , and with a second version made generally available in November 2023 , ELSER from day one has been designed to minimize the barrier to semantic search, while significantly improving search relevance, by capturing the context, semantic relationships and user intent in natural language. Among other use cases, this is an incredibly intuitive and valuable addition to RAG applications, as surfacing the most relevant results is critical for generative applications to produce accurate responses based on your own private data and to minimize the probability of hallucinations. ELSER can be used in tandem with the highly scalable, distributed Elasticsearch vector database, the open Inference API, native model management and the full power of the Search AI platform. 
ELSER is the component that provides the added value of state-of-the-art semantic search for a wide range of use cases and organizations. Because it is a sparse vector model (this will be explained further later in the blog), it is optimized for the Elasticsearch platform and it achieves superior relevance out of domain. When ELSER was first released, it outperformed the competition in out-of-domain retrieval, i.e. without you having to retrain/fine-tune a model on own data, as measured by the industry standard BEIR benchmark. This was a testament to Elastic’s commitment to democratize AI search. ELSER v2 was released in October 2023 and introduced significant performance gains on your preferred price point of operation by adding optimizations for Intel CPUs and by introducing token pruning . Because we know that the other equally important part of democratizing AI search is reducing its cost. As a result we provide two model artifacts: one optimized for Intel CPUs (leveraged by Elastic Cloud) and a cross-platform one. NDCG@10 for BEIR data sets for BM25 and ELSER V2 ELSER customer reception Customers worldwide leverage ELSER today in production search environments, as a testament to the ease of use and the immediate relevance boost that is achievable in a few clicks. Examples of ELSER customer success stories include Consensus , Georgia State University and more. When these customers test ELSER in pilots or initial prototypes, a common question is how does ELSER compare with relevance that can be achieved with traditional keyword (i.e.BM25) retrieval or with the use of a number of other models, including for example OpenAI’s text-embedding-ada-002. To provide the relevant comparison insights, we published a holistic evaluation of ELSER (the generally available version) on MTEB (v1.5.3). MTEB is a collection of tasks and datasets that have been carefully chosen to give a solid comparison framework between NLP models. It was introduced with the following motivation: “Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB).” ( source paper ). MTEB Leaderboard comparison - What you need to know For a meaningful comparison on MTEB, a number of considerations come into play. First, the number of parameters. The more parameters a model has, the greater its potential, but it also becomes far more resource intensive and costly. Models of similar size (number of parameters) are best for comparison, as models with vastly different numbers of parameters typically serve different purposes in a search architecture. Second, one of the aims of MTEB is to compare models and their variations on a number of different tasks. ELSER was designed specifically to lower the barrier for AI search by offering you state-of-the-art out-of-domain retrieval, so we will focus on the outcomes for the Retrieval task. Retrieval is measured with the ndcg@10 metric. Finally some models appear in multiple flavors, incorporating different numbers of parameters and other differentiations, forming a family. 
It makes more sense to group them together and compare against the top performer of the family for the task. ELSER on MTEB Leaderboard According to the above, filtering for the classes of up to 250 million parameters (ELSER has 110 million parameters), at the time of writing of this blog and as we are working on ELSER v3, ELSER v2 remains in the top-10 models for Retrieval when grouping together the multiple flavors of the same competitor family. It is also one of the very few models in the top-10 that was released in 2023, with the majority of the competition having been released in 2024. The top of the MTEB list for Retrieval (nDCG@10) for models with <250 million parameters. At the time of writing, ELSER ranks top-10 for the retrieval task. It is one of the very few models in the group that was released in 2023, with the vast majority released in 2024. The list, when filtered as mentioned inline, includes more then 80 models (not grouped) at the time of writing. Elastic’s continued investment in ELSER As mentioned previously, ELSER uses a contextualized sparse vector representation, a design choice that gives it the nice properties mentioned before and all the space for gains and feature extensions in future releases that are already in development. This sets it apart on MTEB, as the vast majority of models on the leaderboard are embeddings, i.e. dense vectors. This is why you will notice a much larger number of dimensions in the corresponding MTEB column for ELSER compared with the other models. ELSER extends BERT’s architecture and expands the output embeddings by retaining the masked language model (MLM) head and adapting it to create and aggregate per-token activation distributions for each input sequence. As a result, the number of dimensions is equal to BERT’s vocabulary, only a fraction of which get activated for a given input sequence. The upcoming ELSER v3 model is currently in development, being trained with the additional use of LLM-generated data, new advanced training recipes and other state-of-the-art and novel strategies as well as support for GPU inference. Conclusion The innovation in this space is outpacing many customers' ability to adopt, test and ensure enterprise quality incorporation of new models into their search applications. Many customers lack holistic insight into the metrics and methodology behind the training of the model artifacts, leading to additional delays in adoption. From the very first introduction of our ELSER model, we have provided transparency into our relevance goals, our evaluation approach for improved relevance and the investments into efficient performance of this model on local, self-managed deployments (even those hosted on laptops!) with capabilities to enable scale for large production grade search experiences. Our full results are now published on the MTEB Leaderboard to provide an additional baseline in comparison to new emerging models. In upcoming versions of ELSER we expect to apply new state of the art retrieval techniques, evaluate new use cases for the model itself, and provide additional infrastructure support for fast GPU powered ELSER inference workloads. Stay tuned! 
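For readers less familiar with the retrieval metric cited in the comparisons above, nDCG@10 in its common gain formulation is (a standard textbook definition, not anything specific to MTEB or ELSER):

\mathrm{DCG@10} = \sum_{i=1}^{10} \frac{2^{\mathrm{rel}_i} - 1}{\log_2(i+1)},
\qquad
\mathrm{nDCG@10} = \frac{\mathrm{DCG@10}}{\mathrm{IDCG@10}}

where rel_i is the graded relevance of the document at rank i and IDCG@10 is the DCG@10 of an ideally ordered result list, so the score lies between 0 and 1.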
Links https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-1 https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2 https://www.elastic.co/search-labs/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model Report an issue Related content How To July 26, 2024 Serverless semantic search with ELSER in Python: Exploring Summer Olympic games history This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will load previous olympic games data set and then use the ELSER model to perform semantic searches. EK By: Essodjolo Kahanam ML Research June 21, 2023 Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search Learn about the Elastic Learned Sparse Encoder (ELSER), an AI model for high relevance semantic search across domains. AP GG By: Aris Papadopoulos and Gilad Gal ML Research October 17, 2023 Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2 Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2. TV QH VK By: Thomas Veasey , Quentin Herreros and Valeriy Khakhutskyy Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Jump to Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard ELSER timeline ELSER customer reception MTEB Leaderboard comparison - What you need to know ELSER on MTEB Leaderboard Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-elser-relevance-mteb-comparison", + "meta_description": "Discover how Elastic ELSER ranks on the Hugging Face MTEB Leaderboard for retrieval relevance, with insights into the parameters shaping its performance." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to choose the best k and num_candidates for kNN search Learn strategies for selecting the optimal values for `k` and `num_candidates` parameters in kNN search, illustrated with practical examples. 
How To MK By: Madhusudhan Konda On May 24, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. How to choose the best k and num_candidates for kNN search? Vector search has emerged as a game-changer in the current generative AI/ML world. It allows us to find similar items based on their semantic meaning rather just exact keyword matches. Elasticsearch's k-Nearest Neighbors (kNN) algorithm is a foundational ML technique for classification and regression tasks. It found a significant place within Elasticsearch's ecosystem with the introduction of vector search capabilities. Introduced in Elasticsearch 8.5, kNN based vector search allows users to perform high-speed similarity searches on dense vector fields. Users can find documents in the index \"closest\" to a given vector by leveraging the kNN algorithm using an underlying specified distance metric such as Euclidean or Cosine similarity. This feature marked a pivotal advancement as it is particularly useful in applications requiring semantic search, recommendations and other use cases such as anomaly detection. The introduction of dense vector fields and k-nearest neighbor (kNN) search functionality in Elasticsearch has opened new horizons for implementing sophisticated search capabilities that go beyond traditional text search. This article delves into strategies for selecting the optimal values for k and num_candidates parameters, illustrated with practical examples using Kibana. kNN search query Elasticsearch provides a kNN search option for nearest-neighbors - something like the following: As the snippet shows, the knn query fetches the relevant results for the query in question (having a movie title as \"Good Ugly\") using vector search. The search is conducted in a multi-dimensional space, producing the closest vectors to the given query vector. From the above query, notice two attributes: num_candidates which is the initial pool of candidates to consider and k , the number of nearest neighbors. kNN critical parameters - k and num_candidates To leverage the kNN feature effectively, one requires a nuanced understanding of the two critical parameters: k - the number of global nearest neighbors to retrieve, and num_candidates - the number of candidate neighbors considered for each shard during the search. Choosing the optimal values for the k and num_candidates involves balancing precision, recall, and performance. These parameters play a crucial role to efficiently handle high-dimensional vector spaces commonly found in machine learning applications. The optimal value for k largely depends on the specific use case. For example, if you're building a recommendation system, a smaller k (e.g., 10-20) might be sufficient to provide relevant recommendations. In contrast, for a use case where you'd want clustering or outlier detection capabilities, you might need a larger k . Note that the higher k value can significantly increase both computation and memory usage, especially with large datasets. It's important to test different values of k to find a balance between result relevance and system resource usage. K: Unveiling the closest neighbors We have an option of choosing the k value as per our requirements. 
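The query snippet referenced at the start of this section did not survive extraction. Before going further into k, here is a minimal reconstruction of the shape of that top-level knn search option, shown with the Python client; the index, vector field and three-element query vector are placeholders (a real vector has the model's full dimensionality).

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

response = es.search(
    index='movies',
    knn={
        'field': 'title_vector.predicted_value',
        'query_vector': [0.12, -0.45, 0.33],  # placeholder vector for a query such as a movie title
        'k': 5,                # global nearest neighbors to return
        'num_candidates': 10,  # candidates considered on each shard
    },
    source=['title'],
)
for hit in response['hits']['hits']:
    print(hit['_score'], hit['_source']['title'])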
Sometimes, setting up a lower k value receives more or less exactly what you want with the exception that a few results might not make it to the final output. However, setting up a higher k value might broaden your search results in numbers, with a caveat that you may receive diversified results at times. Imagine you're searching for a new book in the vast library of recommendations. k , also known as the number of nearest neighbors, determines how many books you'll be presented with. Think of it as the inner circle of your search results. Let's see how setting the lower and higher k values affects the number of books that the query returns. Setting lower K The lower K setting prioritizes extreme precision - meaning we will receive a handful of books that are the most similar to our query vector. This ensures a high degree of relevance to our specific interests. This might be ideal if you're searching for a book with a very specific theme or writing style. Setting higher K With a larger K value, we will be fetching a broader exploration result set. Note that the results might not be as tightly focused on your exact query. However, you'll encounter a wider range of potentially interesting books. This approach can be valuable for diversifying your reading list and discovering unexpected gems, perhaps. Whenever we say higer or lower values of k , we mean the actual values depends on multiple factors, such as size of the data sets, available computing power and other factors. In some cases, the k=10 might be a large but in others it might be small too. So, do keep a note of the environmnet that this parateter is expected to operate. The num_candidates attribute: Behind the curtain While k determines the final number of books you see, num_candidates plays a crucial role under the hood. It essentially defines the search space per shard – the initial pool of books in a shard from which the most relevant K neighbors are identified. When we issue the query, we are expected to hint Elasticsearch to run the query amongst top \"x\" number of candidates on each shard. For example, say our books index contains 5000 books evenly distributed amongst five primary shards (i.e., ~1000 books per shard). When we are performing a search, obviously choosing all 1000 documents for each shard is neither a viable nor a correct option. Instead, we will be pick up to say 25 documents (which is our num_candidates ) from the 1000 documents. That amounts to 125 documents as our total search space (5 shards times 25 documents each). We will let the kNN query know to choose the 25 documents from each shard and this number is the num_candidates parameter. When the kNN search is executed, the \"coordinator\" node sends the request query to all of the involved shards. The num_candidates documents from each shard will constitute the search space and the top k documents will be fetched from that space. Say, if k is 3, the top 3 documents out of the 25 candidate documents will be selected in each shard and returned to the coordinator node. That is, the coordinator node will receive 15 documents in total from all the involved nodes. These top 15 documents are then ranked to fetch the global top 3 ( k ==3) documents. The process is depicted in the following figure: Here's what num_candidates means for your search: Setting the lower num_candidates This approach might restrict the search space, potentially missing some relevant books that fall outside the initial exploration set. 
Think of it as surveying a smaller portion of the library's shelves. Setting the higher num_candidates A higher num_candidates value increases the likelihood of finding the true nearest neighbors within our chosen K. It expands the search space - that is - more number of candidates are considered - and hence leads to a slight increase in search time. So, a higher value generally increases accuracy (as the chance of missing relevant vectors decreases) but at the cost of performance. Balancing precision & performance for kNN parameters The optimal values for k and num_candidates depend on a few factors and specific needs. If we prioritize extreme precision with a smaller set of highly relevant results, a lower k with a moderate num_candidates might be ideal. Conversely, if exploration and discovering unexpected books are your goals, a higher K with a larger num_candidates could be more suitable. While there is no hard-and-fast rule to define the \"lower\" or \"higher\" number for the num_candidates , you need to decide this number based on your dataset, computing power and the expected precision. Experimentation to optimize kNN parameters By experimenting with different K and num_candidates combinations and monitoring search results and performance, you can fine-tune your searches to achieve the perfect balance between precision, exploration, and speed. Remember, there's no one-size-fits-all solution – the best approach depends on your unique goals and data characteristics. Practical example: Using kNN for movie recommendations Let's consider an example of movies to create a manual \"simple\" framework for understanding the effect of k and num_candidates attributes while searching for movies. Manual framework Let's understand how we can develop a home grown framework for tweaking the k and num_of_candidates attributes for a kNN search. The mechanics of the framework is as follows: Create a movies index with a couple of dense_vector fields in the mapping to hold our vectorised data. Create an embedding pipeline so each and every movie's title and synopsis fields will be embedded with a multilingual-e5-small model to store vectors. Perform the indexing operation,which goes through the above embedding pipeline. The respective fields will be vectorised Create a search query using kNN feature Tweak the k and num_candidates options as you'd want Let's dig in. Creating an inference pipeline We will need to index data via Kibana - far from ideal - but it will do for this manual framework understanding. However, every movie that gets indexed must have the title and synopsis field vectorised to enable semantic search on our data. We can do this by elegantly creating a inference pipeline processor and attaching it to our batch indexing operation. Let's create an inference pipeline: The inference pipeline movie_embedding_pipeline , as shown above, creates vector fields text embedding for title and synopsis fields. It uses the inbuilt multilingual-e5-small model to create the text embeddings. Creating index mappings We will need to create a mapping with couple of properties as dense_vector fields. The following code snippet does the job: Once the above command gets executed, we have a new movies index with the appropriate dense vector fields, including title_vector.predicted_value and synopsis_vector.predicted_value fields that hold respective vectors. The index mapping parameter was set to false by default up to release 8.10. 
This has been changed in release 8.11, where the parameter is set to true by default, which makes it unnecessary to specify it. Next step is to ingest the data. Indexing movies We can use _bulk operation to index a set of movies - I'm reusing a dataset that I had created for my Elasticsearch in Action 2nd edition book - which is available here : For completeness, a snippet of the ingestion using the _bulk operation is provided here: Make sure you replace the script with the full dataset. Note that the _bulk operation is suffixed with the pipeline ( ?pipeline=movie_embedding_pipeline ) so the every movie gets passed through this pipeline, thus producing the vectors. As we primed our movies indexed with vector embeddings, it's time to start our experiments on fine tuning k and num_candidates attributes. kNN search As we have vector data in our movies index, we will be using approximate k-nearest neighbor (kNN) search. For example, to recommend movies similar that has father-son sentiment (\"Father and son\" as search query), we'll use a kNN search to find the nearest neighbors: In the given example, the query leverages the top-level kNN search option parameter that directly focuses on finding documents closest to a given query vector. One key difference between this search with knn query at the top level as opposed to query at the top level is that in the former case, the query vector will be generated on-the-fly by a machine learning model. The part in bold is not technically correct. On-the-fly vector generation is only achieved by using query_vector_builder instead of query_vector where you pass in the vector (computed outside of ES) but both the top-level knn search option and the knn search query provide this capability. The script fetches the relevant results based on our search query (which is built using the query_vector_builder block). We are using a random k and num_candidates values set to 5 and 10 respectively. kNN query attributes The above query has a set of attributes that would make up the kNN query. The following information about these attributes will help you understand the query better: The field attribute specifies the field in the index that contains the vector representations of our documents. In this case, title_vector.predicted_value is the field storing the document vectors. The query_vector_builder attribute is where the example significantly diverges from simpler kNN queries. Instead of providing a static query vector, this configuration dynamically generates a query vector using a text embedding model. The model transforms a piece of text (\"Father and son\" in the example) into a vector that represents its semantic meaning. The text_embedding indicates that a text embedding model will be used to generate the query vector. The model_id is the identifier for the pre-trained machine learning model to use, It is the .multilingual-e5-small model in this example. The model_text attribute is the text input that will be converted into a vector by the specified model. Here, it's the words \"Father and son\", which the model will interpret semantically to find similar movie titles. The k is the number of nearest neighbors to retrieve - that is, it determines how many of the most similar documents to return based on the query vector. The num_candidates attribute is the broader set of candidate documents per shard as potential matches to ensure the final results are as accurate as possible. 
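Putting the attributes described above together, the search in this example looks roughly like the following sketch. The field, model and parameter values are taken from the text; the Python client is used here in place of the Kibana console.

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

response = es.search(
    index='movies',
    knn={
        'field': 'title_vector.predicted_value',
        'query_vector_builder': {
            'text_embedding': {
                'model_id': '.multilingual-e5-small',
                'model_text': 'Father and son',
            }
        },
        'k': 5,
        'num_candidates': 10,
    },
    source=['title', 'synopsis'],
)
for hit in response['hits']['hits']:
    print(hit['_source']['title'])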
kNN results Executing the kNN basic search script should get us top 5 results - for brevity, I'm providing just the list of the movies. As you can expect, Godfather (both parts) are part of the father-and-son bonding while Pulp Fiction shouldn't have been part of the results (though the query is asking about \"bonding\" - Pulp Fiction is all about the bonding between few people). Now that we have a basic framework setup, we can tweak the parameters appropriate and deduce the approximate settings. Before we tweak the settings, let's understand the optimal setting of k attribute. Choosing optimal K value Choosing the optimal value of k in k-Nearest Neighbors (kNN) algorithms is crucial for attaining the best possible performance on our dataset with minimal errors. However, there isn't a one-size-fits-all answer, as the best k value can depend on a few factors such as specifics of our data and what we are trying to predict. To choose an optimal k value, one must create a custom framework with several strategies and considerations. k = 1: Try running the search query with k=1 as a first step. Make sure you change the input query for each run. The query should give you unreliable results as changing the input query will return incorrect results over time. This leads to a ML pattern called \"overfitting\" where the model becomes overly reliant on the specific data points in the immediate neighborhood. Model, thus, struggles to generalize to unseen examples. k = 5: Run the search query with k=5 and check the predictions. The stability of the search query should ideally improved and you should be getting adequate reliable predictions. You can either incrementally increase the value of k - may be increase in the steps of 5 or x - until you find that sweet spot where you'd find the results for the input queries are pretty much spot on with less number of errors. You can go to extreme values of k too, for example, pick a higher value of k=50 , as discussed below: k = 50: Increase the k value to 50 and check the search results. The errored results most likely outshine the actual/expected predictions. This is when you know that you are hitting the hard boundary of the k value. Larger k values leads to a ML feature called \"underfitting\" - a underfitting in KNN happens when the model is too simplistic and fails to capture the underlying patterns in the data. Choosing the optimal num_candidates value The num_candidates parameter plays a crucial role in finding the optimal balance between search accuracy and performance. Unlike k, which directly influences the number of search results returned, num_candidates determines the size of the initial candidate set from which the final k nearest neighbors are selected. As discussed earlier, the num_candidates parameter defines how many nearest neighbors will be selected on each shard. Adjusting this parameter is essential for ensuring that the search process is both efficient and yields high-quality results. num_candidates = Small Value (e.g., 10): Start with a low value (\"low-value-exploration\") for num_candidates as a preliminary step. The aim is to establish a baseline for performance at this stage. As the candidate bunch is just a handful of candidates, the search will be fast but might miss relevant results - which leads to poor accuracy. This scenario helps us to understand the minimum threshold where the search quality is noticeably compromised. 
num_candidates = Moderate Value (e.g., 25?): Increase the num_candidates to a moderate value (\"moderate-value-exploration\") and observe the changes in search quality and execution time. A moderate number of candidates is likely to improve the accuracy of the results by considering a wider pool of potential neighbors. As the number of candidates increased, there's going to be cost of resources, be mindful of that. So, keep monitoring the performance metrics closely. However, as the search accuracy increases, perhaps the increase in computational cost could be justifiable. num_candidates = Step Increase: Continue to incrementally increase num_candidates (incremental-increase-exploration), possibly in steps of 20 or 50 (depending on the size of your dataset). Evaluate whether the additional candidates contribute to a meaningful improvement in search accuracy with each of the increments. There will be a a point of diminishing returns where increasing num_candidates further yields little to no improvement in result quality. At the same time you may have noticed, this will strain our resources and significantly impacts performance. num_candidates = High Value (say, 1000, 5000): Experiment with a high value for num_candidates to understand the upper bounds of the impact of choosing the higher settings. There's a possibility of your search accuracy stabilizing or degrading slightly due to the inclusion of less relevant candidates. This may lead to dilute the precision of the final k results. Do note that, as we've been talking about it, the high values of num_candidates will always increase the computational load - thus longer query times and potential resource constraints. Finding the optimal balance We now know how to adjust the k and num_candidates attributes and how our experiments to different settings would change the outcome of search accuracy. The goal is to find a sweet spot where the search results are consistently accurate with lower performance overhead from processing a large candidate set is manageable. Of course, the optimal value will vary depending on the specifics of our data, the dimensionality of the vectors, and other performance requirements. Wrap up The optimal K value lies in finding the sweet spot by experiment and trials. You want to use enough neighbors (K being lower side) to capture the essential patterns but not so many ( k being on the higher side) that the model becomes overly influenced by noise or irrelevant details. You also want to tweak the candidates so that the search results are accurate at a given k value. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
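One way to run the kind of experiments described in this section is a small parameter sweep that records latency alongside the returned titles for each combination of k and num_candidates. The harness below is purely illustrative; the grids, index and field names are assumptions, and judging result quality is left to whatever relevance check you apply to the returned titles.

import time
from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

def run_knn(k, num_candidates, text):
    start = time.perf_counter()
    response = es.search(
        index='movies',
        knn={
            'field': 'title_vector.predicted_value',
            'query_vector_builder': {
                'text_embedding': {'model_id': '.multilingual-e5-small', 'model_text': text}
            },
            'k': k,
            'num_candidates': num_candidates,
        },
        source=['title'],
    )
    elapsed_ms = (time.perf_counter() - start) * 1000
    titles = [hit['_source']['title'] for hit in response['hits']['hits']]
    return elapsed_ms, titles

# Sweep a small grid and inspect (or score) the results for each combination.
for k in (1, 5, 10):
    for num_candidates in (10, 25, 100):
        ms, titles = run_knn(k, num_candidates, 'Father and son')
        print(f'k={k:>3} num_candidates={num_candidates:>4} {ms:6.1f} ms  {titles}')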
KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to How to choose the best kNN search query kNN critical parameters - k and num_candidates K: Unveiling the closest neighbors Setting lower K Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "How to choose the best k and num_candidates for kNN search - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-knn-and-num-candidates-strategies", + "meta_description": "Learn strategies for selecting the optimal values for `k` and `num_candidates` parameters in kNN search, illustrated with practical examples." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Using Ollama with the Inference API The Ollama API is compatible with the OpenAI API so it's very easy to integrate Ollama with Elasticsearch. Generative AI How To JR By: Jeffrey Rengifo On February 14, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, we'll learn how to connect local models to the Elasticsearch inference model using Ollama and then ask your documents questions using Playground. Elasticsearch allows users to connect to LLMs using the Open Inference API , supporting providers such as Amazon Bedrock, Cohere, Google AI, Azure AI Studio, HuggingFace - as a service, among others. Ollama is a tool that allows you to download and execute LLM models using your own infrastructure (your local machine/server). Here you can find a list of the available models that are compatible with Ollama. Ollama is a great option if you want to host and test different open source models without having to worry about the different ways each of the models could have to be set up, or about how to create an API to access the model functions as Ollama takes care of everything. Since the Ollama API is compatible with the OpenAI API, we can easily integrate the inference model and create a RAG application using Playground. Prerequisites Elasticsearch 8.17 Kibana 8.17 Python Steps Setting up Ollama LLM server Creating mappings Indexing data Asking questions using Playground Setting up Ollama LLM server We're going to set up a LLM server to connect it to our Playground instance using Ollama. We'll need to: Download and run Ollama. 
Use ngrok to access your local web server that hosts Ollama over the internet Download and run Ollama To use Ollama, we first need to download it . Ollama offers support for Linux, Windows, and macOS so just download the Ollama version compatible with your OS here. Once Ollama is installed, we can choose a model from this list of supported LLMs. In this example, we'll use the model llama3.2 , a general multilanguage model. In the setup process, you will enable the command line tool for Ollama. Once that’s downloaded you can run the following line: Which will output: Once installed, you can test it with this command: Let's ask a question: With the model running, Ollama enables an API that would run by default on port \"11434\". Let's make a request to that API, following the official documentation : This is the response we got: Note that the specific response for this endpoint is a streaming. Expose endpoint to the internet using ngrok Since our endpoint works in a local environment, it cannot be accessed from another point–like our Elastic Cloud instance–via the internet. ngrok allows us to expose a port offering a public IP. Create an account in ngrok and follow the official setup guide . Once the ngrok agent has been installed and configured, we can expose the port Ollama is using: Note: The header --host-header=\"localhost:11434\" guarantees that the \"Host\" header in the requests matches \"localhost:11434\" Executing this command will return a public link that will work as long as the ngrok and the Ollama server run locally. In \"Forwarding\" we can see that ngrok generated a URL. Save it for later. Let's try making an HTTP request to the endpoint again, now using the ngrok-generated URL: The response should be similar to the previous one. Creating mappings ELSER endpoint For this example, we'll create an inference endpoint using the Elasticsearch inference API . Additionally, we'll use ELSER to generate the embeddings. For this example, let's imagine that you have a pharmacy that sells two types of drugs: Drugs that require a prescription. Drugs that DO NOT require a prescription. This information would be included in the description field of each drug. The LLM must interpret this field, so this is the data mappings we'll use: The field text_description will store the plain text of the descriptions while semantic_field , which is a semantic_text field type, will store the embeddings generated by ELSER. The property copy_to will copy the content from the fields name and text_description into the semantic field so that the embeddings for those fields are generated. Indexing data Now, let's index the data using the _bulk API . Response: Asking questions using Playground Playground is a Kibana tool that allows you to quickly create a RAG system using Elasticsearch indexes and a LLM provider. You can read this article to learn more about it. Connecting the local LLM to Playground We first need to create a connector that uses the public URL we've just created. In Kibana, go to Search>Playground and then click on \"Connect to an LLM\". This action will reveal a menu on the left side of the Kibana interface. There, click on \"OpenAI\". We can now start configuring the OpenAI connector. Go to \"Connector settings\" and for the OpenAI provider, select \"Other (OpenAI Compatible Service)\": Now, let's configure the other fields. For this example, we'll name our model \"medicines-llm\". In the URL field, use the one generated by ngrok ( /v1/chat/completions ). 
On the \"Default model\" field, select \"llama3.2\". We won't use an API Key so just put any random text to proceed: Click on \"Save\" and add the index medicines by clicking on \"Add data sources\": Great! We now have access to Playground using the LLM we're running locally as RAG engine. Before testing it, let's add more specific instructions to the agent and up the number of documents sent to the model to 10, so that the answer has the most possible documents available. The context field will be semantic_field , which includes the name and description of the drugs, thanks to the copy_to property. Now let's ask the question: Can I buy Clonazepam without a prescription? and see what happens: As expected, we got the correct answer. Next steps The next step is to create your own application! Playground provides a code script in Python that you can run on your machine and customize it to meet your needs. For example, by putting it behind a FastAPI server to create a QA medicines chatbot consumed by your UI. You can find this code by clicking the View code button in the top right section of Playground: And you use the Endpoints & API keys to generate the ES_API_KEY environment variable required in the code. For this particular example the code is the following: To make it work with Ollama, you have to change the OpenAI client to connect to the Ollama server instead of the OpenAI server. You can find the full list of OpenAI examples and compatible endpoints here . And also change the model to llama3.2 when calling the completion method: Let’s add our question: Can I buy Clonazepam without a prescription? To the Elasticsearch query: And also to the completion call with a couple of prints, so we can confirm we are sending the Elasticsearch results as part of the question context: Now let’s run the command pip install -qU elasticsearch openai python main.py You should see something like this: Conclusion In this article, we can see the power and versatility of tools like Ollama when we use them together with the Elasticsearch inference API and Playground. After some simple steps, we had a working RAG application with a chat that used a LLM running in our own infrastructure at zero cost. This also allows us to have more control over resources and sensitive information, besides giving us access to a variety of models for different tasks. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
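The generated Playground script mentioned above is Python, and the change it calls for, pointing the OpenAI client at the Ollama server and switching the model, looks roughly like this. The base_url assumes the default local Ollama port; with ngrok, swap in the forwarding URL instead. Ollama ignores the API key, so any placeholder value works.

from openai import OpenAI

# OpenAI-compatible client pointed at the local Ollama server instead of api.openai.com.
client = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')

completion = client.chat.completions.create(
    model='llama3.2',
    messages=[
        {'role': 'system', 'content': 'Answer using only the provided context documents.'},
        {'role': 'user', 'content': 'Can I buy Clonazepam without a prescription?\n\nContext: ...'},
    ],
)
print(completion.choices[0].message.content)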
KB By: Kofi Bartlett Jump to Prerequisites Steps Setting up Ollama LLM server Download and run Ollama Expose endpoint to the internet using ngrok Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "Using Ollama with the Inference API - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/ollama-with-inference-api", + "meta_description": "Learn how to integrate Ollama with Elasticsearch using the Ollama API, which is compatible with the OpenAI API, making the integration easier." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Multilingual vector search with the E5 embedding model Here's how multilingual vector search works and how to use Elasticsearch with the multilingual E5 embedding model, including examples. Vector Database Python JD By: Josh Devins On September 12, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search has taken the search and information retrieval world by storm in recent years. It has the ability to match the semantics of queries with documents, to incorporate context and meaning of text, and provides users with natural language querying abilities like never before. Vector search is a great source of context for prompting large language models (LLMs), and it's powering more and more modern search experiences in the age of Generative AI. This blog goes over multilingual vector search, explains how it works, and how to use Elasticsearch with E5 embeddings models. It also showcases examples of multilingual search across languages. Why multilingual embeddings? When researchers first started working with and training embedding models for vector search, they used the most widely available datasets they could find. These datasets however tended to all be in the English language. Queries were in English, Wikipedia articles indexed were in English, and quickly the non-English speaking world took notice. Language-specific language models slowly started to pop up for languages like German, French, Chinese, Japanese and so on. However those models only worked within that language. With the power of embeddings, we also have the ability to train models which embed multiple languages into the same \"embedding space\", using a single model. You can think of an embedding space as a language agnostic, mathematical representation (dense vector) of the concepts that sentences (queries or passages) represent where embeddings close to each other in the embedding space have similar semantic meaning. 
Since we can embed text, images and audio into an embedding space, why not embed multiple languages into the same embedding space? This is the idea behind multilingual embedding models. With aligned training datasets — datasets containing similar sentences in different languages — it's possible to make the model learn not the translation of words between languages, but the relationships and meaning underlying each sentence irrespective of language. This is a true cross-lingual model, capable of working with pairs of text in any of the languages it was trained on. Now let's see how to use these aligned multilingual models. Let's consider a few examples For this exercise, we'll map sentences from English and German into the same part of the embedding space, when they have the same underlying meaning. Let's say I have the following sentences that I'd like to index and search over. For the non-German speakers out there, we've provided the direct English translation of the German sentences. 😉 id=doc1, language=en, passage=\"I sat on the bank of the river today.\" id=doc2, language=de, passage=\"Ich bin heute zum Flussufer gegangen.\" (English: \"I walked to the riverside today.\") id=doc3, language=en, passage=\"I walked to the bank today to deposit money.\" id=doc4, language=de, passage=\"Ich saß heute bei der Bank und wartete auf mein Geld.\" (English: \"I sat at the bank today waiting for my money.\") In the example queries that follow, we show how multilingual embeddings can overcome some of the challenges that traditional lexical retrieval faces for multilingual search . Typically we talk about vector search overcoming the limitations of lexical search's semantic mismatch and vocabulary mismatch. Semantic mismatch is the case where the tokens (words) we use in the query have the same form as in the indexed documents, but different meanings. For example the \"bank\" of a river doesn't have the same meaning as a \"bank\" that holds money. With vocabulary mismatch, we're faced with the tokens being different, but the underlying concept or meaning is similar to meaning represented in the document. We may search for \"ATM\" which doesn't appear in any document, but is closely related to a \"bank that holds money\". In addition to these two improvements over lexical search, multilingual (cross-lingual) embeddings add language independence, allowing query and passage to be in different languages. For a deeper look into how vector search works and how it fits with traditional lexical search, have a look at this webinar: How vector databases power AI search . Let's try a few search examples now and see how this works. Example 1 Query : \"riverside\" (German: \"Flussufer\") Results : id=doc1, language=en, passage=\"I sat on the bank of the river today.\" id=doc2, language=de, passage=\"Ich bin heute zum Flussufer gegangen.\" (English: \"I walked to the riverside today.\") In this example, the translation of \"riverside\" is \"Flussufer\" in German. The semantic meaning however matches the English phrase \"bank of the river\", as well as the German keyword \"Flussufer\", so we match on both documents. Example 2 Query : \"Geldautomat\" (English: \"ATM\") Results : id=doc4, language=de, passage=\"Ich saß heute bei der Bank und wartete auf mein Geld.\" (English: \"I sat at the bank today waiting for my money.\") id=doc3, language=en, passage=\"I walked to the bank today to deposit money.\" In this example, the translation of \"Geldautomat\" is \"ATM\" in English. 
Neither \"Geldautomat\" nor \"ATM\" appear as keywords in any of the documents, however the semantic meaning is close to both the English phrase \"bank … money\", and the German phrase \"Bank … Geld\". In this case, the context matters and the query is referring to the kind of bank that holds money, and not the bank of a river, so we match only on the documents that refer to that kind of \"bank\", but we do so across languages based on the semantic meaning and not on keywords. Example 3a Query : \"movement\" Results : id=doc3, language=en, passage=\"I walked to the bank today to deposit money.\" id=doc2, language=de, passage=\"Ich bin heute zum Flussufer gegangen.\" (English: \"I walked to the riverside today.\") In this example, we're searching for the kind of motion represented in the text. We're interested in motion or walking and not sitting or being stationary in one place. As such, the closest documents are represented by the German word \"gegangen\" (English: \"have gone to\") and the English word \"walked\". Example 3b Query : \"stillness\" Results : id=doc4, language=de, passage=\"Ich saß heute bei der Bank und wartete auf mein Geld.\" (English: \"I sat at the bank today waiting for my money.\") id=doc1, language=en, passage=\"I sat on the bank of the river today.\" If we invert the query from Example 3a and look for \"stillness\" or lack of movement, we get the \"opposite\" results. Multilingual E5 embedding model In December 2022, Microsoft released a new general-purpose embedding model called E5, or E mb E ddings from bidir E ctional E ncoder r E presentations. (I know, naming things is hard.) This model was trained on a special, English-only curated dataset called CCPairs, and introduced a few new methods to their training process. The model quickly shot to the top of numerous benchmarks, and after the success of that model, they set their sights on non-English. In addition to embedding models for English, Microsoft later trained a variant of their E5 models on multilingual text, using a variety of multilingual datasets, but with the same overall process as their English counterparts. This showed that their training process was a large part of what helped produce such good English language embeddings, and this success was transferred to multilingual embeddings. In some English-only benchmarks, the multilingual embeddings are even better than other embeddings trained only on English datasets! For those interested, check out the MTEB retrieval benchmark for more details. As has become common practice for embedding models, the E5 family comes in three sizes, allowing users to make tradeoff decisions between effectiveness and efficiency for their particular use-case and budgets. Effectiveness of embeddings refers to how good they are at a task, as measured on a specific dataset. For semantic search this is a retrieval task and is measured using a search relevance metric like nDCG@10 or MRR@10. Efficiency of embeddings and embedding models is influenced by: How many dimensions the vectors are that the model produces, which impacts the storage needs (on disk and in memory) and how fast are they to search for. How large the embedding model is (number of parameters), which impacts the inference latency or the time it takes to create the embeddings at both ingest and search time. Below we can see the three multilingual E5 models and their characteristics, with effectiveness measured on a multilingual benchmark Mr. TyDi (see, naming is hard). 
For a baseline and as a comparison, we've included the BM25 (lexical search) effectiveness scores on Mr. TyDi, as reported by the E5 authors . Effectiveness: Avg. MRR@10 Efficiency: dimensions Efficiency: parameters BM25 33.3 n/a n/a multilingual-e5-small 64.4 384 118M multilingual-e5-base 65.9 768 278M multilingual-e5-large 70.5 1024 560M Elasticsearch for multilingual vector search with E5 Elasticsearch enables you to generate, store, and search vector embeddings. We've seen an introduction to multilingual embeddings in general, and we know a little bit about E5. Let's take a look at how to actually wire all this together into a search experience with Elasticsearch. This blog has an accompanying notebook which shows all the code in detail with the examples above, using Elasticsearch end-to-end. Here's a quick outline of what's required: Create an Elastic Cloud deployment with one ML node of size 8GB or larger (or use any Elasticsearch cluster with ML nodes) Setup the multilingual-e5-base embedding model in Elasticsearch to embed text at ingest via an inference processor Create an index and ingest documents into an ANN index for approximate kNN search Query the ANN index using a query_vector_builder Let's have a look now at a few code snippets from the notebook for each step. Setup With an Elastic Cloud cluster created or another Elasticsearch cluster ready, we can upload the embedding model using the eland library. Now that the model has been uploaded to the cluster and is ready for inference, we can create the ingest pipeline which contains an inference processor to perform the embedding of the text field of our choosing. When using Enterprise Search features such as the web crawler , you can manage ingest pipelines through the Kibana UI as well. Indexing For the simple examples above, we use just a very simple index mapping, but hopefully it gives you an idea of what your mapping might look like too. With an index created from the above mapping, we're ready to ingest documents. You can use whatever ingest method you'd like, as long as the ingest pipeline that we created at the beginning is referenced (or set as default for your index). Note that as with other embedding models, E5 does have a token limit (512 tokens or about 400 words) so longer text will need to be chunked into individual passages — for example with LangChain or another tool — before being ingested. Here's what our example documents look like. Search The documents have been indexed and embeddings created, so we're ready to search! And that's it! With the above steps, and the complete code from the notebook , you can build yourself a multilingual semantic search experience completely within Elasticsearch. Note of caution: E5 models were trained with instructions prefixed to text before embedding it. This means that when you want to embed text for semantic search, you must prefix the query with \"query: \" and indexed passages with \"passage: \". For further details and other use-cases requiring different prefixing, please refer to the FAQ in the multilingual-e5-base model card . Conclusion In this blog and the accompanying notebook , we've shown how multilingual vector search works, and how to use Elasticsearch with E5 embeddings models. We've motivated this by showing examples of multilingual search across languages, but in fact the same E5 embedding model can be used within a single language as well. 
For example if you have just a German corpus of text, you can freely use the same model and the same approach to search that corpus with just German queries. It's all the same model, and the same embedding space in the end! Try out the notebook, and be sure to spin up a Cloud cluster of your own to try multilingual semantic search with E5 on the language and dataset of your choice. If you have any questions or want to discuss multilingual semantic search, join us and the entire Elastic community in our discussion forum.",
It is not possible to increase the primary shard number of an existing index, meaning an index must be recreated if you want to increase the primary shard count. There are 2 methods that are generally used in these situations: the _reindex API and the _split API. The _split API is often a faster method than the _reindex API. Indexing must be stopped before both operations, otherwise, the source_index and target_index document counts will differ. Method 1 – using the split API The split API is used to create a new index with the desired number of primary shards by copying the settings and mapping an existing index. The desired number of primary shards can be set during creation. The following settings should be checked before implementing the split API: The source index must be read-only. This means that the indexing process needs to be stopped. The number of primary shards in the target index must be a multiple of the number of primary shards in the source index. For example, if the source index has 5 primary shards, the target index primary shards can be set to 10,15,20, and so on. Note: If only the primary shard number needs to be changed, the split API is preferred as it is much faster than the Reindex API. Implementing the split API Create a test index: The source index must be read-only in order to be split: Settings and mappings will be copied automatically from the source index: You can check the progress with: Since settings and mappings are copied from the source indices, the target index is read-only. Let’s enable the write operation for the target index: Check the source and target index docs.count before deleting the original index: Index name and alias name can’t be the same. You need to delete the source index and add the source index name as an alias to the target index: After adding the test_split_source alias to the test_split_target index, you should test it with: Method 2 – using the reindex API By creating a new index with the Reindex API, any number of primary shard counts can be given. After creating a new index with the intended number of primary shards, all data in the source index can be re-indexed to this new index. In addition to the split API features, the data can be manipulated using the ingest_pipeline in the reindex AP. With the ingest pipeline, only the specified fields that fit the filter will be indexed into the target index using the query. The data content can be changed using a painless script, and multiple indices can be merged into a single index. Implementing the reindex API Create a test reindex: Copy the settings and mappings from the source index: Create a target index with settings, mappings, and the desired shard count: *Note: setting number_of_replicas: 0 and refresh_interval: -1 will increase reindexing speed. Start the reindex process. Setting requests_per_second=-1 and slices=auto will tune the reindex speed. You will see the task_id when you run the reindex API. Copy that and check with _tasks API: Update the settings after reindexing has finished: Check the source and target index docs.count before deleting the original index, it should be the same: The index name and alias name can’t be the same. Delete the source index and add the source index name as an alias to the target index: After adding the test_split_source alias to the test_split_target index, test it using: Summary If you want to increase the primary shard count of an existing index, you need to recreate the settings and mappings to a new index. 
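To make the recap above concrete, here is a rough Python-client sketch of the split flow described earlier. The index names follow the article; the shard counts are illustrative (a source with 5 primaries can be split to 10, 15, 20, and so on), and indexing must already be stopped.

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

# 1. Make the source index read-only for the duration of the split.
es.indices.add_block(index='test_split_source', block='write')

# 2. Split into a target whose primary shard count is a multiple of the source's
#    (here: 5 primaries -> 10 primaries).
es.indices.split(
    index='test_split_source',
    target='test_split_target',
    settings={'index.number_of_shards': 10},
)

# 3. The write block is carried over, so re-enable writes on the target.
es.indices.put_settings(index='test_split_target', settings={'index.blocks.write': False})

# 4. Compare document counts, then drop the source and keep its name as an alias.
print(es.count(index='test_split_source')['count'], es.count(index='test_split_target')['count'])
es.indices.delete(index='test_split_source')
es.indices.put_alias(index='test_split_target', name='test_split_source')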
There are 2 primary methods for doing so: the reindex API and the split API. Active indexing must be stopped before using either method.",
Amazon Nova main models Amazon Nova Micro: Focused exclusively on text, this is a fast and cost-effective model, ideal for translation, reasoning, code completion and solving mathematical problems. Its generation exceeds 200 tokens per second, making it ideal for applications that require instant responses. Amazon Nova Lite: A low-cost multimodal model capable of quickly processing images, videos and texts. It stands out for its speed and accuracy, being indicated for interactive and high-volume applications where cost is a relevant factor. Amazon Nova Pro: The most advanced option, combining high accuracy, speed and cost efficiency. Ideal for complex tasks such as video summarization, questions and answers, software development and AI agents. Expert reviews attest to its excellence in textual and visual comprehension, as well as its ability to follow instructions and execute automated workflows. Amazon Nova models are suitable for a variety of applications, from content creation and data analysis to software development and AI-powered process automation. Below, we’ll demonstrate how to use Amazon Nova models in conjunction with Elasticsearch for automated product review analysis. What we will do: Create an endpoint via Inference API, integrating Amazon Bedrock with Elasticsearch. Create a pipeline using the Inference Processor, which will make calls to the Inference API endpoint. Index product reviews and automatically generate an analysis of the reviews using the pipeline. Analyze the results of the integration. Creating an Endpoint in the Inference API First, we configure the Inference API to integrate Amazon Bedrock with Elasticsearch. We define Amazon Nova Lite, id amazon.nova-lite-v1:0 , as the model to use since it offers a balance between speed, accuracy, and cost. Note: You will need valid credentials to use Amazon Bedrock. You can see the documentation for obtaining access keys here : Creating the review analysis pipeline Now, we create a processing pipeline that will use the Inference Processor to execute a review analysis prompt. This prompt will send the review data to Amazon Nova Lite, which will perform: Sentiment classification (positive, negative, or neutral). Review summarization. Keywords generation. Authenticity measurement (authentic | suspicious | generic). Indexing reviews Now, we index product reviews using the Bulk API. The pipeline created earlier will be automatically applied, adding the analysis generated by the Nova model to the indexed documents. Querying and analyzing the results Finally, we run a query to see how the Amazon Nova Lite model analyzes and classifies the reviews. By running GET products/_search, we get the documents already enriched with the fields generated from the review content. The model identifies the predominant sentiment (positive, neutral, or negative), generates concise summaries, extracts relevant keywords, and estimates the authenticity of each review. These fields help understand the customer’s opinion without having to read the full text. To interpret the results, we look at: Sentiment, which indicates the consumer’s overall perception of the product. The summary, which highlights the main points mentioned. Keywords, which can be used to group similar reviews or identify feedback patterns. Authenticity, which signals whether the review seems trustworthy. This is useful for curation or moderation. 
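To make the querying step concrete, here is a small Python sketch of how the enriched reviews could be inspected. The field names (review, sentiment, summary, keywords, authenticity) are assumptions based on the analysis described above, since the pipeline definition is not reproduced in this text.

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

# Field names are assumptions; adjust them to whatever the inference processor
# writes into your documents.
resp = es.search(
    index='products',
    size=3,
    source=['review', 'sentiment', 'summary', 'keywords', 'authenticity'],
    query={'match': {'sentiment': 'negative'}},
    aggregations={'by_sentiment': {'terms': {'field': 'sentiment.keyword'}}},
)
for hit in resp['hits']['hits']:
    print(hit['_source'])
print(resp['aggregations']['by_sentiment']['buckets'])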
Final Thoughts The integration between Amazon Nova Lite and Elasticsearch demonstrated how language models can transform raw reviews into structured and valuable information. By processing the reviews through a pipeline, we were able to extract sentiment, authenticity, summaries, and keywords automatically and consistently. The results show that the model can understand the context of the reviews, classify user opinions, and highlight the most relevant points of each experience. This creates a much richer dataset that can be leveraged to improve search capabilities.",
To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The Elasticsearch Inference API allows users to utilize machine learning models across a variety of providers to perform inference operations. One common use case of this API is to support semantic text fields used for semantic search within an index. As the size of a document’s data increases, creating an embedding on the whole of the data will yield less accurate results. Some inference models also have limitations on the size of inputs that can be processed. As such, the inference API utilizes a process called chunking to break down large documents being ingested into an index into smaller and more manageable subsections of the original data called chunks. The inference operations are then run against each of the individual chunks and the inference results for each chunk are stored within the index. In this blog, we’ll go over the chunking strategies, explain how Elasticsearch chunks text and how to configure chunking settings for an inference endpoint. What can I configure with chunking settings? From 8.16, users can now select from 2 strategies for generating chunks, each with their own configurable properties. Word based chunking strategy Configurable values provided by the user: (required) max_chunk_size : The maximum number of words in a chunk. (required) overlap : The number of overlapping words for chunks. Note: This can not be defined as more than half of the max_chunk_size . Word based chunking splits input data into chunks with word counts up to the provided max_chunk_size . This strategy will always fill a chunk to the maximum size before building the next chunk unless it reaches the end of the input data. Each chunk after the first will have a number of words overlapping from the previous chunk based on the provided overlap value. The purpose of this overlap is to increase inference accuracy by preventing useful context for inference results from being split across chunks. Sentence based chunking strategy Configurable values provided by the user: (required) max_chunk_size : The maximum number of words in a chunk. (required) sentence_overlap : The number of overlapping sentences for chunks. Note: This can only be defined as 0 or 1. Sentence based chunking will split input data into chunks containing full sentences. Chunks will contain only complete sentences, except when a sentence is longer than max_chunk_size , in which case it will be split across chunks. Each chunk after the first will have a number of sentences from the previous chunk overlapping based on the provided sentence_overlap value. Note: If no chunking settings are provided when creating an inference endpoint after 8.16, the default chunking settings will use a sentence strategy with max_chunk_size of 250 and a sentence_overlap of 1. For inference endpoints created before 8.16, the default chunking settings will use a word strategy with a max_chunk_size of 250 and an overlap of 1. How do I select a chunking strategy? There is no one-size-fits-all solution for the best chunking strategy. The best chunking strategy will vary based on the documents being ingested, the underlying model being used and any compute constraints you have. We recommend taking a subset of your corpus and some example queries and seeing how changing the strategy, chunk size and overlap affects your use case. 
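As a sketch of what these two strategies look like when configured (values are only examples; the defaults quoted above are a sentence strategy with max_chunk_size 250 and sentence_overlap 1), the chunking_settings objects take the following shape:

# Word-based strategy: fills each chunk up to max_chunk_size words,
# with 'overlap' words repeated from the previous chunk (at most half the chunk size).
word_chunking_settings = {
    'strategy': 'word',
    'max_chunk_size': 250,
    'overlap': 100,
}

# Sentence-based strategy: keeps sentences intact, with 0 or 1 sentences of overlap.
sentence_chunking_settings = {
    'strategy': 'sentence',
    'max_chunk_size': 250,
    'sentence_overlap': 1,
}

Either object is passed as chunking_settings when the inference endpoint is created, as sketched a little further below.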
For example, you might parameter sweep over different chunk overlaps and lengths and measure the time to ingest, the impact on search latency and the relevance of the top results for each query. The following are a few guidelines to help when starting out with configurable chunking: Picking a chunking strategy Generally, a sentence based chunking strategy works well to minimize context loss. However, it can often result in more chunks being generated as the process prioritizes keeping sentences intact over maximally filling each chunk. As such, an optimized word based chunking strategy may produce fewer chunks, which are more efficient to ingest and search. Picking a chunk size The chunk size should be selected to minimize useful contextual information from being split across chunks while retaining chunk topic coherence. Typically, chunks as close as possible to the maximum sequence length the model supports work better. However, long chunks are more likely to contain a mixture of topics that are less well represented. Picking a chunk overlap As the overlap between chunks increases, the number of chunks generated does as well. Similar to chunk size, you'll want to select an overlap that helps to minimize the chance of splitting important context across chunks subject to your compute constraints. Typically, more overlap, up to half the typical chunk length, results in better retrieval quality but comes at an increased cost. How does Elasticsearch chunk text? Elasticsearch uses the ICU4J library to detect word and sentence boundaries . Word boundaries are identified by following a series of rules, not just the presence of a whitespace character. For written languages that do not use whitespace, such as Chinese or Japanese, dictionary lookups are used to detect word boundaries. Sentence boundaries are similarly identified by following a series of rules, not just the presence of a period character. This ensures that sentence boundaries are accurately identified across languages in which sentence structures and sentence breaking characteristics may vary. Finally, we note that sometimes chunks benefit from long range context, which can't be retained by any simple chunking strategy. In these cases, if you are prepared to pay the cost, chunks can be enriched with additional generated context. For more details, see this discussion. How do I configure chunking settings for an inference endpoint? Pre-requisites Before configuring chunking settings, ensure that you have met the following requirements: You have a valid enterprise license. If you are configuring chunking settings for an inference endpoint connecting to any third-party integration, you have set up any necessary permissions to access these services (e.g. , created accounts, retrieved API keys, etc.). For the purposes of this guide, we will be configuring chunking settings for an inference endpoint using Elastic’s ELSER model , for which the only requirement is having a valid enterprise license. To find the information required to create an inference endpoint for a third-party integration, see the create inference endpoint API documentation . 
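Before walking through the steps below, here is a rough Python sketch of what they can look like end to end. The endpoint and index names are illustrative, and the raw REST call is used for endpoint creation; recent client versions also expose a dedicated inference API.

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

# Step 1 (sketch): create an ELSER sparse_embedding endpoint with explicit chunking settings.
es.perform_request(
    'PUT', '/_inference/sparse_embedding/my-elser-endpoint',
    headers={'accept': 'application/json', 'content-type': 'application/json'},
    body={
        'service': 'elser',
        'service_settings': {'num_allocations': 1, 'num_threads': 1},
        'chunking_settings': {'strategy': 'sentence', 'max_chunk_size': 250, 'sentence_overlap': 1},
    },
)

# Step 2 (sketch): point a semantic_text field at that endpoint.
es.indices.create(
    index='chunking-demo',
    mappings={'properties': {'body': {'type': 'semantic_text', 'inference_id': 'my-elser-endpoint'}}},
)

# Step 3 (sketch): ingest a document; chunking and inference happen automatically at ingest.
es.index(index='chunking-demo', document={'body': 'A long passage of text to be chunked ...'})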
Step 1: Configure chunking settings during inference endpoint creation Step 2: Link the inference endpoint to a semantic text field in an index Step 3: Ingest a document into the index Ingest the document into the index created above by calling the index document API: The generated chunks and their corresponding inference results can be seen stored in the document in the index under the key chunks within the _inference_fields metafield. To see the stored chunks, you can search for all documents in the index with the search API : The chunks can be seen in the response. Before 8.18, the chunks were stored as full-chunk text values. From 8.18, the chunks are stored as a list of character offset values: Get started with configurable chunking today! For more information on utilizing this feature, view the documentation on configuring chunking . Try out this notebook to get started with configurable chunking settings: Configuring Chunking Settings For Inference Endpoints . Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to What can I configure with chunking settings? Word based chunking strategy Sentence based chunking strategy How do I select a chunking strategy? Picking a chunking strategy Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "Configurable chunking settings for inference API endpoints - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/elasticsearch-chunking-inference-api-endpoints", + "meta_description": "Explore Elasticsearch chunking strategies, learn how Elasticsearch chunks text, and how to configure chunking settings for an inference endpoint." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data to Elasticsearch through Airbyte Using Airbyte to ingest data into Elasticsearch. Integrations How To AL By: Andre Luiz On March 14, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Airbyte is a data integration tool that allows you to move information from various sources to different destinations in an automated and scalable way. It enables you to extract data from APIs, databases, and other systems and load it into platforms such as Elasticsearch, which offers advanced search and efficient analysis. In this article, we will explain how to configure Airbyte to ingest data into Elasticsearch, covering key concepts, prerequisites, and step-by-step integration. Airbyte fundamental concepts Airbyte has several essential concepts for its use. Below, we highlight the main ones: Sources: Defines the origin of the data that will be extracted. Destinations: Defines where the data will be sent and stored. Connections: Configures the relationship between the source and the destination, including the synchronization frequency. Airbyte integration with Elasticsearch In this demonstration, we will perform an integration where data stored in an S3 bucket will be migrated to an Elasticsearch index. We will show how to configure the source (S3) and destination (Elasticsearch) in Airbyte. Prerequisites To follow this demonstration, the following prerequisites must be met: Create a bucket in AWS, where the JSON files containing the data will be stored. Install Airbyte locally using Docker. Create an Elasticsearch cluster in Elastic Cloud to store the ingested data. Below, we will detail each of these steps. Installing Airbyte Airbyte can be run locally using Docker or in the cloud, where there are costs associated with usage. For this demonstration, we will use the local version with Docker. The installation may take a few minutes. After following the installation instructions, Airbyte will be available at: http://localhost:8000. After logging in, we can start configuring the integration. Creating the bucket In this step, you’ll need an AWS account to create an S3 bucket. Additionally, it is essential to set the correct permissions by creating a policy and an IAM user to allow access to the bucket. In the bucket, we will upload JSON files containing different log records, which will later be migrated to Elasticsearch. The file logs have this content: Below are the files loaded into the bucket: Elastic Cloud configuration To make the demonstration easier, we will use Elastic Cloud. If you do not have an account yet, you can create a free trial account here: Elastic Cloud Registration . After configuring the deployment in Elastic Cloud, you will need to obtain: The URL of the Elasticsearch server. A user to access Elasticsearch. 
To obtain the URL, go to Deployments > My deployment, in application, find Elasticsearch and click on ‘Copy endpoint.‘ To create the user, follow the steps below: Access Kibana > Stack Management > Users. Create a new user with the superuser role. Fill in the fields to create the user. Now that we have everything set up, we can start configuring the connectors in Airbyte. Configuring source connector In this step, we will create the source connector for S3. To do this, we will access the Airbyte interface and select the Source option in the menu. Then, we will search for the S3 connector. Below, we detail the steps required to configure the connector: Access Airbyte and go to the Sources menu. Search for and select the S3 connector. Configure the following parameters: Source Name: Define a name for the data source. Delivery Method: Select Replicate Records (recommended for structured data). Data Format: Choose JSON Format. Stream Name: Define the name of the index in Elasticsearch. Bucket Name: Enter the name of the bucket in AWS. AWS Access Key and AWS Secret Key: Enter the access credentials. Click on Set up source and wait for validation. Configuration destination connector In this step, we will configure the destination connector, which will be Elasticsearch. To do this, we will access the menu and select the Destination option. Then, we will search for Elasticsearch and click on the returned result. Now, we will proceed with the configuration of this connection: Access Airbyte and go to the Destinations menu. Search and select the Elasticsearch connector. Configure the following parameters: Authentication Method: Choose Username/Password. Username and Password: Use the credentials created in Kibana. Server Endpoint: Paste the URL copied from Elastic Cloud. Click on Set up destination and wait for validation. Creating the Source and Destination connection Once the Source and Destination have been created, the connection between them will be created, thus completing the creation of the integration. Below are the instructions for creating the connection: 1. In the menu, go to Connections and click on Create First Connection. 2. On the next screen, you will be able to select an existing Source or create a new one. Since we already have a Source created, we will select Source S3. 3. The next step will be to select the destination. Since we have already created the Elasticsearch connector, it will be selected to finalize the configuration. In the next step, it will be necessary to define the Sync Mode and which schema will be used. Since only the log schema was created, it will be the only option available for selection. 4. We will move on to the Configure Connection step. Here, we can define the name of the connection and the frequency of the integration execution. The frequency can be configured in three ways: Cron : Runs the syncs based on the user-defined cron expression (e.g 0 0 15 * * ?, At 15:00 every day); Scheduled : Runs the syncs at the specified time interval (e.g. every 24 hours, every 2 hours); Manual : Run the syncs manually. For this demonstration, we will select the Manual option. Finally, by clicking on Set up Connection , the connection between the Source and the Destination will be established. Synchronizing Data from S3 to Elasticsearch When you return to the Connections screen, you can see the connection that was created. To execute the process, simply click on Sync. From that moment on, the migration of data from S3 to Elasticsearch will begin. 
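Behind the scenes, each synced record becomes a document in the Elasticsearch index named after the stream, 'logs' in this walkthrough. Once a sync has run, a quick Python check (a sketch; the endpoint and credentials are the same ones given to the destination connector) confirms the data arrived:

from elasticsearch import Elasticsearch

# Use the Elastic Cloud endpoint and the user created earlier in Kibana.
es = Elasticsearch('https://<your-deployment-endpoint>:443',
                   basic_auth=('<username>', '<password>'))

print('log documents:', es.count(index='logs')['count'])
for hit in es.search(index='logs', size=2)['hits']['hits']:
    print(hit['_source'])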
If everything goes smoothly, you will get the synced status. Visualizing data in Kibana Now, we will go to Kibana to analyze the data and check if it was indexed correctly. In the Kibana Discovery section, we will create a Data View called logs. With this, we will be able to explore the data existing only in the logs index, which was created after the synchronization. Now, we can visualize the indexed data and perform analyses on it. This way, we validated the entire migration flow using Airbyte, where we loaded the data present in the bucket and indexed it in Elasticsearch. Conclusion Airbyte proved to be an efficient tool for data integration, allowing us to connect several sources and destinations in an automated way. In this tutorial, we demonstrated how to ingest data from an S3 bucket to an Elasticsearch index, highlighting the main steps of the process. This approach facilitates the ingestion of large volumes of data and allows analyses within Elasticsearch, such as complex searches, aggregations, and data visualizations. References Quickstart Airbyte: https://docs.airbyte.com/using-airbyte/getting-started/oss-quickstart#part-1-install-abctl Core concepts: https://docs.airbyte.com/using-airbyte/core-concepts/ Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Airbyte fundamental concepts Airbyte integration with Elasticsearch Prerequisites Installing Airbyte Creating the bucket Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.", + "title": "How to ingest data to Elasticsearch through Airbyte - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/airbyte-elasticsearch-ingest-data", + "meta_description": "Learn how to use Airbyte to ingest data into Elasticsearch. This blog covers the main concepts, prerequisites, and step-by-step integration." + }, + { + "text": "Tutorials Examples Integrations Blogs Start free trial Blog Playground: Experiment with RAG using Bedrock Anthropic Models and Elasticsearch in minutes Explore Elastic's Playground and learn how to use it to experiment with RAG applications using Bedrock Anthropic Models & Elasticsearch. Vector Database Generative AI Developer Experience Integrations How To JM AT By: Joe McElroy and Aditya Tripathi On July 10, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Playground is a low-code interface that enables developers to iterate and build production RAG applications by A/B testing LLMs, tuning prompts, and chunking data. With support for Amazon Bedrock, Playground brings you a wider selection of foundation models from Amazon, Anthropic, and other leading providers. Developers using Amazon Bedrock and Elasticsearch can refine retrieval to ground answers with private or proprietary data, indexed into one or more Elasticsearch indices. A/B test LLMs & retrieval with Playground inference via Amazon Bedrock The playground interface allows you to experiment and A/B test different LLMs from leading model providers such as Amazon and Anthropic. However, picking a model is only a part of the problem. Developers must also consider how to retrieve relevant search results to closely match a model’s context window size (i.e. the number of tokens a model can process). Retrieving text passages longer than the context window can lead to truncation, therefore loss of information. Text that is smaller than the context window may not embed correctly, making the representation inaccurate. The next bit of complexity may arise from having to combine retrieval from different data sources. Playground brings together a number of Elasticsearch features into a simple, yet powerful interface for tuning RAG workflows: work with a growing list of model sources, including Amazon Bedrock, for choosing the best LLM for your needs use semantic_text , for tuning chunking strategies to fit data and context window size use retrievers to add multi-stage retrieval pipelines (including re-raking) After tuning the context sent to the models to desired production standards, you can export the code and finalize your application with your Python Elasticsearch language client or LangChain Python integration. Today’s announcement, brings access to hosted models on Amazon Bedrock through the Open Inference API integration, and the ability to use the new semantic_text field type. We really hope you enjoy this experience! Playground takes all these composable elements and brings to you a true developer toolset for rapid iteration and development to match the pace that developers need. Using Elastic's Playground Within Kibana (the Elasticsearch UI), navigate to “Playground” from the navigation page on the left-hand side. To start, you need to connect to a model provider to bring your LLM of choice. 
Playground supports chat completion models such as Anthropic through Amazon Bedrock. This blog provides detailed steps and instructions to connect and configure the playground experience. Once you have connected an LLM and chosen an Elasticsearch index, you can start asking questions about information in your index. The LLM will provide answers based on context from your data. Connect an LLM of choice and Elasticsearch indices with private proprietary information Instantly chat with your data and assess responses from models such as Anthropic Claude 3 Haiku in this example Review and customize text and retriever queries to indices that store vector embeddings Using retrievers and hybrid search for the best context Elastic’s hybrid search helps you build the best context windows. Effective context windows are built from various types of vectorized and plain text data that can be spread across multiple indices. Developers can now take advantage of new query retrievers to simplify query creation. Three new retrievers are available from version 8.14 and on Elastic Cloud Serverless , and implementing hybrid search normalized with RRF is one unified query away. You can store vectorized data and use a kNN retriever, or add metadata and context to create a hybrid search query. Soon, you can also add semantic reranking to further improve search results. Use Playground to ship conversational search—quickly Building conversational search experiences can involve many approaches, and the choices can be paralyzing, especially given the pace of innovation in new reranking and retrieval techniques, both of which apply to RAG applications. With our playground, those choices are simplified and intuitive, even with the vast array of capabilities available to the developer. Our approach is unique in enabling hybrid search as a predominant pillar of the construction immediately, with an intuitive understanding of the shape of the selected and chunked data and amplified access across multiple external providers of LLMs. Earlier this year, Elastic was awarded the AWS Generative AI Competency , a distinction given to very few AWS partners that provide differentiating generative AI tools. Elastic’s approach to adding Bedrock support for the playground experience is guided by the same principle – to bring new and innovative capabilities to Elastic Cloud on AWS developers. Build, test, fun with Playground Head over to Playground docs to get started today! Explore Search Labs on GitHub for new cookbooks and integrations for providers such as Cohere, Anthropic, Azure OpenAI, and more. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
",
What the app does At a high level, this app takes a user’s search query or question and follows these steps: Retrieves relevant documents using hybrid search—combining text matching and semantic search. Displays matching document snippets along with links to their full content. Builds a prompt using the retrieved documents and predefined instructions. Generates a response from an LLM, providing grounding documents from Elasticsearch results. Provides controls to modify the generated prompt and response from the LLM. Exploring the UI controls The application provides several controls for refining search and response generation. Here’s a breakdown of the key features: 1. Search box Users enter a query just like a search engine. The query is processed using both lexical and vector search. 2. Generated response panel Displays the LLM-generated response based on retrieved documents. Sources used to generate the response are listed for reference. Includes an Expand/Collapse toggle to adjust panel size. 3. Elasticsearch results panel Shows the top-ranked retrieved documents from Elasticsearch. Includes document titles, highlights, and direct links to the original content. Helps users see which documents influenced the LLM’s response. 4. Source filtering controls Users can select which data sources to use for retrieval after the initial search. This allows users to focus on specific domains of content. 5. Source filtering controls Users can select if the LLM can use its training to generate a response outside of the grounding context. Opens up the possibility of expanded answers beyond what is passed to the LLM. 6. Number of sources selector Allows users to adjust how many top results are passed to the LLM. Increasing sources often improves response grounding, but too many can incur unnecessary token costs. 7. Chunk vs. document toggle Determines whether grounding is done with the full document or relevant chunked . Chunking improves search granularity by breaking long texts into manageable sections. 8. LLM prompt paned Allows users to view the complete prompt passed to the LLM to generate the response. Helps users better understand how an answer was generated. App architecture The application is a Next.js web app that provides a user interface for interacting with a RAG-based search system . End-to-end data flow of a Next.js-powered UI application This architecture eliminates the need for a separate backend service, leveraging Next.js API routes for seamless search and LLM processing integration. Code snippets Let's look at a few sections of code that are the most relevant to this app and may be useful if you want to modify it to work with different datasets. ES query The Elasticsearch query is pretty straightforward. /app/api/search/route.ts By using a hybrid retriever, we should allow for matching on searches that are more keyword-based and natural language questions, which are becoming the norm for people to use in their searches these days. You'll notice we are using the highlight functionality in this query. This allows us to easily provide a relevant summary of a matched document in the Elasticsearch Results section. It also allows us to use matching chunks for grounding when we are building the prompt for the LLM, and chunk is selected as the grounding option. 
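The route's TypeScript is not reproduced in this text, but the request it builds can be sketched from Python as well. The index name is an assumption; the semantic_body and body fields, the RRF hybrid retriever, and the highlight on semantic_body follow the description above (the retriever parameter requires a recent Elasticsearch and client version).

from elasticsearch import Elasticsearch

es = Elasticsearch('https://localhost:9200', api_key='...')

question = 'How does RRF work?'

# Hybrid retrieval: a lexical match on the body text fused (RRF) with a semantic
# query against the semantic_text field, plus highlights used as snippets/chunks.
resp = es.search(
    index='elastic-labs-blogs',  # assumed name for the index built with the Open Crawler in Part 1
    size=5,
    retriever={
        'rrf': {
            'retrievers': [
                {'standard': {'query': {'match': {'body': question}}}},
                {'standard': {'query': {'semantic': {'field': 'semantic_body', 'query': question}}}},
            ]
        }
    },
    highlight={'fields': {'semantic_body': {}}},
)
for hit in resp['hits']['hits']:
    print(hit['_source'].get('title'), hit.get('highlight', {}).get('semantic_body', [''])[0])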
Extracting the ES results Next, we need to extract the results from Elasticsearch /app/api/search/route.ts We extract the search results ( hits ) from the Elasticsearch response, ensuring they exist and are in an expected array format. If the results are missing or incorrectly formatted, we log an error and return a 500 status. Parse the hits We have our results but we need to parse out the results into a format we can use to display results to the user in the UI and to build our prompt for the LLM. /app/api/search/route.ts There are a couple of key things happening in this code block. We use the highlight value from the top hit of `semantic_body` as a snippet for each ES doc displayed. Depending on the user's selection, we store the prompt context using either the `semantic_body` as the chunk or the full `body` as the body We extract the `title`. We extract the URL to the blog and ensure it is formatted correctly so users can click on it to visit the blog. Lab sources for clicking The last processing we do is to parse out the aggregation values /app/api/search/route.ts We do this to have a clickable list of the various \"Labs\" where the results came from. This way, users can select only the sources they want included, and when they hit search again, the Labs they have checked will be used as a filter. Managing state The `SearchInterface` component is the core component in the app. It uses React state hooks to manage all the data and configurations. /components/SearchInterface.tsx The first three lines here are used to track the search results from Elasticsearch, the generated response from the LLM, and the generated prompt used to instruct the LLM. The last two are used to track user settings from the UI. Selecting the number of sources to include in grounding with the LLM and if the LLM should be grounded with just matching chunks or the full blog article. Handling search queries When the user hits the submit button, handleSearch takes over. /components/SearchInterface.tsx This function sends the queries to /api/search (shown in snippets above) including the user's source selection, grounding settings, and API Credentials. The response is parsed and stored in state, which triggers UI updates. Source Extraction After fetching the results, we create a sources object. /components/SearchInterface.tsx This will later be passed to the LLM as part of the prompt. The LLM is instructed to cite any sources it uses to generate its response. Constructing & sending the LLM prompt The prompt is dynamically created based on the user's settings and includes grounding documents from Elasticsearch. /components/SearchInterface.tsx By default, we instruct the LLM to only use the provided grounding documents in its answer. However, we do provide a setting to allow the LLM to use its own training \"knowledge\" to construct a wider response. When it is allowed to use its own training, it is further instructed to append a warning to the end of the response. We instruct the LLM to cite the provided documents as sources, but only ones that it actually uses. We give some instructions on how the response should be formatted for readability in the UI. Finally, we pass it to /api/llm for processing Streaming the AI response With the documents from Elasticsearch parsed and returned to the front end immediately, we call the LLM to generate a response to the user's question. 
/components/SearchInterface.tsx There are a lot of lines here but essentially this part of the code calls /api/llm (covered below) and handles the streaming response. We want the LLM response to stream back to the UI as it is generated, so we parse each event as it is returned allowing the UI to dynamically update. We have to decode the stream, do a little cleanup, and update resultText with the newly received text. Calling the LLM We are calling the LLM using Elasticsearch's Inference API. This allows us to centralize the management of our data in Elasticsearch. /app/api/llm/route.ts This bit of code is pretty straightforward. We send the request to the streaming inference API Endpoint we created as part of the setup (see below under Getting Things Running), and then we stream the response back. Handling the streams We need to read the stream as it comes in chunk-by-chunk. /app/api/llm/route.ts Here we decode the streamed LLM response chunk-by-chunk and forwards each decoded part to the frontend in real-time. Getting things running Now that we've reviewed some of the key parts of the code, let's get things actually installed and up and running. Completion Inference API If you don't have a completion Inference API configured in Elasticsearch , you'll need to do that so we can generate a response to the user's question. I used Azure OpenAI with the gpt-4o model, but you should be able to use other services. The key is that it must be a service that the Stream Inference API supports. The individual service_settings depends on which service and model you use. Refer to the Inference API docs for more info. Clone If you have the GitHub CLI installed and configured, you can clone the UI repo into the directory of your choice. You can also download the zip file at then unzip it. Install dependencies Follow the step in the readme file in the repo to install the dependencies. Start the development server We are going to just run in development mode. This runs with live reloading and debugging. There is a production mode that runs an optimized production build for deployment. To start in dev mode run: This should start the server up, and if there are no errors, you should see something similar to: If you have something else running using port 3000 your app will start using the next available port. Just look at the output to see what port it uses. If you want it to run on a specific port, say 4000 you can do so by running: To the UI Once the app is running, you can try out different configurations to see what works best for you. Connection settings The first thing you need to do before using the app is set up your connection credentials. To do that click the gear icon ⚙️ in the upper right. When the box pops up, input your API Key and Elasticsearch URL. Defaults To get started simply ask a question or type in a search query into the search bar. Leave everything else as is. Learning about Hybrid Search The app will query Elasticsearch for the top relevant docs, in the above example about rrf, using rrf! The docs with a short snippet, the blog title, and a clickable URL will be returned to the user. Search All the Things! The top three chunks will be combined with a prompt and sent to the LLM. The generated response will be streamed back. This appears to be a thorough response! Bowl your own game Once the initial search results and generated response are displayed, the user can follow up and make a couple of changes to the settings. 
Lab sources All the blog sites that are part of the index searched will be listed under Lab Sources . If you were to add additional sites or sources to the index we created in part one with the Open Crawler, they would show up here. You can select only the sources you want to be considered for search results and click search again. The subsequent search will use the checked sources as a filter on the Elasticsearch query. Answer source One of the advantages we talk about with RAG is providing grounding documents to the LLM. This helps cut down on hallucinations (nothing's perfect). However, you may want to allow the LLM to use its training and other \"knowledge\" outside of the grounding docs to generate a response. Unchecking Context Only will allow the LLM this freedom. The LLM should provide a warning at the end of the response, letting you know if it journeyed outside the groundings. As with many things LLM, this isn't guaranteed. Either way, use caution with these responses. Number of sources We default to using three chunks as grounding information for the LLM. Increasing the number of context chunks sometimes gives the LLM more information to generate its response. Sometimes, simply providing the whole blog about a specialized topic works best. It depends on how spread out the topic is. Some topics are covered in many blogs in various ways. So, giving more sources can provide a richer answer. Something more esoteric may only be covered once so extra chunks may not help. Chunk or doc Regarding the number of sources, throwing everything at the LLM is not usually the best way to generate an answer. While most blogs are relatively short compared to many other document sources, say health insurance policy documents, throwing a long document at an LLM has several downsides. First, if the relevant information is only included in two paragraphs and you include twenty, you pay for eighteen paragraphs of useless tokens. Second, that useless information slows down the LLM's generated response. Generally, stick with chunks unless you have a good reason to send whole documents, blogs in this case. Here's the bridge-- Hopefully, this walkthrough has helped ensure you're not out of your element when it comes to setting up a UI that provides semantically retrieved documents from Elasticsearch and generated answers from an LLM. There certainly are a lot of features we could add and settings we could tweak to make the experience even better here. But it's a good start. The great thing about providing code to the community is that you're free to take it and customize and tweak it if you are happy. Be on the lookout for part 3, where we will instrument the app using Open Telemetry! Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. 
TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to What the app does Exploring the UI controls App architecture Code snippets ES query Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.", + "title": "ChatGPT and Elasticsearch revisited: Part 2 - The UI Abides - Elasticsearch Labs", + "url": "https://www.elastic.co/search-labs/blog/chatgpt-elasticsearch-rag-app-ui", + "meta_description": "This blog expands on Part 1 by introducing a fully functional web UI for our RAG-based search system. \n\nBy the end, you'll have a working interface that ties the retrieval, search, and generation process together—while keeping things easy to tweak and explore." 
+ }, + { + "text": "Docs Release notes Troubleshoot Reference Deploy and manage Get started Solutions and use cases Manage data Explore and analyze Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Deploy Detailed deployment comparison Elastic Cloud Compare Cloud Hosted and Serverless Sign up Subscribe from a marketplace AWS Marketplace Azure Native ISV Service Google Cloud Platform Marketplace Heroku Install the add-on Remove the add-on Access the console Work with Elasticsearch Migrate between plans Hardware Regions Elastic Cloud Serverless Create a serverless project Regions Project settings Elastic Cloud Hosted Create an Elastic Cloud Hosted deployment Available stack versions Access Kibana Plan for production Manage deployments Configure Change hardware profiles Customize deployment components Edit Elastic Stack settings Add plugins and extensions Upload custom plugins and bundles Manage plugins and extensions through the API Custom endpoint aliases Manage your Integrations server Switch from APM to Integrations Server payload Find your Cloud ID vCPU boosting and credits Change hardware Manage deployments using the Elastic Cloud API Keep track of deployment activity Restrictions and known problems Tools and APIs Elastic Cloud Enterprise Service-oriented architecture Deploy an orchestrator Prepare your environment Hardware prerequisites Software prerequisites System configuration Networking prerequisites Users and permissions prerequisites High availability Separation of roles Load balancers JVM heap size Wildcard DNS record Manage your allocators capacity Install ECE Identify the deployment scenario Configure your operating system Ubuntu RHEL SUSE Installation procedures Deploy a small installation Deploy a medium installation Deploy a large installation Deploy using Podman Migrate ECE to Podman hosts Migrating to Podman 5 Post-installation steps Install ECE on additional hosts Manage roles tokens Ansible playbook Statistics collected by Elastic Cloud Enterprise Air-gapped install With your private Docker registry Without a private Docker registry Available Docker images Configure ECE Assign roles to hosts System deployments configuration Default system deployment versions Manage deployment templates Tag your allocators Edit instance configurations Create instance configurations Create templates Configure default templates Configure index management Data tiers and autoscaling support Integrations server support Default instance configurations Change the ECE API URL Change endpoint URLs Enable custom endpoint aliases Configure allocator affinity Change allocator disconnect timeout Manage Elastic Stack versions Include additional Kibana plugins Log into the Cloud UI Manage deployments Deployment templates Create a deployment Access Kibana Connect to Elasticsearch Configure Customize deployment components Edit stack settings Elasticsearch user settings Kibana user settings APM user settings Enterprise search user settings Resize deployment Configure plugins and extensions Add custom bundles and plugins Custom endpoint aliases Resource overrides Advanced cluster configuration Search and filter deployments Keep track of deployment activity Manage your Integrations Server Enable Integrations Server through the API Switch from APM to Integrations Server payload Tools and APIs Elastic Cloud on Kubernetes Deploy an orchestrator Install YAML manifests Helm chart Required RBAC permissions Deploy ECK on Openshift Deploy the operator Deploy an 
Elasticsearch instance with a route Deploy a Kibana instance with a route Deploy Docker images with anyuid SCC Grant privileged permissions to Beats Grant host access permission to Elastic Agent Deploy ECK on GKE Autopilot FIPS compatibility Air-gapped environments Configure Apply configuration settings Configure the validating webhook Restrict cross-namespace resource associations Service meshes Istio Linkerd Webhook namespace selectors Manage deployments Deploy an Elasticsearch cluster Deploy a Kibana instance Elastic Stack Helm chart Applying updates Accessing services Configure deployments Elasticsearch configuration Nodes orchestration Storage recommendations Node configuration Volume claim templates Virtual memory Settings managed by ECK Custom configuration files and plugins Init containers for plugin downloads Update strategy Pod disruption budget Advanced Elasticsearch node scheduling Readiness probe Pod PreStop hook Security context Requests routing to Elasticsearch nodes Kibana configuration Connect to an Elasticsearch cluster Advanced configuration Install Kibana plugins Customize pods Manage compute resources Recipes Connect to external Elastic resources Elastic Stack configuration policies Orchestrate other Elastic applications APM server Use an Elasticsearch cluster managed by ECK Advanced configuration Connect to the APM Server Standalone Elastic Agent Quickstart Configuration Configuration examples Fleet-managed Elastic Agent Quickstart Configuration Configuration Examples Known limitations Elastic Maps Server Deploy Elastic Maps Server Map data Advanced configuration Elastic Maps HTTP configuration Beats Quickstart Configuration Configuration Examples Troubleshooting Logstash Quickstart Configuration Securing Logstash API Logstash plugins Configuration examples Update Strategy Advanced configuration Create custom images Tools and APIs Self-managed cluster Deploy an Elasticsearch cluster Local installation (quickstart) Important system configuration Configure system settings Disable swapping Increase the file descriptor limit Increase virtual memory Increase max number of threads DNS cache settings Ensure JNA temporary directory permits executables Decrease the TCP retransmission timeout Bootstrap checks Install on Linux or MacOS Install on Windows Install with Debian package Install with RPM package Install with Docker Single-node cluster Multi-node cluster Production settings Configure Configure Elasticsearch Important settings configuration Add plugins Install Kibana Linux and MacOS Windows Debian RPM Docker Configure Kibana Access Kibana Air gapped install Tools and APIs Distributed architecture Clusters, nodes, and shards Node roles Reading and writing documents Shard allocation, relocation, and recovery Shard allocation awareness Index-level shard allocation Delaying allocation when a node leaves The shard request cache Discovery and cluster formation Discovery Quorum-based decision making Voting configurations Bootstrapping a cluster Cluster state Cluster fault detection Kibana task management Production guidance Run Elasticsearch in production Design for resilience Resilience in small clusters Resilience in larger clusters Resilience in ECH and ECE Scaling considerations Performance optimizations General recommendations Tune for indexing speed Tune for search speed Tune approximate kNN search Tune for disk usage Size your shards Run Kibana in production High availability and load balancing Configure memory Manage background tasks Optimize alerting performance 
Reporting production considerations Reference architectures Hot/Frozen - High Availability Stack settings Backup, high availability, and resilience tools Snapshot and restore Manage snapshot repositories Self-managed Azure repository Google Cloud Storage repository S3 repository Shared file system repository Read-only URL repository Source-only repository Elastic Cloud Hosted Configure a snapshot repository using AWS S3 Configure a snapshot repository using GCS Configure a snapshot repository using Azure Blob storage Access isolation for the found-snapshots repository Repository isolation on Azure Repository isolation on AWS and GCP Elastic Cloud Enterprise AWS S3 repository Google Cloud Storage (GCS) repository Azure Storage repository Minio on-premise repository Elastic Cloud on Kubernetes Create snapshots Restore a snapshot Restore a snapshot across clusters Restore snapshot into a new deployment Restore snapshot into an existing deployment Restore snapshots containing searchable snapshots indices across clusters Searchable snapshots Cross-cluster replication Set up cross-cluster replication Prerequisites Connect to a remote cluster Configure privileges for cross-cluster replication Create a follower index to replicate a specific index Create an auto-follow pattern to replicate time series indices Manage cross-cluster replication Inspect replication statistics Pause and resume replication Recreate a follower index Terminate replication Manage auto-follow patterns Create auto-follow patterns Retrieve auto-follow patterns Pause and resume auto-follow patterns Delete auto-follow patterns Upgrading clusters Uni-directional index following Bi-directional index following Uni-directional disaster recovery Prerequisites Failover when clusterA is down Failback when clusterA comes back Bi-directional disaster recovery Initial setup Failover when clusterA is down Failback when clusterA comes back Perform update or delete by query Autoscaling In ECE and ECH In ECK Autoscaling deciders Trained model autoscaling Security Secure your orchestrator Elastic Cloud Enterprise Manage security certificates Allow x509 Certificates Signed with SHA-1 Configure the TLS version Migrate ECE on Podman hosts to SELinux enforce Elastic Cloud on Kubernetes Secure your cluster or deployment Self-managed security setup Automatic security setup Minimal security setup Set up transport TLS Set up HTTPS Configure security in Kibana Manage TLS encryption Self-managed Update TLS certificates With the same CA With a different CA Mutual authentication Supported SSL/TLS versions by JDK version Enabling cipher suites for stronger encryption ECK Manage HTTP certificates on ECK Manage transport certificates on ECK Traffic filtering IP traffic filtering In ECH or ECE Manage traffic filters through the API In ECK and Self Managed Private link traffic filters AWS PrivateLink traffic filters Azure Private Link traffic filters GCP Private Service Connect traffic filters Claim traffic filter link ID ownership through the API Kubernetes network policies Elastic Cloud Static IPs Kibana session management Encrypt your deployment data Use a customer-managed encryption key Secure your settings Secure settings on ECK Secure Kibana saved objects Security event audit logging Enable audit logging Configure audit logging Elasticsearch audit events ignore policies Elasticsearch logfile output Audit Elasticsearch search queries Correlate audit events FIPS 140-2 compliance Secure other Elastic Stack components Securing HTTP client applications 
Limitations Users and roles Cloud organization Manage users User roles and privileges Configure SAML SSO Okta Microsoft Entra ID ECE orchestrator Manage system passwords Manage users and roles Native users Active Directory LDAP SAML Configure SSO for deployments Serverless project custom roles Cluster or deployment Quickstart User authentication Authentication realms Realm chains Security domains Internal authentication Native File-based External authentication Active Directory JWT Kerberos LDAP OpenID Connect With Azure, Google, or Okta SAML With Microsoft Entra ID PKI Custom realms Built-in users Change passwords Orchestrator-managed users ECH and ECE ECK managed credentials Kibana authentication Kibana access agreement Anonymous access Token-based authentication services Service accounts Internal users Operator privileges Configure operator privileges Operator-only functionality Operator privileges for snapshot and restore User profiles Looking up users without authentication Controlling the user cache Manage authentication for multiple clusters User roles Built-in roles Defining roles Role structure For data streams and aliases Using Kibana Role restriction Elasticsearch privileges Kibana privileges Map users and groups to roles Role mapping properties Authorization delegation Authorization plugins Control access at the document and field level Submit requests on behalf of other users Spaces API keys Elasticsearch API keys Serverless project API keys Elastic Cloud API keys Elastic Cloud Enterprise API keys Connectors Remote clusters Elastic Cloud Hosted Within the same Elastic Cloud organization With a different Elastic Cloud organization With Elastic Cloud Enterprise With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Elastic Cloud Enterprise Within the same ECE environment With a different ECE environment With Elastic Cloud With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Self-managed Elastic Stack Add remote clusters using API key authentication Add remote clusters using TLS certificate authentication Migrate from certificate to API key authentication Remote cluster settings Elastic Cloud on Kubernetes Monitoring AutoOps How to access AutoOps AutoOps events Views Overview Deployment Nodes Indices Shards Template Optimizer Notifications Settings Event Settings Dismiss Events AutoOps regions AutoOps FAQ Stack monitoring Enable on ECH and ECE Enable on ECK Self-managed: Elasticsearch Collecting monitoring data with Elastic Agent Collecting monitoring data with Metricbeat Collecting log data with Filebeat Monitoring in a production environment Legacy collection methods Collectors Exporters Local exporters HTTP exporters Pausing data collection Self-managed: Kibana Collect monitoring data with Elastic Agent Collect monitoring data with Metricbeat Legacy collection methods Access monitoring data in Kibana Visualizing monitoring data Beats metrics Elasticsearch metrics Kibana metrics Logstash metrics Troubleshooting Stack monitoring alerts Configuring monitoring data streams and indices Configuring data streams created by Elastic Agent Configuring data streams created by Metricbeat 8 Configuring indices created by Metricbeat 7 or internal collection Cloud deployment health Performance metrics on Elastic Cloud JVM memory pressure indicator Kibana task manager monitoring Monitoring orchestrators ECK operator metrics Enabling 
the metrics endpoint Securing the metrics endpoint Prometheus requirements ECE platform monitoring Platform monitoring deployment logs and metrics Proxy log fields Set the retention period for logging and metrics indices Logging Elasticsearch log4j configuration Update Elasticsearch logging levels Elasticsearch deprecation logs Kibana logging Set global log levels for Kibana Advanced Kibana logging settings Examples Configure Kibana reporting Manage your Cloud organization Billing Hosted billing dimensions Serverless billing dimensions Elasticsearch Elastic for Observability Elastic for Security Billing models Add your billing details View your billing history Manage your subscription Monitor and analyze usage Elastic Consumption Units Billing FAQ Operational emails Update billing and operational contacts Service status Tools and APIs Licenses and subscriptions Elastic Cloud Enterprise Elastic Cloud on Kubernetes Self-managed cluster Maintenance ECE maintenance Deployments maintenance Pause instance Maintenance activities Enable maintenance mode Scale out your installation Move nodes or instances from allocators Perform ECE hosts maintenance Delete ECE hosts Start and stop services Start and stop Elasticsearch Start and stop Kibana Restart an Elastic Cloud Hosted deployment Restart an ECE deployment Full Cluster restart and rolling restart procedures Start and stop routing requests Add and Remove Elasticsearch nodes Upgrade Plan your upgrade Upgrade your ECE or ECK orchestrator Upgrade Elastic Cloud Enterprise Re-running the ECE upgrade Upgrade Elastic Cloud on Kubernetes Prepare to upgrade Upgrade Assistant Upgrade your deployment or cluster Upgrade on Elastic Cloud Hosted Upgrade on Elastic Cloud Enterprise Upgrade on Elastic Cloud on Kubernetes Upgrade Elastic on a self-managed cluster Upgrade Elasticsearch Archived settings Reading indices from older Elasticsearch versions Upgrade Kibana Saved object migrations Roll back to a previous version Upgrade to Enterprise Search Upgrade your ingest components Uninstall Uninstall Elastic Cloud Enterprise Uninstall Elastic Cloud on Kubernetes Delete an orchestrated deployment Loading Docs / Deploy and manage / … / Manage TLS encryption / Self-managed / Supported SSL/TLS versions by JDK version Self Managed Elasticsearch relies on your JDK’s implementation of SSL and TLS. Different JDK versions support different versions of SSL, and this may affect how Elasticsearch operates. Note This support applies when running on the default JSSE provider in the JDK. JVMs that are configured to use a FIPS 140-2 security provider might have a custom TLS implementation, which might support TLS protocol versions that differ from this list. Check your security provider’s release notes for information on TLS support. SSLv3 SSL v3 is supported on all Elasticsearch compatible JDKs but is disabled by default. See Enabling additional SSL/TLS versions on your JDK . TLSv1 TLS v1.0 is supported on all Elasticsearch compatible JDKs but is disabled by default. See Enabling additional SSL/TLS versions on your JDK . TLSv1.1 TLS v1.1 is supported on all Elasticsearch compatible JDKs but is disabled by default. See Enabling additional SSL/TLS versions on your JDK . TLSv1.2 TLS v1.2 is supported on all Elasticsearch compatible JDKs . It is enabled by default on all JDKs that are supported by Elasticsearch, including the bundled JDK. TLSv1.3 TLS v1.3 is supported on all Elasticsearch compatible JDKs . 
It is enabled by default on all JDKs that are supported by Elasticsearch, including the bundled JDK. Enabling additional SSL/TLS versions on your JDK The set of supported SSL/TLS versions for a JDK is controlled by a java security properties file that is installed as part of your JDK. This configuration file lists the SSL/TLS algorithms that are disabled in that JDK. Complete these steps to remove a TLS version from that list and use it in your JDK. Locate the configuration file for your JDK. Copy the jdk.tls.disabledAlgorithms setting from that file, and add it to a custom configuration file within the Elasticsearch configuration directory. In the custom configuration file, remove the value for the TLS version you want to use from jdk.tls.disabledAlgorithms . Configure Elasticsearch to pass a custom system property to the JDK so that your custom configuration file is used. Locate the configuration file for your JDK For the Elasticsearch bundled JDK , the configuration file is in a sub directory of the Elasticsearch home directory ( $ES_HOME ): Linux: $ES_HOME/jdk/conf/security/java.security Windows: $ES_HOME/jdk/conf/security/java.security macOS: $ES_HOME/jdk.app/Contents/Home/conf/security/java.security For JDK11 or later , the configuration file is within the conf/security directory of the Java installation. If $JAVA_HOME points to the home directory of the JDK that you use to run Elasticsearch, then the configuration file will be in: $JAVA_HOME/conf/security/java.security Copy the disabledAlgorithms setting Within the JDK configuration file is a line that starts with jdk.tls.disabledAlgorithms= . This setting controls which protocols and algorithms are disabled in your JDK. The value of that setting will typically span multiple lines. For example, in OpenJDK 21 the setting is: jdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, DTLSv1.0, RC4, DES, \\ MD5withRSA, DH keySize < 1024, EC keySize < 224, 3DES_EDE_CBC, anon, NULL, \\ ECDH Create a new file in your in your Elasticsearch configuration directory named es.java.security . Copy the jdk.tls.disabledAlgorithms setting from the JDK’s default configuration file into es.java.security . You do not need to copy any other settings. Enable required TLS versions Edit the es.java.security file in your Elasticsearch configuration directory, and modify the jdk.tls.disabledAlgorithms setting so that any SSL or TLS versions that you wish to use are no longer listed. For example, to enable TLSv1.1 on OpenJDK 21 (which uses the jdk.tls.disabledAlgorithms settings shown previously), the es.java.security file would contain the previously disabled TLS algorithms except TLSv1.1 : jdk.tls.disabledAlgorithms=SSLv3, TLSv1, DTLSv1.0, RC4, DES, \\ MD5withRSA, DH keySize < 1024, EC keySize < 224, 3DES_EDE_CBC, anon, NULL, \\ ECDH Enable your custom security configuration To enable your custom security policy, add a file in the jvm.options.d directory within your Elasticsearch configuration directory. To enable your custom security policy, create a file named java.security.options within the jvm.options.d directory of your Elasticsearch configuration directory, with this content: -Djava.security.properties=/path/to/your/es.java.security Enabling TLS versions in Elasticsearch SSL/TLS versions can be enabled and disabled within Elasticsearch via the ssl.supported_protocols settings . Elasticsearch will only support the TLS versions that are enabled by the underlying JDK. 
If you configure ssl.supported_protocols to include a TLS version that is not enabled in your JDK, then it will be silently ignored. Similarly, a TLS version that is enabled in your JDK, will not be used unless it is configured as one of the ssl.supported_protocols in Elasticsearch. Previous Mutual authentication Next Enabling cipher suites for stronger encryption Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Enabling additional SSL/TLS versions on your JDK Locate the configuration file for your JDK Copy the disabledAlgorithms setting Enable required TLS versions Enable your custom security configuration Enabling TLS versions in Elasticsearch Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .", + "title": "Supported SSL/TLS versions by JDK version | Elastic Docs", + "url": "https://www.elastic.co/docs/deploy-manage/security/supported-ssltls-versions-by-jdk-version", + "meta_description": "Elasticsearch relies on your JDK’s implementation of SSL and TLS. Different JDK versions support different versions of SSL, and this may affect how Elasticsearch..." + }, + { + "text": "Docs Release notes Troubleshoot Reference Explore and analyze Get started Solutions and use cases Manage data Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Querying and filtering Query languages Query DSL ES|QL Get started Interfaces ES|QL _query API Kibana Elastic Security Query multiple sources Query multiple indices Query across clusters Examples Task management SQL Overview Getting started Conventions Security SQL REST API Overview Response Data Formats Paginating through a large response Filtering using Elasticsearch Query DSL Columnar results Passing parameters to a query Use runtime fields Run an async SQL search SQL Translate API SQL CLI SQL JDBC API usage SQL ODBC Driver installation Configuration SQL Client Applications DBeaver DbVisualizer Microsoft Excel Microsoft Power BI Desktop Microsoft PowerShell MicroStrategy Desktop Qlik Sense Desktop SQuirreL SQL SQL Workbench/J Tableau Desktop Tableau Server SQL Language Lexical Structure SQL Commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data Types Index patterns Frozen Indices Functions and Operators Comparison Operators Logical Operators Math Operators Cast Operators LIKE and RLIKE Operators Aggregate Functions Grouping Functions Date/Time and Interval Functions and Operators Full-Text Search Functions Mathematical Functions String Functions Type Conversion Functions Geo Functions Conditional Functions And Expressions System Functions Reserved keywords SQL Limitations EQL Example: Detect threats with EQL KQL Lucene query syntax Query tools Saved queries Console Search profiler Grok debugger Playground Aggregations Basics Filtering in Kibana Geospatial analysis Transforming data Overview Setup When to use transforms Generating alerts for transforms Transforms at scale How checkpoints work API quick reference Tutorial: Transforming 
the eCommerce sample data Examples Painless examples Limitations Elastic Inference Inference integrations Machine learning Setup and security Anomaly detection Finding anomalies Plan your analysis Run a job View the results Forecast future behavior Tutorial Advanced concepts Anomaly detection algorithms Anomaly score explanation Job types Working with anomaly detection at scale Handling delayed data API quick reference How-tos Generating alerts for anomaly detection jobs Aggregating data for faster performance Altering data in your datafeed with runtime fields Customizing detectors with custom rules Detecting anomalous categories of data Performing population analysis Reverting to a model snapshot Detecting anomalous locations in geographic data Mapping anomalies by location Adding custom URLs to machine learning results Anomaly detection jobs from visualizations Exporting and importing machine learning jobs Resources Limitations Function reference Supplied configurations Troubleshooting and FAQ Data frame analytics Overview Finding outliers Predicting numerical values with regression Predicting classes with classification Advanced concepts How data frame analytics analytics jobs work Working with data frame analytics at scale Adding custom URLs to data frame analytics jobs Feature encoding Feature processors Feature importance Loss functions for regression analyses Hyperparameter optimization Trained models API quick reference Resources Limitations NLP Overview Extract information Classify text Search and compare text Deploy trained models Select a trained model Import the trained model and vocabulary Deploy the model in your cluster Try it out Add NLP inference to ingest pipelines API quick reference Built-in NLP models ELSER Elastic Rerank E5 Language identification Compatible third party models Examples End-to-end tutorial Named entity recognition Text embedding and semantic search Limitations ML in Kibana Anomaly detection Data frame analytics AIOps Labs Inference processing Scripting Painless scripting language How to write scripts Scripts, caching, and search speed Dissecting data Grokking grok Access fields in a document Common scripting use cases Field extraction Accessing document fields and special variables Scripting and security Lucene expressions language Advanced scripts using script engines Painless lab AI assistant Discover Explore fields and data with Discover Customize the Discover view Search for relevance Save a search for reuse View field statistics Run a pattern analysis on your log data Run a search session in the background Using ES|QL Dashboards Exploring dashboards Building dashboards Create a dashboard Edit a dashboard Add filter controls Add drilldowns Organize dashboard panels Duplicate a dashboard Import a dashboard Managing dashboards Sharing dashboards Tutorials Create a simple dashboard to monitor website logs Create a dashboard with time series charts Panels and visualizations Supported chart types Visualize Library Manage panels Lens ES|QL Field statistics Custom visualizations with Vega Text panels Image panels Link panels Canvas Edit workpads Present your workpad Tutorial: Create a workpad for monitoring sales Canvas function reference TinyMath functions Maps Build a map to compare metrics by country or region Track, visualize, and alert on assets in real time Map custom regions with reverse geocoding Heat map layer Tile layer Vector layer Vector styling Vector style properties Vector tooltips Plot big data Clusters Display the most relevant documents 
per entity Point to point Term join Search geographic data Create filters from a map Filter a single layer Search across multiple indices Configure map settings Connect to Elastic Maps Service Import geospatial data Clean your data Tutorial: Index GeoJSON data Troubleshoot Graph Configure Graph Troubleshooting and limitations Legacy editors Aggregation-based TSVB Timelion Find and organize content Data views Saved objects Files Reports Tags Find apps and objects Reporting and sharing Automatically generate reports Troubleshooting CSV PDF/PNG Alerts and cases Alerts Getting started with alerts Set up Create and manage rules View alerts Rule types Index threshold Elasticsearch query Tracking containment Rule action variables Notifications domain allowlist Troubleshooting and limitations Common Issues Event log index Test connectors Maintenance windows Watcher Getting started with Watcher How Watcher works Enable Watcher Watcher UI Encrypting sensitive data in Watcher Inputs Simple input Search input HTTP input Chain input Triggers Schedule trigger Throttling Schedule Types Conditions Always condition Never condition Compare condition Array compare condition Script condition Actions Running an action for each element in an array Adding conditions to actions Email action Webhook action Index action Logging action Slack action PagerDuty action Jira action Transforms Search payload transform Script payload transform Chain payload transform Managing watches Example watches Watching the status of an Elasticsearch cluster Limitations Cases Configure access to cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / SQL / Functions and Operators / String Functions Elastic Stack Serverless Functions for performing string manipulation. ASCII ASCII(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the ASCII code value of the leftmost character of string_exp as an integer. SELECT ASCII('Elastic'); ASCII('Elastic') ---------------- 69 BIT_LENGTH BIT_LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the length in bits of the string_exp input expression. SELECT BIT_LENGTH('Elastic'); BIT_LENGTH('Elastic') --------------------- 56 CHAR CHAR(code) Input : integer expression between 0 and 255 . If null , negative, or greater than 255 , the function returns null . Output : string Description : Returns the character that has the ASCII code value specified by the numeric input. SELECT CHAR(69); CHAR(69) --------------- E CHAR_LENGTH CHAR_LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the length in characters of the input, if the string expression is of a character data type; otherwise, returns the length in bytes of the string expression (the smallest integer not less than the number of bits divided by 8). SELECT CHAR_LENGTH('Elastic'); CHAR_LENGTH('Elastic') ---------------------- 7 CONCAT CONCAT( string_exp1, string_exp2) Input : string expression. Treats null as an empty string. string expression. Treats null as an empty string. Output : string Description : Returns a character string that is the result of concatenating string_exp1 to string_exp2 . The resulting string cannot exceed a byte length of 1 MB. 
SELECT CONCAT('Elasticsearch', ' SQL'); CONCAT('Elasticsearch', ' SQL') ------------------------------- Elasticsearch SQL INSERT INSERT( source, start, length, replacement) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . integer expression. If null , the function returns null . string expression. If null , the function returns null . Output : string Description : Returns a string where length characters have been deleted from source , beginning at start , and where replacement has been inserted into source , beginning at start . The resulting string cannot exceed a byte length of 1 MB. SELECT INSERT('Elastic ', 8, 1, 'search'); INSERT('Elastic ', 8, 1, 'search') ---------------------------------- Elasticsearch LCASE LCASE(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns a string equal to that in string_exp , with all uppercase characters converted to lowercase. SELECT LCASE('Elastic'); LCASE('Elastic') ---------------- elastic LEFT LEFT( string_exp, count) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . If 0 or negative, the function returns an empty string. Output : string Description : Returns the leftmost count characters of string_exp . SELECT LEFT('Elastic',3); LEFT('Elastic',3) ----------------- Ela LENGTH LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the number of characters in string_exp , excluding trailing blanks. SELECT LENGTH('Elastic '); LENGTH('Elastic ') -------------------- 7 LOCATE LOCATE( pattern, source [, start]<3> ) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . integer expression; optional. If null , 0 , 1 , negative, or not specified, the search starts at the first character position. Output : integer Description : Returns the starting position of the first occurrence of pattern within source . The optional start specifies the character position to start the search with. If the pattern is not found within source , the function returns 0 . SELECT LOCATE('a', 'Elasticsearch'); LOCATE('a', 'Elasticsearch') ---------------------------- 3 SELECT LOCATE('a', 'Elasticsearch', 5); LOCATE('a', 'Elasticsearch', 5) ------------------------------- 10 LTRIM LTRIM(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns the characters of string_exp , with leading blanks removed. SELECT LTRIM(' Elastic'); LTRIM(' Elastic') ------------------- Elastic OCTET_LENGTH OCTET_LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the length in bytes of the string_exp input expression. SELECT OCTET_LENGTH('Elastic'); OCTET_LENGTH('Elastic') ----------------------- 7 POSITION POSITION( string_exp1, string_exp2) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . Output : integer Description : Returns the position of the string_exp1 in string_exp2 . The result is an exact numeric. SELECT POSITION('Elastic', 'Elasticsearch'); POSITION('Elastic', 'Elasticsearch') ------------------------------------ 1 REPEAT REPEAT( string_exp, count) Input : string expression. If null , the function returns null . integer expression. 
If 0 , negative, or null , the function returns null . Output : string Description : Returns a character string composed of string_exp repeated count times. The resulting string cannot exceed a byte length of 1 MB. SELECT REPEAT('La', 3); REPEAT('La', 3) ---------------- LaLaLa REPLACE REPLACE( source, pattern, replacement) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . string expression. If null , the function returns null . Output : string Description : Search source for occurrences of pattern , and replace with replacement . The resulting string cannot exceed a byte length of 1 MB. SELECT REPLACE('Elastic','El','Fant'); REPLACE('Elastic','El','Fant') ------------------------------ Fantastic RIGHT RIGHT( string_exp, count) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . If 0 or negative, the function returns an empty string. Output : string Description : Returns the rightmost count characters of string_exp . SELECT RIGHT('Elastic',3); RIGHT('Elastic',3) ------------------ tic RTRIM RTRIM(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns the characters of string_exp with trailing blanks removed. SELECT RTRIM('Elastic '); RTRIM('Elastic ') ------------------- Elastic SPACE SPACE(count) Input : integer expression. If null or negative, the function returns null . Output : string Description : Returns a character string consisting of count spaces. The resulting string cannot exceed a byte length of 1 MB. SELECT SPACE(3); SPACE(3) --------------- STARTS_WITH STARTS_WITH( source, pattern) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . Output : boolean value Description : Returns true if the source expression starts with the specified pattern, false otherwise. The matching is case sensitive. SELECT STARTS_WITH('Elasticsearch', 'Elastic'); STARTS_WITH('Elasticsearch', 'Elastic') -------------------------------- true SELECT STARTS_WITH('Elasticsearch', 'ELASTIC'); STARTS_WITH('Elasticsearch', 'ELASTIC') -------------------------------- false SUBSTRING SUBSTRING( source, start, length) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . integer expression. If null , the function returns null . Output : string Description : Returns a character string that is derived from source , beginning at the character position specified by start for length characters. SELECT SUBSTRING('Elasticsearch', 0, 7); SUBSTRING('Elasticsearch', 0, 7) -------------------------------- Elastic TRIM TRIM(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns the characters of string_exp , with leading and trailing blanks removed. SELECT TRIM(' Elastic ') AS trimmed; trimmed -------------- Elastic UCASE UCASE(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns a string equal to that of the input, with all lowercase characters converted to uppercase. 
SELECT UCASE('Elastic'); UCASE('Elastic') ---------------- ELASTIC Previous Mathematical Functions Next Type Conversion Functions Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page ASCII BIT_LENGTH CHAR CHAR_LENGTH CONCAT INSERT LCASE LEFT LENGTH LOCATE LTRIM OCTET_LENGTH POSITION REPEAT REPLACE RIGHT RTRIM SPACE STARTS_WITH SUBSTRING TRIM UCASE Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .", + "title": "String Functions | Elastic Docs", + "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-functions-string", + "meta_description": "Functions for performing string manipulation. Output: integer Description: Returns the ASCII code value of the leftmost character of string_exp as an..." + }, + { + "text": "Docs Release notes Troubleshoot Reference Reference Get started Solutions and use cases Manage data Explore and analyze Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Security Fields and object schemas Elastic Security ECS field reference Timeline schema Alert schema Endpoint command reference Detection Rules Overview Observability Fields and object schemas Elasticsearch and index management Configuration Circuit breaker settings Auditing settings Enrich settings Cluster-level shard allocation and routing settings Miscellaneous cluster settings Cross-cluster replication settings Discovery and cluster formation settings Field data cache settings Health Diagnostic settings Index lifecycle management settings Data stream lifecycle settings Index management settings Index recovery settings Indexing buffer settings License settings Local gateway Machine learning settings Inference settings Monitoring settings Node settings Networking settings Node query cache settings Search settings Security settings Shard request cache Snapshot and restore settings Transforms settings Thread pool settings Watcher settings JVM settings Roles Elasticsearch privileges Index settings Data tier allocation General History retention Index block Index recovery prioritization Indexing pressure Mapping limit Merge Path Shard allocation Total shards per node Similarity Slow log Sorting Use index sorting to speed up conjunctions Store Preloading data into the file system cache Time series Translog Index lifecycle actions Allocate Delete Force merge Migrate Read only Rollover Downsample Searchable snapshot Set priority Shrink Unfollow Wait for snapshot REST APIs API conventions Common options Compatibility API examples The refresh parameter Optimistic concurrency control Sort search results Paginate search results Retrieve selected fields Search multiple data streams and indices Collapse search results Filter search results Highlighting Retrieve inner hits Search shard routing Searching with query rules Reciprocal rank fusion Retrievers Reindex data stream Create index from source The shard request cache Suggesters Profile search requests Ranking evaluation Mapping Document metadata fields _doc_count field 
_field_names field _ignored field _id field _index field _meta field _routing field _source field _tier field Field data types Aggregate metric Alias Arrays Binary Boolean Completion Date Date nanoseconds Dense vector Flattened Geopoint Geoshape Histogram IP Join Keyword Nested Numeric Object Pass-through object Percolator Point Range Rank feature Rank features Rank Vectors Search-as-you-type Semantic text Shape Sparse vector Text Token count Unsigned long Version Mapping parameters analyzer coerce copy_to doc_values dynamic eager_global_ordinals enabled format ignore_above index.mapping.ignore_above ignore_malformed index index_options index_phrases index_prefixes meta fields normalizer norms null_value position_increment_gap properties search_analyzer similarity store subobjects term_vector Elasticsearch audit events Command line tools elasticsearch-certgen elasticsearch-certutil elasticsearch-create-enrollment-token elasticsearch-croneval elasticsearch-keystore elasticsearch-node elasticsearch-reconfigure-node elasticsearch-reset-password elasticsearch-saml-metadata elasticsearch-service-tokens elasticsearch-setup-passwords elasticsearch-shard elasticsearch-syskeygen elasticsearch-users Curator Curator and index lifecycle management ILM Actions ILM or Curator? ILM and Curator! About Origin Features Command-Line Interface (CLI) Application Program Interface (API) License Site Corrections Contributing Installation pip Installation from source Docker Running Curator Command Line Interface Singleton Command Line Interface Exit Codes Configuration Environment Variables Action File Configuration File Actions Alias Allocation Close Cluster Routing Cold2Frozen Create Index Delete Indices Delete Snapshots Forcemerge Index Settings Open Reindex Replicas Restore Rollover Shrink Snapshot Options allocation_type allow_ilm_indices continue_if_exception copy_aliases count delay delete_after delete_aliases skip_flush disable_action extra_settings ignore_empty_list ignore_unavailable include_aliases include_global_state indices key max_age max_docs max_size max_num_segments max_wait migration_prefix migration_suffix name new_index node_filters number_of_replicas number_of_shards partial post_allocation preserve_existing refresh remote_certificate remote_client_cert remote_client_key remote_filters remote_url_prefix rename_pattern rename_replacement repository requests_per_second request_body retry_count retry_interval routing_type search_pattern setting shrink_node shrink_prefix shrink_suffix slices skip_repo_fs_check timeout timeout_override value wait_for_active_shards wait_for_completion wait_for_rebalance wait_interval warn_if_no_indices Filters filtertype age alias allocated closed count empty forcemerged kibana none opened pattern period space state Filter Elements aliases allocation_type count date_from date_from_format date_to date_to_format direction disk_space epoch exclude field intersect key kind max_num_segments pattern period_type range_from range_to reverse source state stats_result timestring threshold_behavior unit unit_count unit_count_pattern use_age value week_starts_on Examples alias allocation close cluster_routing create_index delete_indices delete_snapshots forcemerge index_settings open reindex replicas restore rollover shrink snapshot Frequently Asked Questions Q: How can I report an error in the documentation? Q: Can I delete only certain data from within indices? Q: Can Curator handle index names with strange characters? 
Clients Eland Installation Data Frames Machine Learning Go Getting started Installation Connecting Typed API Getting started with the API Conventions Running queries Using ES|QL Examples Java Getting started Setup Installation Connecting Using OpenTelemetry API conventions Package structure and namespace clients Method naming conventions Blocking and asynchronous clients Building API objects Lists and maps Variant types Object life cycles and thread safety Creating API objects from JSON data Exceptions Using the Java API client Indexing single documents Bulk: indexing multiple documents Reading documents by id Searching for documents Aggregations ES|QL in the Java client Troubleshooting Missing required property NoSuchMethodError: removeHeader IOReactor errors Serializing without typed keys Could not resolve dependencies NoClassDefFoundError: LogFactory Transport layer REST 5 Client Getting started Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Legacy REST Client Getting started Javadoc Maven Repository Dependencies Shading Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Javadoc Maven Repository Usage Javadoc and source code External resources Breaking changes policy Release highlights License JavaScript Getting started Installation Connecting Configuration Basic configuration Advanced configuration Creating a child client Testing Integrations Observability Transport TypeScript support API Reference Examples asStream Bulk Exists Get Ignore MSearch Scroll Search Suggest transport.request SQL Update Update By Query Reindex Client helpers Timeout best practices .NET Getting started Installation Connecting Configuration Options on ElasticsearchClientSettings Client concepts Serialization Source serialization Using the .NET Client Aggregation examples Using ES|QL CRUD usage examples Custom mapping examples Query examples Usage recommendations Low level Transport example Troubleshoot Logging Logging with OnRequestCompleted Logging with Fiddler Debugging Audit trail Debug information Debug mode PHP Getting started Installation Connecting Configuration Dealing with JSON arrays and objects in PHP Host Configuration Set retries HTTP Meta Data Enabling the Logger Configure the HTTP client Namespaces Node Pool Operations Index management operations Search operations Indexing documents Getting documents Updating documents Deleting documents Client helpers Iterators ES|QL Python Getting started Installation Connecting Configuration Querying Using with asyncio Integrations Using OpenTelemetry ES|QL and Pandas Examples Elasticsearch Python DSL Configuration Tutorials How-To Guides Examples Migrating from the elasticsearch-dsl package Client helpers Ruby Getting started Installation Connecting Configuration Basic configuration Advanced configuration Integrations Transport Elasticsearch API Using OpenTelemetry Elastic Common Schema (ECS) ActiveModel / ActiveRecord Ruby On Rails Persistence Elasticsearch DSL Examples Client helpers Bulk and Scroll helpers ES|QL Troubleshoot Rust Installation Community-contributed clients Elastic Distributions of OpenTelemetry (EDOT) Quickstart Self-managed Kubernetes Hosts / VMs Docker Elastic Cloud Serverless Kubernetes Hosts and VMs Docker Elastic 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin 
Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
Elasticsearch API compatibility. To help REST clients mitigate the impact of non-compatible (breaking) API changes, Elasticsearch provides a per-request, opt-in API compatibility mode. Elasticsearch REST APIs are generally stable across versions. However, some improvements require changes that are not compatible with previous versions. When an API is targeted for removal or is going to be changed in a non-compatible way, the original API is deprecated for one or more releases. Using the original API triggers a deprecation warning in the logs. This enables you to review the deprecation logs and take the appropriate actions before upgrading. However, in some cases it is difficult to identify all places where deprecated APIs are being used. This is where REST API compatibility can help. When you request REST API compatibility, Elasticsearch attempts to honor the previous REST API version. Elasticsearch attempts to apply the most compatible URL, request body, response body, and HTTP parameters. 
For compatible APIs, this has no effect; it only impacts calls to APIs that have breaking changes from the previous version. An error can still be returned in compatibility mode if Elasticsearch cannot automatically resolve the incompatibilities. Important: REST API compatibility does not guarantee the same behavior as the prior version. It instructs Elasticsearch to automatically resolve any incompatibilities so the request can be processed instead of returning an error. REST API compatibility should be a bridge to smooth out the upgrade process, not a long-term strategy. REST API compatibility is only honored across one major version: a 9.x server honors 8.x requests and responses. When you submit requests using REST API compatibility and Elasticsearch resolves the incompatibility, a message is written to the deprecation log with the category \"compatible_api\". Review the deprecation log to identify any gaps between your current usage and fully supported features. Requesting REST API compatibility: REST API compatibility is implemented per request via the Accept and/or Content-Type headers. For example: Accept: \"application/vnd.elasticsearch+json;compatible-with=8\" Content-Type: \"application/vnd.elasticsearch+json;compatible-with=8\" The Accept header is always required, and the Content-Type header is only required when a body is sent with the request. The following values are valid when communicating with an 8.x or 9.x Elasticsearch server: \"application/vnd.elasticsearch+json;compatible-with=8\" \"application/vnd.elasticsearch+yaml;compatible-with=8\" \"application/vnd.elasticsearch+smile;compatible-with=8\" \"application/vnd.elasticsearch+cbor;compatible-with=8\" The officially supported Elasticsearch clients can enable REST API compatibility for all requests: set the environment variable ELASTIC_CLIENT_APIVERSIONING to true on the client. REST API compatibility workflow: To leverage REST API compatibility during an upgrade from the last 8.x release to 9.0.0: (1) Upgrade your Elasticsearch clients to the latest 8.x version and enable REST API compatibility. (2) Use the Upgrade Assistant to review all critical issues and explore the deprecation logs. Some critical issues might be mitigated by REST API compatibility. Resolve all critical issues before proceeding with the upgrade. (3) Upgrade Elasticsearch to 9.0.0. (4) Review the deprecation logs for entries with the category compatible_api. (5) Review the workflow associated with the requests that relied on compatibility mode. (6) Upgrade your Elasticsearch clients to 9.x and resolve compatibility issues manually where needed. 
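The compatibility headers above can be exercised with any HTTP client. The following is a minimal, illustrative sketch rather than an official example: it assumes a local Elasticsearch node reachable over plain HTTP at localhost:9200 without authentication, the Python requests library, and a hypothetical index named my-index; it sends a search request while asking the server to honor the 8.x request/response format.
import requests
# Media type from the compatibility documentation above; compatible-with=8
# asks a 9.x server to honor the 8.x request/response format.
COMPAT_JSON = 'application/vnd.elasticsearch+json;compatible-with=8'
response = requests.post(
    'http://localhost:9200/my-index/_search',  # 'my-index' is a hypothetical index
    headers={
        'Accept': COMPAT_JSON,        # always required for compatibility mode
        'Content-Type': COMPAT_JSON,  # required here because a request body is sent
    },
    json={'query': {'match_all': {}}},
)
print(response.status_code)
print(response.json())
Officially supported clients can instead set ELASTIC_CLIENT_APIVERSIONING=true, as noted above, rather than adding these headers by hand. 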
To view previous versions, go to elastic.co/guide .", + "title": "Elasticsearch API compatibility | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/rest-apis/compatibility", + "meta_description": "To help REST clients mitigate the impact of non-compatible (breaking) API changes, Elasticsearch provides a per-request, opt-in API compatibility mode..." + }, + { + "text": "Docs Release notes Troubleshoot Reference Reference Get started Solutions and use cases Manage data Explore and analyze Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Security Fields and object schemas Elastic Security ECS field reference Timeline schema Alert schema Endpoint command reference Detection Rules Overview Observability Fields and object schemas Elasticsearch and index management Configuration Circuit breaker settings Auditing settings Enrich settings Cluster-level shard allocation and routing settings Miscellaneous cluster settings Cross-cluster replication settings Discovery and cluster formation settings Field data cache settings Health Diagnostic settings Index lifecycle management settings Data stream lifecycle settings Index management settings Index recovery settings Indexing buffer settings License settings Local gateway Machine learning settings Inference settings Monitoring settings Node settings Networking settings Node query cache settings Search settings Security settings Shard request cache Snapshot and restore settings Transforms settings Thread pool settings Watcher settings JVM settings Roles Elasticsearch privileges Index settings Data tier allocation General History retention Index block Index recovery prioritization Indexing pressure Mapping limit Merge Path Shard allocation Total shards per node Similarity Slow log Sorting Use index sorting to speed up conjunctions Store Preloading data into the file system cache Time series Translog Index lifecycle actions Allocate Delete Force merge Migrate Read only Rollover Downsample Searchable snapshot Set priority Shrink Unfollow Wait for snapshot REST APIs API conventions Common options Compatibility API examples The refresh parameter Optimistic concurrency control Sort search results Paginate search results Retrieve selected fields Search multiple data streams and indices Collapse search results Filter search results Highlighting Retrieve inner hits Search shard routing Searching with query rules Reciprocal rank fusion Retrievers Reindex data stream Create index from source The shard request cache Suggesters Profile search requests Ranking evaluation Mapping Document metadata fields _doc_count field _field_names field _ignored field _id field _index field _meta field _routing field _source field _tier field Field data types Aggregate metric Alias Arrays Binary Boolean Completion Date Date nanoseconds Dense vector Flattened Geopoint Geoshape Histogram IP Join Keyword Nested Numeric Object Pass-through object Percolator Point Range Rank feature Rank features Rank Vectors Search-as-you-type Semantic text Shape Sparse vector Text Token count Unsigned long Version Mapping parameters analyzer coerce copy_to doc_values dynamic eager_global_ordinals enabled format ignore_above index.mapping.ignore_above ignore_malformed index index_options index_phrases index_prefixes meta fields normalizer norms null_value position_increment_gap properties search_analyzer similarity store subobjects term_vector Elasticsearch audit events Command line tools 
elasticsearch-certgen elasticsearch-certutil elasticsearch-create-enrollment-token elasticsearch-croneval elasticsearch-keystore elasticsearch-node elasticsearch-reconfigure-node elasticsearch-reset-password elasticsearch-saml-metadata elasticsearch-service-tokens elasticsearch-setup-passwords elasticsearch-shard elasticsearch-syskeygen elasticsearch-users Curator Curator and index lifecycle management ILM Actions ILM or Curator? ILM and Curator! About Origin Features Command-Line Interface (CLI) Application Program Interface (API) License Site Corrections Contributing Installation pip Installation from source Docker Running Curator Command Line Interface Singleton Command Line Interface Exit Codes Configuration Environment Variables Action File Configuration File Actions Alias Allocation Close Cluster Routing Cold2Frozen Create Index Delete Indices Delete Snapshots Forcemerge Index Settings Open Reindex Replicas Restore Rollover Shrink Snapshot Options allocation_type allow_ilm_indices continue_if_exception copy_aliases count delay delete_after delete_aliases skip_flush disable_action extra_settings ignore_empty_list ignore_unavailable include_aliases include_global_state indices key max_age max_docs max_size max_num_segments max_wait migration_prefix migration_suffix name new_index node_filters number_of_replicas number_of_shards partial post_allocation preserve_existing refresh remote_certificate remote_client_cert remote_client_key remote_filters remote_url_prefix rename_pattern rename_replacement repository requests_per_second request_body retry_count retry_interval routing_type search_pattern setting shrink_node shrink_prefix shrink_suffix slices skip_repo_fs_check timeout timeout_override value wait_for_active_shards wait_for_completion wait_for_rebalance wait_interval warn_if_no_indices Filters filtertype age alias allocated closed count empty forcemerged kibana none opened pattern period space state Filter Elements aliases allocation_type count date_from date_from_format date_to date_to_format direction disk_space epoch exclude field intersect key kind max_num_segments pattern period_type range_from range_to reverse source state stats_result timestring threshold_behavior unit unit_count unit_count_pattern use_age value week_starts_on Examples alias allocation close cluster_routing create_index delete_indices delete_snapshots forcemerge index_settings open reindex replicas restore rollover shrink snapshot Frequently Asked Questions Q: How can I report an error in the documentation? Q: Can I delete only certain data from within indices? Q: Can Curator handle index names with strange characters? 
Clients Eland Installation Data Frames Machine Learning Go Getting started Installation Connecting Typed API Getting started with the API Conventions Running queries Using ES|QL Examples Java Getting started Setup Installation Connecting Using OpenTelemetry API conventions Package structure and namespace clients Method naming conventions Blocking and asynchronous clients Building API objects Lists and maps Variant types Object life cycles and thread safety Creating API objects from JSON data Exceptions Using the Java API client Indexing single documents Bulk: indexing multiple documents Reading documents by id Searching for documents Aggregations ES|QL in the Java client Troubleshooting Missing required property NoSuchMethodError: removeHeader IOReactor errors Serializing without typed keys Could not resolve dependencies NoClassDefFoundError: LogFactory Transport layer REST 5 Client Getting started Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Legacy REST Client Getting started Javadoc Maven Repository Dependencies Shading Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Javadoc Maven Repository Usage Javadoc and source code External resources Breaking changes policy Release highlights License JavaScript Getting started Installation Connecting Configuration Basic configuration Advanced configuration Creating a child client Testing Integrations Observability Transport TypeScript support API Reference Examples asStream Bulk Exists Get Ignore MSearch Scroll Search Suggest transport.request SQL Update Update By Query Reindex Client helpers Timeout best practices .NET Getting started Installation Connecting Configuration Options on ElasticsearchClientSettings Client concepts Serialization Source serialization Using the .NET Client Aggregation examples Using ES|QL CRUD usage examples Custom mapping examples Query examples Usage recommendations Low level Transport example Troubleshoot Logging Logging with OnRequestCompleted Logging with Fiddler Debugging Audit trail Debug information Debug mode PHP Getting started Installation Connecting Configuration Dealing with JSON arrays and objects in PHP Host Configuration Set retries HTTP Meta Data Enabling the Logger Configure the HTTP client Namespaces Node Pool Operations Index management operations Search operations Indexing documents Getting documents Updating documents Deleting documents Client helpers Iterators ES|QL Python Getting started Installation Connecting Configuration Querying Using with asyncio Integrations Using OpenTelemetry ES|QL and Pandas Examples Elasticsearch Python DSL Configuration Tutorials How-To Guides Examples Migrating from the elasticsearch-dsl package Client helpers Ruby Getting started Installation Connecting Configuration Basic configuration Advanced configuration Integrations Transport Elasticsearch API Using OpenTelemetry Elastic Common Schema (ECS) ActiveModel / ActiveRecord Ruby On Rails Persistence Elasticsearch DSL Examples Client helpers Bulk and Scroll helpers ES|QL Troubleshoot Rust Installation Community-contributed clients Elastic Distributions of OpenTelemetry (EDOT) Quickstart Self-managed Kubernetes Hosts / VMs Docker Elastic Cloud Serverless Kubernetes Hosts and VMs Docker Elastic 
Cloud Hosted Kubernetes Hosts and VMs Docker Reference Architecture Kubernetes environments Hosts / VMs environments Use cases Kubernetes observability Prerequisites and compatibility Components description Deployment Instrumenting Applications Upgrade Customization LLM observability Compatibility and support Features Collector distributions SDK Distributions Limitations Nomenclature EDOT Collector Download Configuration Default config (Standalone) Default config (Kubernetes) Configure Logs Collection Configure Metrics Collection Customization Components Custom Collector Troubleshooting EDOT SDKs EDOT .NET Setup ASP.NET Console applications .NET worker services Zero-code instrumentation Opinionated defaults Configuration Supported technologies Troubleshooting Migration EDOT Java Setup Kubernetes Setup Runtime attach Setup Configuration Features Supported Technologies Troubleshooting Migration Performance overhead EDOT Node.js Setup Kubernetes Configuration Supported Technologies Metrics Troubleshooting Migration EDOT PHP Setup Limitations Configuration Supported Technologies Troubleshooting Migration Performance overhead EDOT Python Setup Kubernetes Manual Instrumentation Configuration Supported Technologies Troubleshooting Migration Performance Overhead Ingestion tools Fleet and Elastic Agent Restrictions for Elastic Cloud Serverless Migrate from Beats to Elastic Agent Migrate from Auditbeat to Elastic Agent Deployment models What is Fleet Server? Deploy on Elastic Cloud Deploy on-premises and self-managed Deploy Fleet Server on-premises and Elasticsearch on Cloud Deploy Fleet Server on Kubernetes Fleet Server scalability Fleet Server Secrets Secret files guide Monitor a self-managed Fleet Server Install Elastic Agents Install Fleet-managed Elastic Agents Install standalone Elastic Agents Upgrade standalone Elastic Agents Install Elastic Agents in a containerized environment Run Elastic Agent in a container Run Elastic Agent on Kubernetes managed by Fleet Install Elastic Agent on Kubernetes using Helm Example: Install standalone Elastic Agent on Kubernetes using Helm Example: Install Fleet-managed Elastic Agent on Kubernetes using Helm Advanced Elastic Agent configuration managed by Fleet Configuring Kubernetes metadata enrichment on Elastic Agent Run Elastic Agent on GKE managed by Fleet Configure Elastic Agent Add-On on Amazon EKS Run Elastic Agent on Azure AKS managed by Fleet Run Elastic Agent Standalone on Kubernetes Scaling Elastic Agent on Kubernetes Using a custom ingest pipeline with the Kubernetes Integration Environment variables Run Elastic Agent as an EDOT Collector Transform an installed Elastic Agent to run as an EDOT Collector Run Elastic Agent without administrative privileges Install Elastic Agent from an MSI package Installation layout Air-gapped environments Using a proxy server with Elastic Agent and Fleet When to configure proxy settings Proxy Server connectivity using default host variables Fleet managed Elastic Agent connectivity using a proxy server Standalone Elastic Agent connectivity using a proxy server Set the proxy URL of the Elastic Package Registry Uninstall Elastic Agents from edge hosts Start and stop Elastic Agents on edge hosts Elastic Agent configuration encryption Secure connections Configure SSL/TLS for self-managed Fleet Servers Rotate SSL/TLS CA certificates Elastic Agent deployment models with mutual TLS One-way and mutual TLS certifications flow Configure SSL/TLS for the Logstash output Manage Elastic Agents in Fleet Fleet settings Elasticsearch 
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
Wait for snapshot Phases allowed: delete. Waits for the specified SLM policy to be executed before removing the index. This ensures that a snapshot of the deleted index is available. Options policy (Required, string) Name of the SLM policy that the delete action should wait for. Example PUT _ilm/policy/my_policy { \"policy\": { \"phases\": { \"delete\": { \"actions\": { \"wait_for_snapshot\" : { \"policy\": \"slm-policy-name\" } } } } } }
",
+    "title": "Wait for snapshot | Elastic Documentation",
+    "url": "https://www.elastic.co/docs/reference/elasticsearch/index-lifecycle-actions/ilm-wait-for-snapshot",
+    "meta_description": "Phases allowed: delete. Waits for the specified SLM policy to be executed before removing the index. This ensures that a snapshot of the deleted index..."
+  },
+  {
+    "text": "
Qlik Sense Desktop Elastic Stack Serverless You can use the Elasticsearch ODBC driver to access Elasticsearch data from Qlik Sense Desktop. Important Elastic does not endorse, promote or provide support for this application; for native Elasticsearch integration in this product, reach out to its vendor. Prerequisites Qlik Sense Desktop November 2018 or higher Elasticsearch SQL ODBC driver A preconfigured User or System DSN (see Configuration section on how to configure a DSN). Data loading To use the Elasticsearch SQL ODBC Driver to load data into Qlik Sense Desktop perform the following steps in sequence. Create new app Once the application is launched, you’ll first need to click on the Create new app button: Name app …then give it a name, Open app …and then open it: Add data to your app Start configuring the source to load data from in the newly created app: Load from ODBC You’ll be given a choice of sources to select. Click on the ODBC icon: Choose DSN In the Create new connection (ODBC) dialog, click on the DSN name that you have previously configured for your Elasticsearch instance: Provide a username and password in the respective fields, if authentication is enabled on your instance and if these are not already part of the DSN. Press the Create button. Select source table The application will now connect to the Elasticsearch instance and query the catalog information, presenting you with a list of tables that you can load data from: Visualize the data Press on the Add data button and customize your data visualization:",
+    "title": "Qlik Sense Desktop | Elastic Docs",
+    "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-client-apps-qlik",
+    "meta_description": "You can use the Elasticsearch ODBC driver to access Elasticsearch data from Qlik Sense Desktop. Qlik Sense Desktop November 2018 or higher, Elasticsearch..."
+  },
+  {
+    "text": "
metricset ActiveMQ topic metricset Aerospike module Aerospike namespace metricset Airflow module Airflow statsd metricset Apache module Apache status metricset AWS module AWS awshealth metricset AWS billing metricset AWS cloudwatch metricset AWS dynamodb metricset AWS ebs metricset AWS ec2 metricset AWS elb metricset AWS kinesis metricset AWS lambda metricset AWS natgateway metricset AWS rds metricset AWS s3_daily_storage metricset AWS s3_request metricset AWS sns metricset AWS sqs metricset AWS transitgateway metricset AWS usage metricset AWS vpn metricset AWS Fargate module AWS Fargate task_stats metricset Azure module Azure app_insights metricset Azure app_state metricset Azure billing metricset Azure compute_vm metricset Azure compute_vm_scaleset metricset Azure container_instance metricset Azure container_registry metricset Azure container_service metricset Azure database_account metricset Azure monitor metricset Azure storage metricset Beat module Beat state metricset Beat stats metricset Benchmark module Benchmark info metricset Ceph module Ceph cluster_disk metricset Ceph cluster_health metricset Ceph cluster_status metricset Ceph mgr_cluster_disk metricset Ceph mgr_cluster_health metricset Ceph mgr_osd_perf metricset Ceph mgr_osd_pool_stats metricset Ceph mgr_osd_tree metricset Ceph mgr_pool_disk metricset Ceph monitor_health metricset Ceph osd_df metricset Ceph osd_tree metricset Ceph pool_disk metricset Cloudfoundry module Cloudfoundry container metricset Cloudfoundry counter metricset Cloudfoundry value metricset CockroachDB module CockroachDB status metricset Consul module Consul agent metricset Containerd module Containerd blkio metricset Containerd cpu metricset Containerd memory metricset Coredns module Coredns stats metricset Couchbase module Couchbase bucket metricset Couchbase cluster metricset Couchbase node metricset CouchDB module CouchDB server metricset Docker module Docker container metricset Docker cpu metricset Docker diskio metricset Docker event metricset Docker healthcheck metricset Docker image metricset Docker info metricset Docker memory metricset Docker network metricset Docker network_summary metricset Dropwizard module Dropwizard collector metricset Elasticsearch module Elasticsearch ccr metricset Elasticsearch cluster_stats metricset Elasticsearch enrich metricset Elasticsearch index metricset Elasticsearch index_recovery metricset Elasticsearch index_summary metricset Elasticsearch ingest_pipeline metricset Elasticsearch ml_job metricset Elasticsearch node metricset Elasticsearch node_stats metricset Elasticsearch pending_tasks metricset Elasticsearch shard metricset Envoyproxy module Envoyproxy server metricset Etcd module Etcd leader metricset Etcd metrics metricset Etcd self metricset Etcd store metricset Google Cloud Platform module Google Cloud Platform billing metricset Google Cloud Platform carbon metricset Google Cloud Platform compute metricset Google Cloud Platform dataproc metricset Google Cloud Platform firestore metricset Google Cloud Platform gke metricset Google Cloud Platform loadbalancing metricset Google Cloud Platform metrics metricset Google Cloud Platform pubsub metricset Google Cloud Platform storage metricset Golang module Golang expvar metricset Golang heap metricset Graphite module Graphite server metricset HAProxy module HAProxy info metricset HAProxy stat metricset HTTP module HTTP json metricset HTTP server metricset IBM MQ module IBM MQ qmgr metricset IIS module IIS application_pool metricset IIS webserver metricset IIS 
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Suggester examples The suggest feature suggests similar looking terms based on a provided text by using a suggester. The suggest request part is defined alongside the query part in a _search request. If the query part is left out, only suggestions are returned. Note For the most up-to-date details, refer to Search APIs . Several suggestions can be specified per request. Each suggestion is identified with an arbitrary name. In the example below two suggestions are requested. Both my-suggest-1 and my-suggest-2 suggestions use the term suggester, but have a different text . POST _search { \"suggest\": { \"my-suggest-1\" : { \"text\" : \"tring out Elasticsearch\", \"term\" : { \"field\" : \"message\" } }, \"my-suggest-2\" : { \"text\" : \"kmichy\", \"term\" : { \"field\" : \"user.id\" } } } } The following suggest response example includes the suggestion response for my-suggest-1 and my-suggest-2 . Each suggestion part contains entries.
Each entry is effectively a token from the suggest text and contains the suggestion entry text, the original start offset and length in the suggest text, and, if found, an arbitrary number of options. { \"_shards\": ... \"hits\": ... \"took\": 2, \"timed_out\": false, \"suggest\": { \"my-suggest-1\": [ { \"text\": \"tring\", \"offset\": 0, \"length\": 5, \"options\": [ {\"text\": \"trying\", \"score\": 0.8, \"freq\": 1 } ] }, { \"text\": \"out\", \"offset\": 6, \"length\": 3, \"options\": [] }, { \"text\": \"elasticsearch\", \"offset\": 10, \"length\": 13, \"options\": [] } ], \"my-suggest-2\": ... } } Each options array contains an option object that includes the suggested text, its document frequency and its score compared to the suggest entry text. The meaning of the score depends on the suggester used. The term suggester's score is based on the edit distance. Global suggest text To avoid repetition of the suggest text, it is possible to define a global text. In the following example the suggest text is defined globally and applies to the my-suggest-1 and my-suggest-2 suggestions. POST _search { \"suggest\": { \"text\" : \"tring out Elasticsearch\", \"my-suggest-1\" : { \"term\" : { \"field\" : \"message\" } }, \"my-suggest-2\" : { \"term\" : { \"field\" : \"user\" } } } } In the example above, the suggest text can also be specified as a suggestion-specific option. The suggest text specified at the suggestion level overrides the suggest text at the global level. Term suggester The term suggester suggests terms based on edit distance. The provided suggest text is analyzed before terms are suggested. The suggested terms are provided per analyzed suggest text token. The term suggester doesn't take into account the query that is part of the request. Common suggest options include: text The suggest text. The suggest text is a required option that needs to be set globally or per suggestion. field The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion. analyzer The analyzer to analyse the suggest text with. Defaults to the search analyzer of the suggest field. size The maximum corrections to be returned per suggest text token. sort Defines how suggestions should be sorted per suggest text term. Two possible values: score : Sort by score first, then document frequency and then the term itself. frequency : Sort by document frequency first, then similarity score and then the term itself. suggest_mode Controls which suggestions are included, or for which suggest text terms suggestions should be returned. Three possible values can be specified: missing : Only provide suggestions for suggest text terms that are not in the index (default). popular : Only suggest suggestions that occur in more docs than the original suggest text term. always : Suggest any matching suggestions based on terms in the suggest text.
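For illustration only, here is a sketch that is not part of the original reference, showing several of these options combined in one request. The message field from the earlier examples is reused, and the values chosen for size , sort , and suggest_mode are arbitrary: POST _search { \"suggest\": { \"my-suggest-1\" : { \"text\" : \"tring out Elasticsearch\", \"term\" : { \"field\" : \"message\", \"size\" : 3, \"sort\" : \"frequency\", \"suggest_mode\" : \"popular\" } } } }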
Phrase suggester The term suggester provides a very convenient API to access word alternatives on a per token basis within a certain string distance. The API allows accessing each token in the stream individually while suggest-selection is left to the API consumer. Yet, often pre-selected suggestions are required in order to present to the end-user. The phrase suggester adds additional logic on top of the term suggester to select entire corrected phrases instead of individual tokens weighted based on ngram-language models. In practice this suggester will be able to make better decisions about which tokens to pick based on co-occurrence and frequencies. In general the phrase suggester requires a special mapping up front to work. The phrase suggester examples on this page need the following mapping to work. The reverse analyzer is used only in the last example. PUT test { \"settings\": { \"index\": { \"number_of_shards\": 1, \"analysis\": { \"analyzer\": { \"trigram\": { \"type\": \"custom\", \"tokenizer\": \"standard\", \"filter\": [\"lowercase\",\"shingle\"] }, \"reverse\": { \"type\": \"custom\", \"tokenizer\": \"standard\", \"filter\": [\"lowercase\",\"reverse\"] } }, \"filter\": { \"shingle\": { \"type\": \"shingle\", \"min_shingle_size\": 2, \"max_shingle_size\": 3 } } } } }, \"mappings\": { \"properties\": { \"title\": { \"type\": \"text\", \"fields\": { \"trigram\": { \"type\": \"text\", \"analyzer\": \"trigram\" }, \"reverse\": { \"type\": \"text\", \"analyzer\": \"reverse\" } } } } } } POST test/_doc?refresh=true {\"title\": \"noble warriors\"} POST test/_doc?refresh=true {\"title\": \"nobel prize\"} Once you have the analyzers and mappings set up, you can use the phrase suggester in the same spot you'd use the term suggester: POST test/_search { \"suggest\": { \"text\": \"noble prize\", \"simple_phrase\": { \"phrase\": { \"field\": \"title.trigram\", \"size\": 1, \"gram_size\": 3, \"direct_generator\": [ { \"field\": \"title.trigram\", \"suggest_mode\": \"always\" } ], \"highlight\": { \"pre_tag\": \"\", \"post_tag\": \"\" } } } } } The response contains suggestions scored by the most likely spelling correction first. In this case we received the expected correction \"nobel prize\". { \"_shards\": ... \"hits\": ... \"timed_out\": false, \"took\": 3, \"suggest\": { \"simple_phrase\" : [ { \"text\" : \"noble prize\", \"offset\" : 0, \"length\" : 11, \"options\" : [ { \"text\" : \"nobel prize\", \"highlighted\": \"nobel prize\", \"score\" : 0.48614594 }] } ] } } Basic phrase suggest API parameters include: field The name of the field used to do n-gram lookups for the language model; the suggester will use this field to gain statistics to score corrections. This field is mandatory. gram_size Sets the max size of the n-grams (shingles) in the field . If the field doesn't contain n-grams (shingles), this should be omitted or set to 1 . Note that Elasticsearch tries to detect the gram size based on the specified field . If the field uses a shingle filter, the gram_size is set to the max_shingle_size if not explicitly set. real_word_error_likelihood The likelihood of a term being misspelled even if the term exists in the dictionary. The default is 0.95 , meaning 5% of the real words are misspelled. confidence The confidence level defines a factor applied to the input phrase's score which is used as a threshold for other suggest candidates. Only candidates that score higher than the threshold will be included in the result. For instance, a confidence level of 1.0 will only return suggestions that score higher than the input phrase. If set to 0.0 the top N candidates are returned. The default is 1.0 . max_errors The maximum percentage of the terms considered to be misspellings in order to form a correction. It accepts a float value in the range [0..1) as a fraction of the actual query terms or a number >=1 as an absolute number of query terms. The default is set to 1.0 , meaning only corrections with at most one misspelled term are returned. Note that setting this too high can negatively impact performance.
Low values like 1 or 2 are recommended; otherwise the time spent in suggest calls might exceed the time spent in query execution. separator The separator that is used to separate terms in the bigram field. If not set, the whitespace character is used as a separator. size The number of candidates that are generated for each individual query term. Low numbers like 3 or 5 typically produce good results. Raising this can bring up terms with higher edit distances. The default is 5 . analyzer Sets the analyzer to analyze the suggest text with. Defaults to the search analyzer of the suggest field passed via field . shard_size Sets the maximum number of suggested terms to be retrieved from each individual shard. During the reduce phase, only the top N suggestions are returned based on the size option. Defaults to 5 . text Sets the text / query to provide suggestions for. highlight Sets up suggestion highlighting. If not provided then no highlighted field is returned. If provided, it must contain exactly pre_tag and post_tag , which are wrapped around the changed tokens. If multiple tokens in a row are changed the entire phrase of changed tokens is wrapped rather than each token. collate Checks each suggestion against the specified query to prune suggestions for which no matching docs exist in the index. The collate query for a suggestion is run only on the local shard from which the suggestion was generated. The query must be specified and it can be templated. Refer to Search templates . The current suggestion is automatically made available as the {{suggestion}} variable, which should be used in your query. You can still specify your own template params — the suggestion value will be added to the variables you specify. Additionally, you can specify prune to control whether all phrase suggestions are returned; when set to true the suggestions will have an additional option collate_match , which will be true if matching documents for the phrase were found, false otherwise. The default value for prune is false . POST test/_search { \"suggest\": { \"text\" : \"noble prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"direct_generator\" : [ { \"field\" : \"title.trigram\", \"suggest_mode\" : \"always\", \"min_word_length\" : 1 } ], \"collate\": { \"query\": { \"source\" : { \"match\": { \"{{field_name}}\" : \"{{suggestion}}\" } } }, \"params\": {\"field_name\" : \"title\"}, \"prune\": true } } } } } This query will be run once for every suggestion. The {{suggestion}} variable will be replaced by the text of each suggestion. An additional field_name variable has been specified in params and is used by the match query. All suggestions will be returned with an extra collate_match option indicating whether the generated phrase matched any document. Smoothing models The phrase suggester supports multiple smoothing models to balance weight between infrequent grams (grams (shingles) that do not exist in the index) and frequent grams (those that appear at least once in the index). The smoothing model can be selected by setting the smoothing parameter to one of the following options. Each smoothing model supports specific properties that can be configured. stupid_backoff A simple backoff model that backs off to lower order n-gram models if the higher order count is 0 and discounts the lower order n-gram model by a constant factor. The default discount is 0.4 . Stupid Backoff is the default model.
laplace A smoothing model that uses additive smoothing, where a constant (typically 1.0 or smaller) is added to all counts to balance weights. The default alpha is 0.5 . linear_interpolation A smoothing model that takes the weighted mean of the unigrams, bigrams, and trigrams based on user-supplied weights (lambdas). Linear Interpolation doesn't have any default values. All parameters ( trigram_lambda , bigram_lambda , unigram_lambda ) must be supplied. POST test/_search { \"suggest\": { \"text\" : \"obel prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"smoothing\" : { \"laplace\" : { \"alpha\" : 0.7 } } } } } }
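For comparison, here is a sketch that is not part of the original reference, selecting the linear_interpolation model on the same request; the lambda values below are arbitrary and must always be supplied explicitly, since this model has no defaults: POST test/_search { \"suggest\": { \"text\" : \"obel prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"smoothing\" : { \"linear_interpolation\" : { \"trigram_lambda\" : 0.7, \"bigram_lambda\" : 0.2, \"unigram_lambda\" : 0.1 } } } } } }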
Candidate generators The phrase suggester uses candidate generators to produce a list of possible terms per term in the given text. A single candidate generator is similar to a term suggester called for each individual term in the text. The output of the generators is subsequently scored in combination with the candidates from the other terms for suggestion candidates. Currently only one type of candidate generator is supported, the direct_generator . The phrase suggest API accepts a list of generators under the key direct_generator ; each of the generators in the list is called per term in the original text. Direct generators The parameters that direct generators support include: field The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion. size The maximum corrections to be returned per suggest text token. suggest_mode The suggest mode controls which suggestions are included among the suggestions generated on each shard. All values other than always can be thought of as an optimization to generate fewer suggestions to test on each shard and are not rechecked when combining the suggestions generated on each shard. Thus missing will generate suggestions for terms on shards that do not contain them even if other shards do contain them. Those should be filtered out using confidence . Three possible values can be specified: missing : Only generate suggestions for terms that are not in the shard. This is the default. popular : Only suggest terms that occur in more docs on the shard than the original term. always : Suggest any matching suggestions based on terms in the suggest text. max_edits The maximum edit distance candidate suggestions can have in order to be considered as a suggestion. Can only be a value between 1 and 2. Any other value results in a bad request error being thrown. Defaults to 2. prefix_length The number of minimal prefix characters that must match in order to be a candidate suggestion. Defaults to 1. Increasing this number improves spellcheck performance. Usually misspellings don't occur in the beginning of terms. min_word_length The minimum length a suggest text term must have in order to be included. Defaults to 4. max_inspections A factor that is used to multiply with the shard_size in order to inspect more candidate spelling corrections on the shard level. Can improve accuracy at the cost of performance. Defaults to 5. min_doc_freq The minimal threshold in number of documents a suggestion should appear in. This can be specified as an absolute number or as a relative percentage of the number of documents. This can improve quality by only suggesting high frequency terms. Defaults to 0f and is not enabled. If a value higher than 1 is specified, then the number cannot be fractional. The shard level document frequencies are used for this option. max_term_freq The maximum threshold in number of documents in which a suggest text token can exist in order to be included. Can be a relative percentage number (e.g., 0.4) or an absolute number to represent document frequencies. If a value higher than 1 is specified, then the number cannot be fractional. Defaults to 0.01f. This can be used to exclude high frequency terms — which are usually spelled correctly — from being spellchecked. This also improves the spellcheck performance. The shard level document frequencies are used for this option. pre_filter A filter (analyzer) that is applied to each of the tokens passed to this candidate generator. This filter is applied to the original token before candidates are generated. post_filter A filter (analyzer) that is applied to each of the generated tokens before they are passed to the actual phrase scorer. The following example shows a phrase suggest call with two generators: the first one uses a field containing ordinary indexed terms, and the second one uses a field whose terms are indexed with a reverse filter (tokens are indexed in reverse order). This is used to overcome the limitation that direct generators require a constant prefix to provide high-performance suggestions. The pre_filter and post_filter options accept ordinary analyzer names. POST test/_search { \"suggest\": { \"text\" : \"obel prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"direct_generator\" : [ { \"field\" : \"title.trigram\", \"suggest_mode\" : \"always\" }, { \"field\" : \"title.reverse\", \"suggest_mode\" : \"always\", \"pre_filter\" : \"reverse\", \"post_filter\" : \"reverse\" } ] } } } } pre_filter and post_filter can also be used to inject synonyms after candidates are generated. For instance, for the query captain usq we might generate a candidate usa for the term usq , which is a synonym for america . This allows us to present captain america to the user if this phrase scores high enough. Completion suggester The completion suggester provides auto-complete/search-as-you-type functionality. This is a navigational feature to guide users to relevant results as they are typing, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters. Ideally, auto-complete functionality should be as fast as a user types to provide instant feedback relevant to what a user has already typed in. Hence, the completion suggester is optimized for speed. The suggester uses data structures that enable fast lookups, but are costly to build and are stored in-memory. Mapping To use the completion suggester , map the field from which you want to generate suggestions as type completion . This indexes the field values for fast completions. PUT music { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\" } } } } The parameters that are accepted by completion fields include: analyzer The index analyzer to use, defaults to simple . search_analyzer The search analyzer to use, defaults to the value of analyzer . preserve_separators Preserves the separators, defaults to true . If disabled, you could find a field starting with Foo Fighters , if you suggest for foof . preserve_position_increments Enables position increments, defaults to true . If disabled and using a stopwords analyzer, you could get a field starting with The Beatles , if you suggest for b .
Note : You could also achieve this by indexing two inputs, Beatles and The Beatles , no need to change a simple analyzer, if you are able to enrich your data. max_input_length Limits the length of a single input, defaults to 50 UTF-16 code points. This limit is only used at index time to reduce the total number of characters per input string in order to prevent massive inputs from bloating the underlying datastructure. Most use cases won't be influenced by the default value since prefix completions seldom grow beyond prefixes longer than a handful of characters. Indexing You index suggestions like any other field. A suggestion is made of an input and an optional weight attribute. An input is the expected text to be matched by a suggestion query and the weight determines how the suggestions will be scored. Indexing a suggestion is as follows: PUT music/_doc/1?refresh { \"suggest\" : { \"input\": [ \"Nevermind\", \"Nirvana\" ], \"weight\" : 34 } } The supported parameters include: input The input to store, this can be an array of strings or just a string. This field is mandatory. Note This value cannot contain the following UTF-16 control characters: \\u0000 (null) \\u001f (information separator one) \\u001e (information separator two) weight A positive integer or a string containing a positive integer, which defines a weight and allows you to rank your suggestions. This field is optional. You can index multiple suggestions for a document as follows: PUT music/_doc/1?refresh { \"suggest\": [ { \"input\": \"Nevermind\", \"weight\": 10 }, { \"input\": \"Nirvana\", \"weight\": 3 } ] } You can use the following shorthand form. Note that you can not specify a weight with suggestion(s) in the shorthand form. PUT music/_doc/1?refresh { \"suggest\" : [ \"Nevermind\", \"Nirvana\" ] } Querying Suggesting works as usual, except that you have to specify the suggest type as completion . Suggestions are near real-time, which means new suggestions can be made visible by refresh and documents once deleted are never shown. This request: POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nir\", \"completion\": { \"field\": \"suggest\" } } } } Prefix used to search for suggestions Type of suggestions Name of the field to search for suggestions in It returns this response: { \"_shards\" : { \"total\" : 1, \"successful\" : 1, \"skipped\" : 0, \"failed\" : 0 }, \"hits\": ... \"took\": 2, \"timed_out\": false, \"suggest\": { \"song-suggest\" : [ { \"text\" : \"nir\", \"offset\" : 0, \"length\" : 3, \"options\" : [ { \"text\" : \"Nirvana\", \"_index\": \"music\", \"_id\": \"1\", \"_score\": 1.0, \"_source\": { \"suggest\": [\"Nevermind\", \"Nirvana\"] } } ] } ] } } Important _source metadata field must be enabled, which is the default behavior, to enable returning _source with suggestions. The configured weight for a suggestion is returned as _score . The text field uses the input of your indexed suggestion. Suggestions return the full document _source by default. The size of the _source can impact performance due to disk fetch and network transport overhead. To save some network overhead, filter out unnecessary fields from the _source using source filtering to minimize _source size. 
Note that the _suggest endpoint doesn't support source filtering but using suggest on the _search endpoint does: POST music/_search { \"_source\": \"suggest\", \"suggest\": { \"song-suggest\": { \"prefix\": \"nir\", \"completion\": { \"field\": \"suggest\", \"size\": 5 } } } } Filter the source to return only the suggest field Name of the field to search for suggestions in Number of suggestions to return Which should look like: { \"took\": 6, \"timed_out\": false, \"_shards\": { \"total\": 1, \"successful\": 1, \"skipped\": 0, \"failed\": 0 }, \"hits\": { \"total\": { \"value\": 0, \"relation\": \"eq\" }, \"max_score\": null, \"hits\": [] }, \"suggest\": { \"song-suggest\": [ { \"text\": \"nir\", \"offset\": 0, \"length\": 3, \"options\": [ { \"text\": \"Nirvana\", \"_index\": \"music\", \"_id\": \"1\", \"_score\": 1.0, \"_source\": { \"suggest\": [ \"Nevermind\", \"Nirvana\" ] } } ] } ] } } The supported parameters for a basic completion suggester query include: field The name of the field on which to run the query (required). size The number of suggestions to return (defaults to 5 ). skip_duplicates Whether duplicate suggestions should be filtered out (defaults to false ). Note The completion suggester considers all documents in the index. See Context suggester for an explanation of how to query a subset of documents instead. Note In case of completion queries spanning more than one shard, the suggest is executed in two phases, where the last phase fetches the relevant documents from shards, implying executing completion requests against a single shard is more performant due to the document fetch overhead when the suggest spans multiple shards. To get best performance for completions, it is recommended to index completions into a single shard index. In case of high heap usage due to shard size, it is still recommended to break index into multiple shards instead of optimizing for completion performance. Skip duplicate suggestions Queries can return duplicate suggestions coming from different documents. It is possible to modify this behavior by setting skip_duplicates to true. When set, this option filters out documents with duplicate suggestions from the result. POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nor\", \"completion\": { \"field\": \"suggest\", \"skip_duplicates\": true } } } } Warning When set to true, this option can slow down search because more suggestions need to be visited to find the top N. Fuzzy queries The completion suggester also supports fuzzy queries — this means you can have a typo in your search and still get results back. POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nor\", \"completion\": { \"field\": \"suggest\", \"fuzzy\": { \"fuzziness\": 2 } } } } } Suggestions that share the longest prefix to the query prefix will be scored higher. The fuzzy query can take specific fuzzy parameters. For example: fuzziness The fuzziness factor, defaults to AUTO . See Fuzziness for allowed settings. transpositions If set to true , transpositions are counted as one change instead of two, defaults to true min_length Minimum length of the input before fuzzy suggestions are returned, defaults 3 prefix_length Minimum length of the input, which is not checked for fuzzy alternatives, defaults to 1 unicode_aware If true , all measurements (like fuzzy edit distance, transpositions, and lengths) are measured in Unicode code points instead of in bytes. This is slightly slower than raw bytes, so it is set to false by default. 
Note If you want to stick with the default values, but still use fuzzy, you can either use fuzzy: {} or fuzzy: true . Regex queries The completion suggester also supports regex queries meaning you can express a prefix as a regular expression POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"regex\": \"n[ever|i]r\", \"completion\": { \"field\": \"suggest\" } } } } The regex query can take specific regex parameters. For example: flags Possible flags are ALL (default), ANYSTRING , COMPLEMENT , EMPTY , INTERSECTION , INTERVAL , or NONE . See regexp-syntax for their meaning max_determinized_states Regular expressions are dangerous because it's easy to accidentally create an innocuous looking one that requires an exponential number of internal determinized automaton states (and corresponding RAM and CPU) for Lucene to execute. Lucene prevents these using the max_determinized_states setting (defaults to 10000). You can raise this limit to allow more complex regular expressions to execute. Context suggester The completion suggester considers all documents in the index, but it is often desirable to serve suggestions filtered and/or boosted by some criteria. For example, you want to suggest song titles filtered by certain artists or you want to boost song titles based on their genre. To achieve suggestion filtering and/or boosting, you can add context mappings while configuring a completion field. You can define multiple context mappings for a completion field. Every context mapping has a unique name and a type. There are two types: category and geo . Context mappings are configured under the contexts parameter in the field mapping. Note It is mandatory to provide a context when indexing and querying a context enabled completion field. The maximum allowed number of completion field context mappings is 10. The following example defines types, each with two context mappings for a completion field: PUT place { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\", \"contexts\": [ { \"name\": \"place_type\", \"type\": \"category\" }, { \"name\": \"location\", \"type\": \"geo\", \"precision\": 4 } ] } } } } PUT place_path_category { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\", \"contexts\": [ { \"name\": \"place_type\", \"type\": \"category\", \"path\": \"cat\" }, { \"name\": \"location\", \"type\": \"geo\", \"precision\": 4, \"path\": \"loc\" } ] }, \"loc\": { \"type\": \"geo_point\" } } } } Defines a category context named place_type where the categories must be sent with the suggestions. Defines a geo context named location where the categories must be sent with the suggestions. Defines a category context named place_type where the categories are read from the cat field. Defines a geo context named location where the categories are read from the loc field. Note Adding context mappings increases the index size for completion field. The completion index is entirely heap resident, you can monitor the completion field index size using index statistics . Category context The category context allows you to associate one or more categories with suggestions at index time. At query time, suggestions can be filtered and boosted by their associated categories. The mappings are set up like the place_type fields above. 
If path is defined then the categories are read from that path in the document, otherwise they must be sent in the suggest field like this: PUT place/_doc/1 { \"suggest\": { \"input\": [ \"timmy's\", \"starbucks\", \"dunkin donuts\" ], \"contexts\": { \"place_type\": [ \"cafe\", \"food\" ] } } } These suggestions will be associated with the cafe and food categories. If the mapping had a path then the following index request would be enough to add the categories: PUT place_path_category/_doc/1 { \"suggest\": [\"timmy's\", \"starbucks\", \"dunkin donuts\"], \"cat\": [\"cafe\", \"food\"] } These suggestions will be associated with the cafe and food categories. Note If the context mapping references another field and the categories are explicitly indexed, the suggestions are indexed with both sets of categories. Category query Suggestions can be filtered by one or more categories. The following filters suggestions by multiple categories: POST place/_search?pretty { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"place_type\": [ \"cafe\", \"restaurants\" ] } } } } } Note If multiple categories or category contexts are set on the query they are merged as a disjunction. This means that suggestions match if they contain at least one of the provided context values. Suggestions with certain categories can be boosted higher than others. The following example filters suggestions by categories and additionally boosts suggestions associated with some categories: POST place/_search?pretty { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"place_type\": [ { \"context\": \"cafe\" }, { \"context\": \"restaurants\", \"boost\": 2 } ] } } } } } The context query filters suggestions associated with the categories cafe and restaurants , and boosts the suggestions associated with restaurants by a factor of 2. In addition to accepting category values, a context query can be composed of multiple category context clauses. The parameters that are supported for a category context clause include: context The value of the category to filter/boost on. This is mandatory. boost The factor by which the score of the suggestion should be boosted; the score is computed by multiplying the boost with the suggestion weight. Defaults to 1 . prefix Whether the category value should be treated as a prefix or not. For example, if set to true , you can filter categories of type1 , type2 and so on, by specifying a category prefix of type . Defaults to false . Note If a suggestion entry matches multiple contexts, the final score is computed as the maximum score produced by any matching contexts. Geo location context A geo context allows you to associate one or more geo points or geohashes with suggestions at index time. At query time, suggestions can be filtered and boosted if they are within a certain distance of a specified geo location. Internally, geo points are encoded as geohashes with the specified precision. Geo mapping In addition to the path setting, geo context mapping accepts settings such as: precision This defines the precision of the geohash to be indexed and can be specified as a distance value ( 5m , 10km etc.), or as a raw geohash precision ( 1 .. 12 ). Defaults to a raw geohash precision value of 6 . Note The index time precision setting sets the maximum geohash precision that can be used at query time.
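As a minimal sketch that is not part of the original page (the index name place_distance is invented for illustration), the same geo context could equally be mapped with a distance-based precision instead of a raw geohash level: PUT place_distance { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\", \"contexts\": [ { \"name\": \"location\", \"type\": \"geo\", \"precision\": \"5km\" } ] } } } }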
Indexing geo contexts geo contexts can be explicitly set with suggestions or be indexed from a geo point field in the document via the path parameter, similar to category contexts. Associating multiple geo location context with a suggestion, will index the suggestion for every geo location. The following indexes a suggestion with two geo location contexts: PUT place/_doc/1 { \"suggest\": { \"input\": \"timmy's\", \"contexts\": { \"location\": [ { \"lat\": 43.6624803, \"lon\": -79.3863353 }, { \"lat\": 43.6624718, \"lon\": -79.3873227 } ] } } } Geo location query Suggestions can be filtered and boosted with respect to how close they are to one or more geo points. The following filters suggestions that fall within the area represented by the encoded geohash of a geo point: POST place/_search { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"location\": { \"lat\": 43.662, \"lon\": -79.380 } } } } } } Note When a location with a lower precision at query time is specified, all suggestions that fall within the area will be considered. If multiple categories or category contexts are set on the query they are merged as a disjunction. This means that suggestions match if they contain at least one of the provided context values. Suggestions that are within an area represented by a geohash can also be boosted higher than others, as shown by the following: POST place/_search?pretty { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"location\": [ { \"lat\": 43.6624803, \"lon\": -79.3863353, \"precision\": 2 }, { \"context\": { \"lat\": 43.6624803, \"lon\": -79.3863353 }, \"boost\": 2 } ] } } } } } The context query filters for suggestions that fall under the geo location represented by a geohash of (43.662, -79.380) with a precision of 2 and boosts suggestions that fall under the geohash representation of (43.6624803, -79.3863353) with a default precision of 6 by a factor of 2 Note If a suggestion entry matches multiple contexts the final score is computed as the maximum score produced by any matching contexts. In addition to accepting context values, a context query can be composed of multiple context clauses. The parameters that are supported for a geo context clause include: context A geo point object or a geo hash string to filter or boost the suggestion by. This is mandatory. boost The factor by which the score of the suggestion should be boosted, the score is computed by multiplying the boost with the suggestion weight, defaults to 1 precision The precision of the geohash to encode the query geo point. This can be specified as a distance value ( 5m , 10km etc.), or as a raw geohash precision ( 1 .. 12 ). Defaults to index time precision level. neighbours Accepts an array of precision values at which neighbouring geohashes should be taken into account. precision value can be a distance value ( 5m , 10km etc.) or a raw geohash precision ( 1 .. 12 ). Defaults to generating neighbours for index time precision level. Note The precision field does not result in a distance match. Specifying a distance value like 10km only results in a geohash precision value that represents tiles of that size. The precision will be used to encode the search geo point into a geohash tile for completion matching. A consequence of this is that points outside that tile, even if very close to the search point, will not be matched. 
Reducing the precision, or increasing the distance, can reduce the risk of this happening, but not entirely remove it. Returning the type of the suggester Sometimes you need to know the exact type of a suggester in order to parse its results. The typed_keys parameter can be used to change the suggester's name in the response so that it will be prefixed by its type. Considering the following example with two suggesters term and phrase : POST _search?typed_keys { \"suggest\": { \"text\" : \"some test mssage\", \"my-first-suggester\" : { \"term\" : { \"field\" : \"message\" } }, \"my-second-suggester\" : { \"phrase\" : { \"field\" : \"message\" } } } } In the response, the suggester names will be changed to respectively term#my-first-suggester and phrase#my-second-suggester , reflecting the types of each suggestion: { \"suggest\": { \"term#my-first-suggester\": [ { \"text\": \"some\", \"offset\": 0, \"length\": 4, \"options\": [] }, { \"text\": \"test\", \"offset\": 5, \"length\": 4, \"options\": [] }, { \"text\": \"mssage\", \"offset\": 10, \"length\": 6, \"options\": [ { \"text\": \"message\", \"score\": 0.8333333, \"freq\": 4 } ] } ], \"phrase#my-second-suggester\": [ { \"text\": \"some test mssage\", \"offset\": 0, \"length\": 16, \"options\": [ { \"text\": \"some test message\", \"score\": 0.030227963 } ] } ] }, ... } The name my-first-suggester now contains the term prefix. The name my-second-suggester now contains the phrase prefix.", "title": "Suggester examples | Elastic Documentation", "url": "https://www.elastic.co/docs/reference/elasticsearch/rest-apis/search-suggesters", "meta_description": "The suggest feature suggests similar looking terms based on a provided text by using a suggester. The suggest request part is defined alongside the query..."
+ }, + { + "text": "
Cloud Hosted Kubernetes Hosts and VMs Docker Reference Architecture Kubernetes environments Hosts / VMs environments Use cases Kubernetes observability Prerequisites and compatibility Components description Deployment Instrumenting Applications Upgrade Customization LLM observability Compatibility and support Features Collector distributions SDK Distributions Limitations Nomenclature EDOT Collector Download Configuration Default config (Standalone) Default config (Kubernetes) Configure Logs Collection Configure Metrics Collection Customization Components Custom Collector Troubleshooting EDOT SDKs EDOT .NET Setup ASP.NET Console applications .NET worker services Zero-code instrumentation Opinionated defaults Configuration Supported technologies Troubleshooting Migration EDOT Java Setup Kubernetes Setup Runtime attach Setup Configuration Features Supported Technologies Troubleshooting Migration Performance overhead EDOT Node.js Setup Kubernetes Configuration Supported Technologies Metrics Troubleshooting Migration EDOT PHP Setup Limitations Configuration Supported Technologies Troubleshooting Migration Performance overhead EDOT Python Setup Kubernetes Manual Instrumentation Configuration Supported Technologies Troubleshooting Migration Performance Overhead Ingestion tools Fleet and Elastic Agent Restrictions for Elastic Cloud Serverless Migrate from Beats to Elastic Agent Migrate from Auditbeat to Elastic Agent Deployment models What is Fleet Server? Deploy on Elastic Cloud Deploy on-premises and self-managed Deploy Fleet Server on-premises and Elasticsearch on Cloud Deploy Fleet Server on Kubernetes Fleet Server scalability Fleet Server Secrets Secret files guide Monitor a self-managed Fleet Server Install Elastic Agents Install Fleet-managed Elastic Agents Install standalone Elastic Agents Upgrade standalone Elastic Agents Install Elastic Agents in a containerized environment Run Elastic Agent in a container Run Elastic Agent on Kubernetes managed by Fleet Install Elastic Agent on Kubernetes using Helm Example: Install standalone Elastic Agent on Kubernetes using Helm Example: Install Fleet-managed Elastic Agent on Kubernetes using Helm Advanced Elastic Agent configuration managed by Fleet Configuring Kubernetes metadata enrichment on Elastic Agent Run Elastic Agent on GKE managed by Fleet Configure Elastic Agent Add-On on Amazon EKS Run Elastic Agent on Azure AKS managed by Fleet Run Elastic Agent Standalone on Kubernetes Scaling Elastic Agent on Kubernetes Using a custom ingest pipeline with the Kubernetes Integration Environment variables Run Elastic Agent as an EDOT Collector Transform an installed Elastic Agent to run as an EDOT Collector Run Elastic Agent without administrative privileges Install Elastic Agent from an MSI package Installation layout Air-gapped environments Using a proxy server with Elastic Agent and Fleet When to configure proxy settings Proxy Server connectivity using default host variables Fleet managed Elastic Agent connectivity using a proxy server Standalone Elastic Agent connectivity using a proxy server Set the proxy URL of the Elastic Package Registry Uninstall Elastic Agents from edge hosts Start and stop Elastic Agents on edge hosts Elastic Agent configuration encryption Secure connections Configure SSL/TLS for self-managed Fleet Servers Rotate SSL/TLS CA certificates Elastic Agent deployment models with mutual TLS One-way and mutual TLS certifications flow Configure SSL/TLS for the Logstash output Manage Elastic Agents in Fleet Fleet settings Elasticsearch 
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
applications NuGet packages Entity Framework Core Entity Framework 6 Elasticsearch gRPC SqlClient StackExchange.Redis Azure Cosmos DB Azure Service Bus Azure Storage MongoDB Supported technologies Configuration Configuration on ASP.NET Core Configuration for Windows Services Configuration on ASP.NET Core configuration options Reporter configuration options HTTP configuration options Messaging configuration options Stacktrace configuration options Supportability configuration options All options summary Public API OpenTelemetry bridge Metrics Logs Serilog NLog Manual log correlation Performance tuning Upgrading APM Go agent Set up the APM Go Agent Built-in instrumentation modules Custom instrumentation Context propagation Supported technologies Configuration API documentation Metrics Logs Log correlation OpenTelemetry API OpenTracing API Contributing Upgrading APM iOS agent Supported technologies Set up the APM iOS Agent Configuration Instrumentation APM Java agent Set up the APM Java Agent Manual setup with -javaagent flag Automatic setup with apm-agent-attach-cli.jar Programmatic API setup to self-attach SSL/TLS communication with APM Server Monitoring AWS Lambda Java Functions Supported technologies Configuration Circuit-Breaker Core Datastore HTTP Huge Traces JAX-RS JMX Logging Messaging Metrics Profiling Reporter Serverless Stacktrace Property file reference Tracing APIs Public API OpenTelemetry bridge OpenTracing bridge Plugin API Metrics Logs How to find slow methods Sampling-based profiler API/Code Annotations Configuration-based Overhead and performance tuning Frequently asked questions Community plugins Upgrading APM Node.js agent Set up the Agent Monitoring AWS Lambda Node.js Functions Monitoring Node.js Azure Functions Get started with Express Get started with Fastify Get started with hapi Get started with Koa Get started with Next.js Get started with Restify Get started with TypeScript Get started with a custom Node.js stack Starting the agent Supported technologies Configuration Configuring the agent Configuration options Custom transactions Custom spans API Reference Agent API Transaction API Span API Metrics Logs OpenTelemetry bridge OpenTracing bridge Source map support ECMAScript module support Distributed tracing Message queues Performance Tuning Upgrading Upgrade to v4.x Upgrade to v3.x Upgrade to v2.x Upgrade to v1.x APM PHP agent Set up the APM PHP Agent Supported technologies Configuration Configuration reference Public API APM Python agent Set up the APM Python Agent Django support Flask support Aiohttp Server support Tornado Support Starlette/FastAPI Support Sanic Support Monitoring AWS Lambda Python Functions Monitoring Azure Functions Wrapper Support ASGI Middleware Supported technologies Configuration Advanced topics Instrumenting custom code Sanitizing data How the Agent works Run Tests Locally API reference Metrics OpenTelemetry API Bridge Logs Performance tuning Upgrading Upgrading to version 6 of the agent Upgrading to version 5 of the agent Upgrading to version 4 of the agent APM Ruby agent Set up the APM Ruby agent Getting started with Rails Getting started with Rack Supported technologies Configuration Advanced topics Adding additional context Custom instrumentation API reference Metrics Logs OpenTracing API GraphQL Performance tuning Upgrading APM RUM JavaScript agent Set up the APM Real User Monitoring JavaScript Agent Install the Agent Configure CORS Supported technologies Configuration API reference Agent API Transaction API Span API Source maps 
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Secrets keystore Command reference Repositories for APT and YUM Run Filebeat on Docker Run Filebeat on Kubernetes Run Filebeat on Cloud Foundry Filebeat and systemd Start Filebeat Stop Filebeat Upgrade How Filebeat works Configure Inputs Multiline messages AWS CloudWatch AWS S3 Azure Event Hub Azure Blob Storage Benchmark CEL Cloud Foundry CometD Container Entity Analytics ETW filestream GCP Pub/Sub Google Cloud Storage HTTP Endpoint HTTP JSON journald Kafka Log MQTT NetFlow Office 365 Management Activity API Redis Salesforce Stdin Streaming Syslog TCP UDP Unified Logs Unix winlog Modules Override input settings General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append cache community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags filebeat.reference.yml How to guides Override configuration settings Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Enrich events with geoIP information Deduplicate data Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Migrate log or container input configurations to filestream How to choose file identity for filestream Migrating from a Deprecated Filebeat Module Modules Modules ActiveMQ module Apache module Auditd module AWS module AWS Fargate module Azure module CEF module Check Point module Cisco module CoreDNS module CrowdStrike module Cyberark PAS module Elasticsearch module Envoyproxy Module Fortinet module Google Cloud module Google Workspace module HAproxy module IBM MQ module Icinga module IIS module Iptables module Juniper module Kafka module Kibana module Logstash module Microsoft module MISP module MongoDB module MSSQL module MySQL module MySQL Enterprise module NATS module NetFlow module Nginx module Office 365 module Okta module Oracle module Osquery module Palo Alto Networks module pensando module PostgreSQL module RabbitMQ module Redis module Salesforce module Set up the OAuth App in the Salesforce Santa module Snyk module Sophos module Suricata module System module Threat Intel module Traefik module Zeek (Bro) Module ZooKeeper module Zoom module Exported fields ActiveMQ fields Apache fields Auditd fields AWS fields AWS CloudWatch fields AWS Fargate fields Azure fields Beat fields Decode CEF processor fields fields CEF fields Checkpoint fields Cisco fields Cloud provider metadata fields Coredns fields Crowdstrike fields CyberArk PAS fields Docker fields ECS fields Elasticsearch fields Envoyproxy fields Fortinet fields 
Google Cloud Platform (GCP) fields google_workspace fields HAProxy fields Host fields ibmmq fields Icinga fields IIS fields iptables fields Jolokia Discovery autodiscover provider fields Juniper JUNOS fields Kafka fields kibana fields Kubernetes fields Log file content fields logstash fields Lumberjack fields Microsoft fields MISP fields mongodb fields mssql fields MySQL fields MySQL Enterprise fields NATS fields NetFlow fields Nginx fields Office 365 fields Okta fields Oracle fields Osquery fields panw fields Pensando fields PostgreSQL fields Process fields RabbitMQ fields Redis fields s3 fields Salesforce fields Google Santa fields Snyk fields sophos fields Suricata fields System fields threatintel fields Traefik fields Windows ETW fields Zeek fields ZooKeeper fields Zoom fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Error extracting container id while using Kubernetes metadata Can't read log files from network volumes Filebeat isn't collecting lines from a file Too many open file handlers Registry file is too large Inode reuse causes Filebeat to skip lines Log rotation results in lost or duplicate events Open file handlers cause issues with Windows file rotation Filebeat is using too much CPU Dashboard in Kibana is breaking up data fields incorrectly Fields are not indexed or usable in Kibana visualizations Filebeat isn't shipping the last line of a file Filebeat keeps open file handlers of deleted files for a long time Filebeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Heartbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Heartbeat on Docker Running Heartbeat on Kubernetes Heartbeat and systemd Stop Heartbeat Configure Monitors Common monitor options ICMP options TCP options HTTP options Task scheduler General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based 
autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags heartbeat.reference.yml How to guides Add observer and geo metadata Load the Elasticsearch index template Change the index name Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields Beat fields Synthetics browser metrics fields Cloud provider metadata fields Common heartbeat monitor fields Docker fields ECS fields Host fields HTTP monitor fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields Host lookup fields APM Service fields SOCKS5 proxy fields Monitor state fields Monitor summary fields Synthetics types fields TCP layer fields TLS encryption layer fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Heartbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected High RSS memory usage due to MADV settings Contribute Metricbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Metricbeat on Docker Run Metricbeat on Kubernetes Run Metricbeat on Cloud Foundry Metricbeat and systemd Start Metricbeat Stop Metricbeat Upgrade Metricbeat How Metricbeat works Event structure Error event structure Key metricbeat features Configure Modules General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags metricbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules ActiveMQ module ActiveMQ broker metricset ActiveMQ queue 
metricset ActiveMQ topic metricset Aerospike module Aerospike namespace metricset Airflow module Airflow statsd metricset Apache module Apache status metricset AWS module AWS awshealth metricset AWS billing metricset AWS cloudwatch metricset AWS dynamodb metricset AWS ebs metricset AWS ec2 metricset AWS elb metricset AWS kinesis metricset AWS lambda metricset AWS natgateway metricset AWS rds metricset AWS s3_daily_storage metricset AWS s3_request metricset AWS sns metricset AWS sqs metricset AWS transitgateway metricset AWS usage metricset AWS vpn metricset AWS Fargate module AWS Fargate task_stats metricset Azure module Azure app_insights metricset Azure app_state metricset Azure billing metricset Azure compute_vm metricset Azure compute_vm_scaleset metricset Azure container_instance metricset Azure container_registry metricset Azure container_service metricset Azure database_account metricset Azure monitor metricset Azure storage metricset Beat module Beat state metricset Beat stats metricset Benchmark module Benchmark info metricset Ceph module Ceph cluster_disk metricset Ceph cluster_health metricset Ceph cluster_status metricset Ceph mgr_cluster_disk metricset Ceph mgr_cluster_health metricset Ceph mgr_osd_perf metricset Ceph mgr_osd_pool_stats metricset Ceph mgr_osd_tree metricset Ceph mgr_pool_disk metricset Ceph monitor_health metricset Ceph osd_df metricset Ceph osd_tree metricset Ceph pool_disk metricset Cloudfoundry module Cloudfoundry container metricset Cloudfoundry counter metricset Cloudfoundry value metricset CockroachDB module CockroachDB status metricset Consul module Consul agent metricset Containerd module Containerd blkio metricset Containerd cpu metricset Containerd memory metricset Coredns module Coredns stats metricset Couchbase module Couchbase bucket metricset Couchbase cluster metricset Couchbase node metricset CouchDB module CouchDB server metricset Docker module Docker container metricset Docker cpu metricset Docker diskio metricset Docker event metricset Docker healthcheck metricset Docker image metricset Docker info metricset Docker memory metricset Docker network metricset Docker network_summary metricset Dropwizard module Dropwizard collector metricset Elasticsearch module Elasticsearch ccr metricset Elasticsearch cluster_stats metricset Elasticsearch enrich metricset Elasticsearch index metricset Elasticsearch index_recovery metricset Elasticsearch index_summary metricset Elasticsearch ingest_pipeline metricset Elasticsearch ml_job metricset Elasticsearch node metricset Elasticsearch node_stats metricset Elasticsearch pending_tasks metricset Elasticsearch shard metricset Envoyproxy module Envoyproxy server metricset Etcd module Etcd leader metricset Etcd metrics metricset Etcd self metricset Etcd store metricset Google Cloud Platform module Google Cloud Platform billing metricset Google Cloud Platform carbon metricset Google Cloud Platform compute metricset Google Cloud Platform dataproc metricset Google Cloud Platform firestore metricset Google Cloud Platform gke metricset Google Cloud Platform loadbalancing metricset Google Cloud Platform metrics metricset Google Cloud Platform pubsub metricset Google Cloud Platform storage metricset Golang module Golang expvar metricset Golang heap metricset Graphite module Graphite server metricset HAProxy module HAProxy info metricset HAProxy stat metricset HTTP module HTTP json metricset HTTP server metricset IBM MQ module IBM MQ qmgr metricset IIS module IIS application_pool metricset IIS webserver metricset IIS 
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
Elasticsearch API conventions The Elasticsearch REST APIs are exposed over HTTP. Except where noted, the following conventions apply across all APIs. Content-type requirements The type of the content sent in a request body must be specified using the Content-Type header. The value of this header must map to one of the formats that the API supports. Most APIs support JSON, YAML, CBOR, and SMILE. The bulk and multi-search APIs support NDJSON, JSON, and SMILE; other types will result in an error response. When using the source query string parameter, the content type must be specified using the source_content_type query string parameter. Elasticsearch only supports UTF-8-encoded JSON. Elasticsearch ignores any other encoding headers sent with a request. Responses are also UTF-8 encoded. X-Opaque-Id HTTP header You can pass an X-Opaque-Id HTTP header to track the origin of a request in Elasticsearch logs and tasks.
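As a hedged illustration of the Content-Type and X-Opaque-Id conventions above (not part of the original documentation), here is a minimal Python sketch using the requests library; the localhost URL, the unsecured cluster, and the client ID value are assumptions:
import requests
# Minimal sketch: send a search with an explicit Content-Type header and an
# X-Opaque-Id header identifying the calling client (assumed local, unsecured cluster).
resp = requests.post(
    'http://localhost:9200/_search',
    json={'query': {'match_all': {}}},
    headers={
        'Content-Type': 'application/json',
        'X-Opaque-Id': 'my-reporting-client',  # one stable ID per client, not one per request
    },
)
print(resp.status_code)
Keeping the X-Opaque-Id value stable per client, as recommended below, lets Elasticsearch deduplicate the deprecation warnings it records for that client.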
If provided, Elasticsearch surfaces the X-Opaque-Id value in the: Response of any request that includes the header Task management API response Slow logs Deprecation logs For the deprecation logs, Elasticsearch also uses the X-Opaque-Id value to throttle and deduplicate deprecation warnings. See Deprecation logs throttling . The X-Opaque-Id header accepts any arbitrary value. However, we recommend you limit these values to a finite set, such as an ID per client. Don’t generate a unique X-Opaque-Id header for every request. Too many unique X-Opaque-Id values can prevent Elasticsearch from deduplicating warnings in the deprecation logs. traceparent HTTP header Elasticsearch also supports a traceparent HTTP header using the official W3C trace context spec . You can use the traceparent header to trace requests across Elastic products and other services. Because it’s only used for traces, you can safely generate a unique traceparent header for each request. If provided, Elasticsearch surfaces the header’s trace-id value as trace.id in the: JSON Elasticsearch server logs Slow logs Deprecation logs For example, the following traceparent value would produce the following trace.id value in the above logs. `traceparent`: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01 `trace.id`: 0af7651916cd43dd8448eb211c80319c GET and POST requests A number of Elasticsearch GET APIs— most notably the search API— support a request body. While the GET action makes sense in the context of retrieving information, GET requests with a body are not supported by all HTTP libraries. All Elasticsearch GET APIs that require a body can also be submitted as POST requests. Alternatively, you can pass the request body as the source query string parameter when using GET. Cron expressions A cron expression is a string of the following form: <seconds> <minutes> <hours> <day_of_month> <month> <day_of_week> [year] Elasticsearch uses the cron parser from the Quartz Job Scheduler . For more information about writing Quartz cron expressions, see the Quartz CronTrigger Tutorial . All schedule times are in coordinated universal time (UTC); other timezones are not supported. Tip You can use the elasticsearch-croneval command line tool to validate your cron expressions. Cron expression elements All elements are required except for year . See Cron special characters for information about the allowed special characters. seconds (Required) Valid values: 0 - 59 and the special characters , - * / minutes (Required) Valid values: 0 - 59 and the special characters , - * / hours (Required) Valid values: 0 - 23 and the special characters , - * / day_of_month (Required) Valid values: 1 - 31 and the special characters , - * / ? L W month (Required) Valid values: 1 - 12 , JAN - DEC , jan - dec , and the special characters , - * / day_of_week (Required) Valid values: 1 - 7 , SUN - SAT , sun - sat , and the special characters , - * / ? L # year (Optional) Valid values: 1970 - 2099 and the special characters , - * / Cron special characters * Selects every possible value for a field. For example, * in the hours field means \"every hour\". ? No specific value. Use when you don’t care what the value is. For example, if you want the schedule to trigger on a particular day of the month, but don’t care what day of the week that happens to be, you can specify ? in the day_of_week field. - A range of values (inclusive). Use to separate a minimum and maximum value. For example, if you want the schedule to trigger every hour between 9:00 a.m. and 5:00 p.m., you could specify 9-17 in the hours field. , Multiple values. Use to separate multiple values for a field.
For example, if you want the schedule to trigger every Tuesday and Thursday, you could specify TUE,THU in the day_of_week field. / Increment. Use to separate values when specifying a time increment. The first value represents the starting point, and the second value represents the interval. For example, if you want the schedule to trigger every 20 minutes starting at the top of the hour, you could specify 0/20 in the minutes field. Similarly, specifying 1/5 in day_of_month field will trigger every 5 days starting on the first day of the month. L Last. Use in the day_of_month field to mean the last day of the month— day 31 for January, day 28 for February in non-leap years, day 30 for April, and so on. Use alone in the day_of_week field in place of 7 or SAT , or after a particular day of the week to select the last day of that type in the month. For example 6L means the last Friday of the month. You can specify LW in the day_of_month field to specify the last weekday of the month. Avoid using the L option when specifying lists or ranges of values, as the results likely won’t be what you expect. W Weekday. Use to specify the weekday (Monday-Friday) nearest the given day. As an example, if you specify 15W in the day_of_month field and the 15th is a Saturday, the schedule will trigger on the 14th. If the 15th is a Sunday, the schedule will trigger on Monday the 16th. If the 15th is a Tuesday, the schedule will trigger on Tuesday the 15th. However if you specify 1W as the value for day_of_month , and the 1st is a Saturday, the schedule will trigger on Monday the 3rd— it won’t jump over the month boundary. You can specify LW in the day_of_month field to specify the last weekday of the month. You can only use the W option when the day_of_month is a single day— it is not valid when specifying a range or list of days. # Nth XXX day in a month. Use in the day_of_week field to specify the nth XXX day of the month. For example, if you specify 6#1 , the schedule will trigger on the first Friday of the month. Note that if you specify 3#5 and there are not 5 Tuesdays in a particular month, the schedule won’t trigger that month. Examples Setting daily triggers 0 5 9 * * ? Trigger at 9:05 a.m. UTC every day. 0 5 9 * * ? 2020 Trigger at 9:05 a.m. UTC every day during the year 2020. Restricting triggers to a range of days or times 0 5 9 ? * MON-FRI Trigger at 9:05 a.m. UTC Monday through Friday. 0 0-5 9 * * ? Trigger every minute starting at 9:00 a.m. UTC and ending at 9:05 a.m. UTC every day. Setting interval triggers 0 0/15 9 * * ? Trigger every 15 minutes starting at 9:00 a.m. UTC and ending at 9:45 a.m. UTC every day. 0 5 9 1/3 * ? Trigger at 9:05 a.m. UTC every 3 days every month, starting on the first day of the month. Setting schedules that trigger on a particular day 0 1 4 1 4 ? Trigger every April 1st at 4:01 a.m. UTC. 0 0,30 9 ? 4 WED Trigger at 9:00 a.m. UTC and at 9:30 a.m. UTC every Wednesday in the month of April. 0 5 9 15 * ? Trigger at 9:05 a.m. UTC on the 15th day of every month. 0 5 9 15W * ? Trigger at 9:05 a.m. UTC on the nearest weekday to the 15th of every month. 0 5 9 ? * 6#1 Trigger at 9:05 a.m. UTC on the first Friday of every month. Setting triggers using last 0 5 9 L * ? Trigger at 9:05 a.m. UTC on the last day of every month. 0 5 9 ? * 2L Trigger at 9:05 a.m. UTC on the last Monday of every month. 0 5 9 LW * ? Trigger at 9:05 a.m. UTC on the last weekday of every month. 
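As a quick, hedged illustration of the element order described above (a standalone Python sketch, not Elasticsearch tooling), the following labels the fields of one of the example expressions:
# Quartz-style cron fields in the order Elasticsearch expects; the year field is optional.
FIELDS = ['seconds', 'minutes', 'hours', 'day_of_month', 'month', 'day_of_week', 'year']
def label_cron(expression):
    return dict(zip(FIELDS, expression.split()))
# One of the examples above: trigger at 9:05 a.m. UTC Monday through Friday.
print(label_cron('0 5 9 ? * MON-FRI'))
# {'seconds': '0', 'minutes': '5', 'hours': '9', 'day_of_month': '?', 'month': '*', 'day_of_week': 'MON-FRI'}
For real validation, the elasticsearch-croneval command line tool mentioned above remains the authoritative check.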
Date math support in index and index alias names Date math name resolution lets you search a range of time series indices or index aliases rather than searching all of your indices and filtering the results. Limiting the number of searched indices reduces cluster load and improves search performance. For example, if you are searching for errors in your daily logs, you can use a date math name template to restrict the search to the past two days. Most APIs that accept an index or index alias argument support date math. A date math name takes the following form: <static_name{date_math_expr{date_format|time_zone}}> Where: static_name Static text date_math_expr Dynamic date math expression that computes the date dynamically date_format Optional format in which the computed date should be rendered. Defaults to yyyy.MM.dd . Format should be compatible with java-time https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html time_zone Optional time zone. Defaults to UTC . Note Pay attention to the usage of small vs capital letters used in the date_format . For example: mm denotes minute of hour, while MM denotes month of year. Similarly hh denotes the hour in the 1-12 range in combination with AM/PM , while HH denotes the hour in the 0-23 24-hour range. Date math expressions are resolved locale-independent. Consequently, it is not possible to use any other calendars than the Gregorian calendar. You must enclose date math names in angle brackets. If you use the name in a request path, special characters must be URI encoded. For example: # PUT /<my-index-{now/d}> PUT /%3Cmy-index-%7Bnow%2Fd%7D%3E Percent encoding of date math characters The special characters used for date rounding must be URI encoded as follows: < %3C > %3E / %2F { %7B } %7D | %7C + %2B : %3A , %2C The following example shows different forms of date math names and the final names they resolve to given the current time is 22nd March 2024 noon UTC. Expression Resolves to <logstash-{now/d}> logstash-2024.03.22 <logstash-{now/M}> logstash-2024.03.01 <logstash-{now/M{yyyy.MM}}> logstash-2024.03 <logstash-{now/M-1M{yyyy.MM}}> logstash-2024.02 <logstash-{now/d{yyyy.MM.dd|+12:00}}> logstash-2024.03.23 To use the characters { and } in the static part of a name template, escape them with a backslash \\ , for example: <elastic\\{ON\\}-{now/M}> resolves to elastic{ON}-2024.03.01 The following example shows a search request that searches the Logstash indices for the past three days, assuming the indices use the default Logstash index name format, logstash-YYYY.MM.dd . # GET /<logstash-{now/d-2d}>,<logstash-{now/d-1d}>,<logstash-{now/d}>/_search GET /%3Clogstash-%7Bnow%2Fd-2d%7D%3E%2C%3Clogstash-%7Bnow%2Fd-1d%7D%3E%2C%3Clogstash-%7Bnow%2Fd%7D%3E/_search { \"query\" : { \"match\": { \"test\": \"data\" } } } Multi-target syntax Most APIs that accept a data stream, index, or index alias request path parameter also support multi-target syntax . In multi-target syntax, you can use a comma-separated list to run a request on multiple resources, such as data streams, indices, or aliases: test1,test2,test3 . You can also use glob-like wildcard ( * ) expressions to target resources that match a pattern: test* or *test or te*t or *test* . You can exclude targets using the - character: test*,-test3 . Important Aliases are resolved after wildcard expressions. This can result in a request that targets an excluded alias. For example, if test3 is an index alias, the pattern test*,-test3 still targets the indices for test3 . To avoid this, exclude the concrete indices for the alias instead. You can also exclude clusters from a list of clusters to search using the - character: remote*:*,-remote1:*,-remote4:* will search all clusters with an alias that starts with \"remote\" except for \"remote1\" and \"remote4\".
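To make the list, wildcard, and exclusion syntax just described concrete, here is a hedged Python sketch (the index names, localhost URL, and unsecured cluster are assumptions):
import requests
# Search three explicit targets, then everything matching test* except test3;
# the comma-separated target list goes directly into the request path.
for target in ('test1,test2,test3', 'test*,-test3'):
    resp = requests.get('http://localhost:9200/' + target + '/_search')
    print(target, resp.status_code)
The note below applies the same - syntax to remote cluster patterns.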
Note that to exclude a cluster with this notation you must exclude all of its indexes. Excluding a subset of indexes on a remote cluster is currently not supported. For example, this will throw an exception: remote*:*,-remote1:logs* . Multi-target APIs that can target indices support the following query string parameters: ignore_unavailable (Optional, Boolean) If false , the request returns an error if it targets a missing or closed index. Defaults to false . allow_no_indices (Optional, Boolean) If false , the request returns an error if any wildcard expression, index alias , or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar . expand_wildcards (Optional, string) Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden . Valid values are: all Match any data stream or index, including hidden ones. open Match open, non-hidden indices. Also matches any non-hidden data stream. closed Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed. hidden Match hidden data streams and hidden indices. Must be combined with open , closed , or both. none Wildcard patterns are not accepted. The default settings for the above parameters depend on the API being used. Some multi-target APIs that can target indices also support the following query string parameter: ignore_throttled (Optional, Boolean) If true , concrete, expanded or aliased indices are ignored when frozen. Defaults to true . This parameter was deprecated in 7.16.0. Note APIs with a single target, such as the get document API , do not support multi-target syntax. Hidden data streams and indices For most APIs, wildcard expressions do not match hidden data streams and indices by default. To match hidden data streams and indices using a wildcard expression, you must specify the expand_wildcards query parameter. Alternatively, querying an index pattern starting with a dot, such as .watcher_hist* , will match hidden indices by default. This is intended to mirror Unix file-globbing behavior and provide a smoother transition path to hidden indices. You can create hidden data streams by setting data_stream.hidden to true in the stream’s matching index template . You can hide indices using the index.hidden index setting. The backing indices for data streams are hidden automatically. Some features, such as machine learning, store information in hidden indices. Global index templates that match all indices are not applied to hidden indices. System indices Elasticsearch modules and plugins can store configuration and state information in internal system indices . You should not directly access or modify system indices as they contain data essential to the operation of the system. Important Direct access to system indices is deprecated and will no longer be allowed in a future major version. To view system indices within a cluster: GET _cluster/state/metadata?filter_path=metadata.indices.*.system Warning When overwriting current cluster state, system indices should be restored as part of their feature state . Node specification Some cluster-level APIs may operate on a subset of the nodes which can be specified with node filters.
For example, task management , node stats , and node info APIs can all report results from a filtered set of nodes rather than from all nodes. Node filters are written as a comma-separated list of individual filters, each of which adds or removes nodes from the chosen subset. Each filter can be one of the following: _all , to add all nodes to the subset. _local , to add the local node to the subset. _master , to add the currently-elected master node to the subset. a node ID or name, to add this node to the subset. an IP address or hostname, to add all matching nodes to the subset. a pattern, using * wildcards, which adds all nodes to the subset whose name, address, or hostname matches the pattern. master:true , data:true , ingest:true , voting_only:true , ml:true , or coordinating_only:true , which respectively add to the subset all master-eligible nodes, all data nodes, all ingest nodes, all voting-only nodes, all machine learning nodes, and all coordinating-only nodes. master:false , data:false , ingest:false , voting_only:false , ml:false , or coordinating_only:false , which respectively remove from the subset all master-eligible nodes, all data nodes, all ingest nodes, all voting-only nodes, all machine learning nodes, and all coordinating-only nodes. a pair of patterns, using * wildcards, of the form attrname:attrvalue , which adds to the subset all nodes with a custom node attribute whose name and value match the respective patterns. Custom node attributes are configured by setting properties in the configuration file of the form node.attr.attrname: attrvalue . Node filters run in the order in which they are given, which is important if using filters that remove nodes from the set. For example, _all,master:false means all the nodes except the master-eligible ones. master:false,_all means the same as _all because the _all filter runs after the master:false filter. If no filters are given, the default is to select all nodes. If any filters are specified, they run starting with an empty chosen subset. This means that filters such as master:false which remove nodes from the chosen subset are only useful if they come after some other filters. When used on its own, master:false selects no nodes. Here are some examples of the use of node filters with some cluster APIs : # If no filters are given, the default is to select all nodes GET /_nodes # Explicitly select all nodes GET /_nodes/_all # Select just the local node GET /_nodes/_local # Select the elected master node GET /_nodes/_master # Select nodes by name, which can include wildcards GET /_nodes/node_name_goes_here GET /_nodes/node_name_goes_* # Select nodes by address, which can include wildcards GET /_nodes/10.0.0.3,10.0.0.4 GET /_nodes/10.0.0.* # Select nodes by role GET /_nodes/_all,master:false GET /_nodes/data:true,ingest:true GET /_nodes/coordinating_only:true GET /_nodes/master:true,voting_only:false # Select nodes by custom attribute # (for example, with something like `node.attr.rack: 2` in the configuration file) GET /_nodes/rack:2 GET /_nodes/ra*:2 GET /_nodes/ra*:2* Parameters Rest parameters (when using HTTP, map to HTTP URL parameters) follow the convention of using underscore casing. Request body in query string For libraries that don’t accept a request body for non-POST requests, you can pass the request body as the source query string parameter instead. 
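A minimal, hedged Python sketch of that technique (the index name, localhost URL, and unsecured cluster are assumptions); the body travels in the source query parameter and source_content_type declares its format, as the next sentence spells out:
import json
import requests
# Send a search without a request body: the body goes into the 'source' query
# parameter and 'source_content_type' tells Elasticsearch how to parse it.
body = json.dumps({'query': {'match_all': {}}})
resp = requests.get(
    'http://localhost:9200/my-index/_search',
    params={'source': body, 'source_content_type': 'application/json'},
)
print(resp.status_code)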
When using this method, the source_content_type parameter should also be passed with a media type value that indicates the format of the source, such as application/json . REST API version compatibility Major version upgrades often include a number of breaking changes that impact how you interact with Elasticsearch. While we recommend that you monitor the deprecation logs and update applications before upgrading Elasticsearch, having to coordinate the necessary changes can be an impediment to upgrading. You can enable an existing application to function without modification after an upgrade by including API compatibility headers, which tell Elasticsearch you are still using the previous version of the REST API. Using these headers allows the structure of requests and responses to remain the same; it does not guarantee the same behavior. You set version compatibility on a per-request basis in the Content-Type and Accept headers. Setting compatible-with to the same major version as the version you’re running has no impact, but ensures that the request will still work after Elasticsearch is upgraded. To tell Elasticsearch 8.0 you are using the 7.x request and response format, set compatible-with=7 : Content-Type: application/vnd.elasticsearch+json; compatible-with=7 Accept: application/vnd.elasticsearch+json; compatible-with=7 HTTP 429 Too Many Requests status code push back Elasticsearch APIs may respond with the HTTP 429 Too Many Requests status code, indicating that the cluster is too busy to handle the request. When this happens, consider retrying after a short delay. If the retry also receives a 429 Too Many Requests response, extend the delay by backing off exponentially before each subsequent retry. URL-based access control Many users use a proxy with URL-based access control to secure access to Elasticsearch data streams and indices. For multi-search , multi-get , and bulk requests, the user has the choice of specifying a data stream or index in the URL and on each individual request within the request body. This can make URL-based access control challenging. To prevent the user from overriding the data stream or index specified in the URL, set rest.action.multi.allow_explicit_index to false in elasticsearch.yml . This causes Elasticsearch to reject requests that explicitly specify a data stream or index in the request body. Boolean Values All REST API parameters (both request parameters and JSON body) support providing boolean \"false\" as the value false and boolean \"true\" as the value true . All other values will raise an error. Number Values When passing a numeric parameter in a request body, you may use a string containing the number instead of the native numeric type. For example: POST /_search { \"size\": \"1000\" } Integer-valued fields in a response body are described as integer (or occasionally long ) in this manual, but there are generally no explicit bounds on such values. JSON, SMILE, CBOR and YAML all permit arbitrarily large integer values. Do not assume that integer fields in a response body will always fit into a 32-bit signed integer. Byte size units Whenever the byte size of data needs to be specified, e.g. when setting a buffer size parameter, the value must specify the unit, like 10kb for 10 kilobytes. Note that these units use powers of 1024, so 1kb means 1024 bytes. 
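As an illustrative sketch (the setting shown is only an example of a byte-size value), a cluster setting that takes a byte size might be updated like this: PUT _cluster/settings { \"persistent\": { \"indices.recovery.max_bytes_per_sec\": \"50mb\" } } , where 50mb means 50 × 1024 × 1024 bytes.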
The supported units are: b Bytes kb Kilobytes mb Megabytes gb Gigabytes tb Terabytes pb Petabytes Distance Units Wherever distances need to be specified, such as the distance parameter in the Geo-distance query, the default unit is meters if none is specified. Distances can be specified in other units, such as \"1km\" or \"2mi\" (2 miles). The full list of units is listed below: Mile mi or miles Yard yd or yards Feet ft or feet Inch in or inch Kilometer km or kilometers Meter m or meters Centimeter cm or centimeters Millimeter mm or millimeters Nautical mile NM , nmi , or nauticalmiles Time units Whenever durations need to be specified, e.g. for a timeout parameter, the duration must specify the unit, like 2d for 2 days. The supported units are: d Days h Hours m Minutes s Seconds ms Milliseconds micros Microseconds nanos Nanoseconds Unit-less quantities Unit-less quantities are quantities that don’t have a \"unit\" like \"bytes\" or \"Hertz\" or \"meter\" or \"long tonne\". If one of these quantities is large we’ll print it out like 10m for 10,000,000 or 7k for 7,000. We’ll still print 87 when we mean 87 though. These are the supported multipliers: k Kilo m Mega g Giga t Tera p Peta", + "title": "Elasticsearch API conventions | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/rest-apis/api-conventions", + "meta_description": "The Elasticsearch REST APIs are exposed over HTTP. Except where noted, the following conventions apply across all APIs. The type of the content sent in..."
+ }, + { + "text": "
cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / SQL / SQL REST API / Run an async SQL search Elastic Stack Serverless By default, SQL searches are synchronous. They wait for complete results before returning a response. However, results can take longer for searches across large data sets or frozen data . To avoid long waits, run an async SQL search. Set wait_for_completion_timeout to a duration you’d like to wait for synchronous results. POST _sql?format=json { \"wait_for_completion_timeout\": \"2s\", \"query\": \"SELECT * FROM library ORDER BY page_count DESC\", \"fetch_size\": 5 } If the search doesn’t finish within this period, the search becomes async. The API returns: An id for the search. An is_partial value of true , indicating the search results are incomplete. An is_running value of true , indicating the search is still running in the background. For CSV, TSV, and TXT responses, the API returns these values in the respective Async-ID , Async-partial , and Async-running HTTP headers instead. { \"id\": \"FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=\", \"is_partial\": true, \"is_running\": true, \"rows\": [ ] } To check the progress of an async search, use the search ID with the get async SQL search status API . GET _sql/async/status/FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU= If is_running and is_partial are false , the async search has finished with complete results. { \"id\": \"FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=\", \"is_running\": false, \"is_partial\": false, \"expiration_time_in_millis\": 1611690295000, \"completion_status\": 200 } To get the results, use the search ID with the get async SQL search API . If the search is still running, specify how long you’d like to wait using wait_for_completion_timeout . You can also specify the response format . GET _sql/async/FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=?wait_for_completion_timeout=2s&format=json Change the search retention period By default, Elasticsearch stores async SQL searches for five days. After this period, Elasticsearch deletes the search and its results, even if the search is still running. To change this retention period, use the keep_alive parameter. POST _sql?format=json { \"keep_alive\": \"2d\", \"wait_for_completion_timeout\": \"2s\", \"query\": \"SELECT * FROM library ORDER BY page_count DESC\", \"fetch_size\": 5 } You can use the get async SQL search API’s keep_alive parameter to later change the retention period. The new period starts after the request runs. GET _sql/async/FmdMX2pIang3UWhLRU5QS0lqdlppYncaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQToxOTI=?keep_alive=5d&wait_for_completion_timeout=2s&format=json Use the delete async SQL search API to delete an async search before the keep_alive period ends. If the search is still running, Elasticsearch cancels it. DELETE _sql/async/delete/FmdMX2pIang3UWhLRU5QS0lqdlppYncaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQToxOTI= Store synchronous SQL searches By default, Elasticsearch only stores async SQL searches. To save a synchronous search, specify wait_for_completion_timeout and set keep_on_completion to true . POST _sql?format=json { \"keep_on_completion\": true, \"wait_for_completion_timeout\": \"2s\", \"query\": \"SELECT * FROM library ORDER BY page_count DESC\", \"fetch_size\": 5 } If is_partial and is_running are false , the search was synchronous and returned complete results. 
{ \"id\": \"Fnc5UllQdUVWU0NxRFNMbWxNYXplaFEaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQTo0NzA=\", \"is_partial\": false, \"is_running\": false, \"rows\": ..., \"columns\": ..., \"cursor\": ... } You can get the same results later using the search ID with the get async SQL search API . Saved synchronous searches are still subject to the keep_alive retention period. When this period ends, Elasticsearch deletes the search results. You can also delete saved searches using the delete async SQL search API . Previous Use runtime fields Next SQL Translate API Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Change the search retention period Store synchronous SQL searches Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .", + "title": "Run an async SQL search | Elastic Docs", + "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-async", + "meta_description": "By default, SQL searches are synchronous. They wait for complete results before returning a response. However, results can take longer for searches across..." + }, + { + "text": "Docs Release notes Troubleshoot Reference Explore and analyze Get started Solutions and use cases Manage data Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Querying and filtering Query languages Query DSL ES|QL Get started Interfaces ES|QL _query API Kibana Elastic Security Query multiple sources Query multiple indices Query across clusters Examples Task management SQL Overview Getting started Conventions Security SQL REST API Overview Response Data Formats Paginating through a large response Filtering using Elasticsearch Query DSL Columnar results Passing parameters to a query Use runtime fields Run an async SQL search SQL Translate API SQL CLI SQL JDBC API usage SQL ODBC Driver installation Configuration SQL Client Applications DBeaver DbVisualizer Microsoft Excel Microsoft Power BI Desktop Microsoft PowerShell MicroStrategy Desktop Qlik Sense Desktop SQuirreL SQL SQL Workbench/J Tableau Desktop Tableau Server SQL Language Lexical Structure SQL Commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data Types Index patterns Frozen Indices Functions and Operators Comparison Operators Logical Operators Math Operators Cast Operators LIKE and RLIKE Operators Aggregate Functions Grouping Functions Date/Time and Interval Functions and Operators Full-Text Search Functions Mathematical Functions String Functions Type Conversion Functions Geo Functions Conditional Functions And Expressions System Functions Reserved keywords SQL Limitations EQL Example: Detect threats with EQL KQL Lucene query syntax Query tools Saved queries Console Search profiler Grok debugger Playground Aggregations Basics Filtering in Kibana Geospatial analysis Transforming data Overview Setup When to use transforms Generating alerts for transforms Transforms at scale How checkpoints work API quick reference Tutorial: Transforming the eCommerce sample data Examples 
Elastic Stack Serverless You can use Elasticsearch as a basic document store to retrieve documents and their metadata. However, the real power of Elasticsearch comes from its advanced search and analytics capabilities. Elasticsearch makes JSON documents searchable and aggregatable. The documents are stored in an index or data stream , which represent one type of data. Searchable means that you can filter the documents for conditions. For example, you can filter for data \"within the last 7 days\" or data that \"contains the word Kibana\". Kibana provides many ways for you to construct filters, which are also called queries or search terms. Aggregatable means that you can extract summaries from matching documents. The simplest aggregation is count , and it is frequently used in combination with the date histogram , to see count over time. The terms aggregation shows the most frequent values. Querying You’ll use a combination of an API endpoint and a query language to interact with your data. Elasticsearch provides a number of query languages . From Query DSL to the newest ES|QL, find the one that's most appropriate for you. You can call Elasticsearch's REST APIs by submitting requests directly from the command line or through the Dev Tools Console in Kibana. From your applications, you can use a client in your programming language of choice. A number of tools are available for you to save, debug, and optimize your queries. If you're just getting started with Elasticsearch, try the hands-on API quickstart to learn how to add data and run basic searches using Query DSL and the _search endpoint. Filtering When querying your data in Kibana, additional options let you filter the results to just the subset you need. Some of these options are common to most Elastic apps.
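As a rough sketch of both ideas (my-logs and the field names are hypothetical): GET /my-logs/_search { \"query\": { \"bool\": { \"must\": { \"match\": { \"message\": \"kibana\" } }, \"filter\": { \"range\": { \"@timestamp\": { \"gte\": \"now-7d/d\" } } } } } } — the match clause finds documents that contain the word Kibana, while the range filter keeps only the last 7 days.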
Check Filtering in Kibana for more details on how to recognize and use them in the UI.", + "title": "Querying and filtering | Elastic Docs", + "url": "https://www.elastic.co/docs/explore-analyze/query-filter", + "meta_description": "You can use Elasticsearch as a basic document store to retrieve documents and their metadata. However, the real power of Elasticsearch comes from its..." + }, + { + "text": "
reports Troubleshooting CSV PDF/PNG Alerts and cases Alerts Getting started with alerts Set up Create and manage rules View alerts Rule types Index threshold Elasticsearch query Tracking containment Rule action variables Notifications domain allowlist Troubleshooting and limitations Common Issues Event log index Test connectors Maintenance windows Watcher Getting started with Watcher How Watcher works Enable Watcher Watcher UI Encrypting sensitive data in Watcher Inputs Simple input Search input HTTP input Chain input Triggers Schedule trigger Throttling Schedule Types Conditions Always condition Never condition Compare condition Array compare condition Script condition Actions Running an action for each element in an array Adding conditions to actions Email action Webhook action Index action Logging action Slack action PagerDuty action Jira action Transforms Search payload transform Script payload transform Chain payload transform Managing watches Example watches Watching the status of an Elasticsearch cluster Limitations Cases Configure access to cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / Query languages / SQL / Getting Started with SQL Elastic Stack Serverless To start using Elasticsearch SQL, create an index with some data to experiment with: PUT /library/_bulk?refresh {\"index\":{\"_id\": \"Leviathan Wakes\"}} {\"name\": \"Leviathan Wakes\", \"author\": \"James S.A. Corey\", \"release_date\": \"2011-06-02\", \"page_count\": 561} {\"index\":{\"_id\": \"Hyperion\"}} {\"name\": \"Hyperion\", \"author\": \"Dan Simmons\", \"release_date\": \"1989-05-26\", \"page_count\": 482} {\"index\":{\"_id\": \"Dune\"}} {\"name\": \"Dune\", \"author\": \"Frank Herbert\", \"release_date\": \"1965-06-01\", \"page_count\": 604} And now you can execute SQL using the SQL search API : POST /_sql?format=txt { \"query\": \"SELECT * FROM library WHERE release_date < '2000-01-01'\" } Which should return something along the lines of: author | name | page_count | release_date ---------------+---------------+---------------+------------------------ Dan Simmons |Hyperion |482 |1989-05-26T00:00:00.000Z Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z You can also use the SQL CLI . There is a script to start it shipped in the Elasticsearch bin directory: $ ./bin/elasticsearch-sql-cli From there you can run the same query: sql> SELECT * FROM library WHERE release_date < '2000-01-01'; author | name | page_count | release_date ---------------+---------------+---------------+------------------------ Dan Simmons |Hyperion |482 |1989-05-26T00:00:00.000Z Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z Previous Overview Next Conventions Current version Current version ✓ Previous version (8.18) Edit this page Report an issue Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. 
To view previous versions, go to elastic.co/guide .", + "title": "Getting Started with SQL | Elastic Docs", + "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-getting-started", + "meta_description": "To start using Elasticsearch SQL, create an index with some data to experiment with: And now you can execute SQL using the SQL search API: Which should..." + }, + { + "text": "
for each element in an array Adding conditions to actions Email action Webhook action Index action Logging action Slack action PagerDuty action Jira action Transforms Search payload transform Script payload transform Chain payload transform Managing watches Example watches Watching the status of an Elasticsearch cluster Limitations Cases Configure access to cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / SQL / SQL REST API / Overview Elastic Stack Serverless The SQL search API accepts SQL in a JSON document, executes it, and returns the results. For example: POST /_sql?format=txt { \"query\": \"SELECT * FROM library ORDER BY page_count DESC LIMIT 5\" } Which returns: author | name | page_count | release_date -----------------+--------------------+---------------+------------------------ Peter F. Hamilton|Pandora's Star |768 |2004-03-02T00:00:00.000Z Vernor Vinge |A Fire Upon the Deep|613 |1992-06-01T00:00:00.000Z Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z Alastair Reynolds|Revelation Space |585 |2000-03-15T00:00:00.000Z James S.A. Corey |Leviathan Wakes |561 |2011-06-02T00:00:00.000Z Using Kibana Console If you are using Kibana Console (which is highly recommended), take advantage of the triple quotes \"\"\" when creating the query. This not only automatically escapes double quotes ( \" ) inside the query string but also support multi-line as shown below: Previous SQL REST API Next Response Data Formats Current version Current version ✓ Previous version (8.18) Edit this page Report an issue Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .", + "title": "Overview | Elastic Docs", + "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-rest-overview", + "meta_description": "The SQL search API accepts SQL in a JSON document, executes it, and returns the results. 
For example: Which returns: " + }, + { + "text": "
cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / Query languages / SQL / SQL Limitations Elastic Stack Serverless Large queries may throw ParsingException Extremely large queries can consume too much memory during the parsing phase, in which case the Elasticsearch SQL engine will abort parsing and throw an error. In such cases, consider reducing the query to a smaller size by potentially simplifying it or splitting it into smaller queries. Nested fields in SYS COLUMNS and DESCRIBE TABLE Elasticsearch has a special type of relationship fields called nested fields. In Elasticsearch SQL they can be used by referencing their inner sub-fields. Even though SYS COLUMNS in non-driver mode (in the CLI and in REST calls) and DESCRIBE TABLE will still display them as having the type NESTED , they cannot be used in a query. One can only reference its sub-fields in the form: [nested_field_name].[sub_field_name] For example: SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages; Scalar functions on nested fields are not allowed in WHERE and ORDER BY clauses Elasticsearch SQL doesn’t support the usage of scalar functions on top of nested fields in WHERE and ORDER BY clauses with the exception of comparison and logical operators. For example: SELECT * FROM test_emp WHERE LENGTH(dep.dep_name.keyword) > 5; and SELECT * FROM test_emp ORDER BY YEAR(dep.start_date); are not supported but: SELECT * FROM test_emp WHERE dep.start_date >= CAST('2020-01-01' AS DATE) OR dep.dep_end_date IS NULL; is supported. Multi-nested fields Elasticsearch SQL doesn’t support multi-nested documents, so a query cannot reference more than one nested field in an index. This applies to multi-level nested fields, but also multiple nested fields defined on the same level. For example, for this index: column | type | mapping ----------------------+---------------+------------- nested_A |STRUCT |NESTED nested_A.nested_X |STRUCT |NESTED nested_A.nested_X.text|VARCHAR |KEYWORD nested_A.text |VARCHAR |KEYWORD nested_B |STRUCT |NESTED nested_B.text |VARCHAR |KEYWORD nested_A and nested_B cannot be used at the same time, nor nested_A / nested_B and nested_A.nested_X combination. For such situations, Elasticsearch SQL will display an error message. Paginating nested inner hits When SELECTing a nested field, pagination will not work as expected, Elasticsearch SQL will return at least the page size records. This is because of the way nested queries work in Elasticsearch: the root nested field will be returned and it’s matching inner nested fields as well, pagination taking place on the root nested document and not on its inner hits . Normalized keyword fields keyword fields in Elasticsearch can be normalized by defining a normalizer . Such fields are not supported in Elasticsearch SQL. Array type of fields Array fields are not supported due to the \"invisible\" way in which Elasticsearch handles an array of values: the mapping doesn’t indicate whether a field is an array (has multiple values) or not, so without reading all the data, Elasticsearch SQL cannot know whether a field is a single or multi value. When multiple values are returned for a field, by default, Elasticsearch SQL will throw an exception. However, it is possible to change this behavior through field_multi_value_leniency parameter in REST (disabled by default) or field.multi.value.leniency in drivers (enabled by default). 
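For example (an illustrative sketch reusing the test_emp index from the examples above): POST /_sql?format=txt { \"field_multi_value_leniency\": true, \"query\": \"SELECT languages FROM test_emp\" } returns a single value from each multi-valued field instead of throwing an exception.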
Sorting by aggregation When doing aggregations ( GROUP BY ) Elasticsearch SQL relies on Elasticsearch's composite aggregation for its support for paginating results. However, this type of aggregation does come with a limitation: sorting can only be applied on the key used for the aggregation's buckets. Elasticsearch SQL overcomes this limitation by doing client-side sorting; however, as a safety measure, it allows only up to 65535 rows. It is recommended to use LIMIT for queries that use sorting by aggregation, essentially indicating the top N results that are desired: SELECT * FROM test GROUP BY age ORDER BY COUNT(*) LIMIT 100; It is possible to run the same queries without a LIMIT ; however, in that case, if the maximum size ( 10000 ) is exceeded, an exception will be returned as Elasticsearch SQL is unable to track (and sort) all the results returned. Moreover, the aggregation(s) used in the ORDER BY must be only plain aggregate functions. No scalar functions or operators can be used, and therefore no complex columns that combine two or more aggregate functions can be used for ordering. Here are some examples of queries that are not allowed : SELECT age, ROUND(AVG(salary)) AS avg FROM test GROUP BY age ORDER BY avg; SELECT age, MAX(salary) - MIN(salary) AS diff FROM test GROUP BY age ORDER BY diff; Using a sub-select Using sub-selects ( SELECT X FROM (SELECT Y) ) is supported to a small degree : any sub-select that can be \"flattened\" into a single SELECT is possible with Elasticsearch SQL. For example: SELECT * FROM (SELECT first_name, last_name FROM emp WHERE last_name NOT LIKE '%a%') WHERE first_name LIKE 'A%' ORDER BY 1; first_name | last_name ---------------+--------------- Alejandro |McAlpine Anneke |Preusig Anoosh |Peyn Arumugam |Ossenbruggen The query above is possible because it is equivalent to: SELECT first_name, last_name FROM emp WHERE last_name NOT LIKE '%a%' AND first_name LIKE 'A%' ORDER BY 1; But if the sub-select includes a GROUP BY or HAVING , or the enclosing SELECT is more complex than SELECT X FROM (SELECT ...) WHERE [simple_condition] , this is currently unsupported . Using FIRST / LAST aggregation functions in HAVING clause Using FIRST and LAST in the HAVING clause is not supported. The same applies to MIN and MAX when their target column is of type keyword or unsigned_long , as they are internally translated to FIRST and LAST . Using TIME data type in GROUP BY or HISTOGRAM Using the TIME data type as a grouping key is currently not supported. For example: SELECT count(*) FROM test GROUP BY CAST(date_created AS TIME); On the other hand, it can still be used if it's wrapped with a scalar function that returns another data type, for example: SELECT count(*) FROM test GROUP BY MINUTE(CAST(date_created AS TIME)); The TIME data type is also currently not supported in the histogram grouping function. For example: SELECT HISTOGRAM(CAST(birth_date AS TIME), INTERVAL '10' MINUTES) as h, COUNT(*) FROM t GROUP BY h Geo-related functions Since geo_shape fields don't have doc values, these fields cannot be used for filtering, grouping or sorting. By default, geo_point fields are indexed and have doc values. However, only latitude and longitude are stored and indexed with some loss of precision from the original values (4.190951585769653E-8 for the latitude and 8.381903171539307E-8 for longitude). The altitude component is accepted but not stored in doc values nor indexed. Therefore calling the ST_Z function in filtering, grouping or sorting will return null . 
Retrieving using the fields search parameter Elasticsearch SQL retrieves column values using the search API's fields parameter . Any limitations on the fields parameter also apply to Elasticsearch SQL queries. For example, if _source is disabled for any of the returned fields or at index level, the values cannot be retrieved. Aggregations in the PIVOT clause The aggregation expression in PIVOT will currently accept only one aggregation. It is thus not possible to obtain multiple aggregations for any one pivoted column. Using a subquery in PIVOT 's IN -subclause The values that the PIVOT query could pivot must be provided in the query as a list of literals; providing a subquery instead to build this list is not currently supported. For example, in this query: SELECT * FROM test_emp PIVOT (SUM(salary) FOR languages IN (1, 2)) the languages of interest must be listed explicitly: IN (1, 2) . On the other hand, this example would not work : SELECT * FROM test_emp PIVOT (SUM(salary) FOR languages IN (SELECT languages FROM test_emp WHERE languages <=2 GROUP BY languages))", "title": "SQL Limitations | Elastic Docs", "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-limitations", "meta_description": "Extremely large queries can consume too much memory during the parsing phase, in which case the Elasticsearch SQL engine will abort parsing and throw..." 
+ }, + { + "text": "Create a follower index to replicate a specific index ECE ECK Elastic Cloud Hosted Self Managed When you create a follower index, you reference the remote cluster and the leader index in your remote cluster. To create a follower index from Stack Management in Kibana: Select Cross-Cluster Replication in the side navigation and choose the Follower Indices tab. Choose the cluster (ClusterA) containing the leader index you want to replicate. Enter the name of the leader index, which is kibana_sample_data_ecommerce if you are following the tutorial. Enter a name for your follower index, such as follower-kibana-sample-data . Elasticsearch initializes the follower using the remote recovery process, which transfers the existing Lucene segment files from the leader index to the follower index. The index status changes to Paused . When the remote recovery process is complete, the index following begins and the status changes to Active . When you index documents into your leader index, Elasticsearch replicates the documents in the follower index. API example You can also use the create follower API to create follower indices. When you create a follower index, you must reference the remote cluster and the leader index that you created in the remote cluster. 
When initiating the follower request, the response returns before the remote recovery process completes. To wait for the process to complete, add the wait_for_active_shards parameter to your request. PUT /server-metrics-follower/_ccr/follow?wait_for_active_shards=1 { \"remote_cluster\" : \"leader\", \"leader_index\" : \"server-metrics\" } Use the get follower stats API to inspect the status of replication.", "title": "Create a follower index to replicate a specific index | Elastic Docs", "url": "https://www.elastic.co/docs/deploy-manage/tools/cross-cluster-replication/ccr-getting-started-follower-index", "meta_description": "When you create a follower index, you reference the remote cluster and the leader index in your remote cluster. To create a follower index from Stack..." + }, + { + "text": "
Enabling cipher suites for stronger encryption Self Managed The TLS and SSL protocols use a cipher suite that determines the strength of encryption used to protect the data. You may want to increase the strength of encryption used when using an Oracle JVM; the IcedTea OpenJDK ships without these restrictions in place. This step is not required to successfully use encrypted communication. The Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files enable the use of additional cipher suites for Java in a separate JAR file that you need to add to your Java installation. You can download this JAR file from Oracle's download page . The JCE Unlimited Strength Jurisdiction Policy Files are required for encryption with key lengths greater than 128 bits, such as 256-bit AES encryption. After installation, all cipher suites in the JCE are available for use but require configuration before they can be used. To enable the use of stronger cipher suites with Elasticsearch security features, configure the cipher_suites parameter . Note The JCE Unlimited Strength Jurisdiction Policy Files must be installed on all nodes in the cluster to establish an improved level of encryption strength.", "title": "Enabling cipher suites for stronger encryption | Elastic Docs", "url": "https://www.elastic.co/docs/deploy-manage/security/enabling-cipher-suites-for-stronger-encryption", "meta_description": "The TLS and SSL protocols use a cipher suite that determines the strength of encryption used to protect the data. You may want to increase the strength..." 
+ }, + { + "text": "
Conventions and Terminology Elastic Stack Serverless For clarity, it is important to establish the meaning behind certain words, as the same wording might convey different meanings to different readers depending on one's familiarity with SQL versus Elasticsearch. Note This documentation, while trying to be complete, does assume the reader has a basic understanding of Elasticsearch and/or SQL. If that is not the case, continue reading the documentation; however, take notes and pursue the topics that are unclear either through the main Elasticsearch documentation or through the plethora of SQL material available in the open (there are simply too many excellent resources here to enumerate). As a general rule, Elasticsearch SQL, as the name indicates, provides a SQL interface to Elasticsearch. As such, it follows the SQL terminology and conventions first, whenever possible. However, the backing engine itself is Elasticsearch, for which Elasticsearch SQL was purposely created; hence features or concepts that are not available, or cannot be mapped correctly, in SQL appear in Elasticsearch SQL. Last but not least, Elasticsearch SQL tries to obey the principle of least surprise , though as with all things, everything is relative. Mapping concepts across SQL and Elasticsearch While SQL and Elasticsearch have different terms for the way the data is organized (and different semantics), essentially their purpose is the same. So let's start from the bottom; these roughly are: SQL Elasticsearch Description column field In both cases, at the lowest level, data is stored in named entries, of a variety of data types , containing one value. SQL calls such an entry a column while Elasticsearch calls it a field . Notice that in Elasticsearch a field can contain multiple values of the same type (essentially a list) while in SQL, a column can contain exactly one value of said type. Elasticsearch SQL will do its best to preserve the SQL semantics and, depending on the query, reject those that return fields with more than one value. row document Column s and field s do not exist by themselves; they are part of a row or a document . The two have slightly different semantics: a row tends to be strict (and have more enforcements) while a document tends to be a bit more flexible or loose (while still having a structure). table index The target against which queries, whether in SQL or Elasticsearch, get executed. schema implicit In RDBMS, schema is mainly a namespace of tables and typically used as a security boundary. Elasticsearch does not provide an equivalent concept for it. However, when security is enabled, Elasticsearch automatically applies the security enforcement so that a role sees only the data it is allowed to (in SQL jargon, its schema ). catalog or database cluster instance In SQL, catalog or database are used interchangeably and represent a set of schemas, that is, a number of tables. In Elasticsearch the set of available indices is grouped in a cluster . The semantics also differ a bit; a database is essentially yet another namespace (which can have some implications on the way data is stored) while an Elasticsearch cluster is a runtime instance, or rather a set of at least one Elasticsearch instance (typically running distributed). In practice this means that while in SQL one can potentially have multiple catalogs inside an instance, in Elasticsearch one is restricted to only one . 
cluster cluster (federated) Traditionally in SQL, cluster refers to a single RDBMS instance which contains a number of catalog s or database s (see above). The same word can be reused inside Elasticsearch as well; however, its semantics are clarified a bit. While RDBMS tend to have only one running instance, on a single machine ( not distributed), Elasticsearch goes the opposite way and, by default, is distributed and multi-instance. Furthermore, an Elasticsearch cluster can be connected to other cluster s in a federated fashion; thus cluster means: single cluster - multiple Elasticsearch instances, typically distributed across machines, running within the same namespace; multiple clusters - multiple clusters, each with its own namespace, connected to each other in a federated setup (see Cross-cluster search ). As one can see, while the mapping between the concepts is not exactly one to one and the semantics are somewhat different, there are more things in common than differences. In fact, thanks to SQL's declarative nature, many concepts can move across Elasticsearch transparently and the terminology of the two is likely to be used interchangeably throughout the rest of the material.", "title": "Conventions and Terminology | Elastic Docs", "url": "https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-concepts", "meta_description": "For clarity, it is important to establish the meaning behind certain words as, the same wording might convey different meanings to different readers depending..." 
+ }, + { + "text": "Docs Release notes Troubleshoot Reference Reference Get started Solutions and use cases Manage data Explore and analyze Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Security Fields and object schemas Elastic Security ECS field reference Timeline schema Alert schema Endpoint command reference Detection Rules Overview Observability Fields and object schemas Elasticsearch and index management Configuration Circuit breaker settings Auditing settings Enrich settings Cluster-level shard allocation and routing settings Miscellaneous cluster settings Cross-cluster replication settings Discovery and cluster formation settings Field data cache settings Health Diagnostic settings Index lifecycle management settings Data stream lifecycle settings Index management settings Index recovery settings Indexing buffer settings License settings Local gateway Machine learning settings Inference settings Monitoring settings Node settings Networking settings Node query cache settings Search settings Security settings Shard request cache Snapshot and restore settings Transforms settings Thread pool settings Watcher settings JVM settings Roles Elasticsearch privileges Index settings Data tier allocation General History retention Index block Index recovery prioritization Indexing pressure Mapping limit Merge Path Shard allocation Total shards per node Similarity Slow log Sorting Use index sorting to speed up conjunctions Store Preloading data into the file system cache Time series Translog Index lifecycle actions Allocate Delete Force merge Migrate Read only Rollover Downsample Searchable snapshot Set priority Shrink Unfollow Wait for snapshot REST APIs API conventions Common options Compatibility API examples The refresh parameter Optimistic concurrency control Sort search results Paginate search results Retrieve selected fields Search multiple data streams and indices Collapse search results Filter search results Highlighting Retrieve inner hits Search shard routing Searching with query rules Reciprocal rank fusion Retrievers Reindex data stream Create index from source The shard request cache Suggesters Profile search requests Ranking evaluation Mapping Document metadata fields _doc_count field _field_names field _ignored field _id field _index field _meta field _routing field _source field _tier field Field data types Aggregate metric Alias Arrays Binary Boolean Completion Date Date nanoseconds Dense vector Flattened Geopoint Geoshape Histogram IP Join Keyword Nested Numeric Object Pass-through object Percolator Point Range Rank feature Rank features Rank Vectors Search-as-you-type Semantic text Shape Sparse vector Text Token count Unsigned long Version Mapping parameters analyzer coerce copy_to doc_values dynamic eager_global_ordinals enabled format ignore_above index.mapping.ignore_above ignore_malformed index index_options index_phrases index_prefixes meta fields normalizer norms null_value position_increment_gap properties search_analyzer similarity store subobjects term_vector Elasticsearch audit events Command line tools elasticsearch-certgen elasticsearch-certutil elasticsearch-create-enrollment-token elasticsearch-croneval elasticsearch-keystore elasticsearch-node elasticsearch-reconfigure-node elasticsearch-reset-password elasticsearch-saml-metadata elasticsearch-service-tokens elasticsearch-setup-passwords elasticsearch-shard elasticsearch-syskeygen elasticsearch-users Curator Curator and index lifecycle management ILM 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin 
Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
elasticsearch-users If you use file-based user authentication, the elasticsearch-users command enables you to add and remove users, assign user roles, and manage passwords per node. Synopsis bin/elasticsearch-users ([useradd <username>] [-p <password>] [-r <roles>]) | ([list] <username>) | ([passwd <username>] [-p <password>]) | ([roles <username>] [-a <roles>] [-r <roles>]) | ([userdel <username>]) Description If you use the built-in file internal realm, users are defined in local files on each node in the cluster. Usernames and roles must be at least 1 and no more than 1024 characters. They can contain alphanumeric characters ( a-z , A-Z , 0-9 ), spaces, punctuation, and printable symbols in the Basic Latin (ASCII) block . Leading or trailing whitespace is not allowed. Passwords must be at least 6 characters long. For more information, see File-based user authentication . Tip To ensure that Elasticsearch can read the user and role information at startup, run elasticsearch-users useradd as the same user you use to run Elasticsearch.
Running the command as root or some other user updates the permissions for the users and users_roles files and prevents Elasticsearch from accessing them. Parameters -a If used with the roles parameter, adds a comma-separated list of roles to a user. list List the users that are registered with the file realm on the local node. If you also specify a user name, the command provides information for that user. -p Specifies the user’s password. If you do not specify this parameter, the command prompts you for the password. Tip Omit the -p option to keep plaintext passwords out of the terminal session’s command history. passwd Resets a user’s password. You can specify the new password directly with the -p parameter. -r If used with the useradd parameter, defines a user’s roles. This option accepts a comma-separated list of role names to assign to the user. If used with the roles parameter, removes a comma-separated list of roles from a user. roles Manages the roles of a particular user. You can combine adding and removing roles within the same command to change a user’s roles. useradd Adds a user to your local node. userdel Deletes a user from your local node. Examples The following example adds a new user named jacknich to the file realm. The password for this user is theshining , and this user is associated with the network and monitoring roles. bin/elasticsearch-users useradd jacknich -p theshining -r network,monitoring The following example lists the users that are registered with the file realm on the local node: bin/elasticsearch-users list rdeniro : admin alpacino : power_user jacknich : monitoring,network Users are in the left-hand column and their corresponding roles are listed in the right-hand column. The following example resets the jacknich user’s password: bin/elasticsearch-users passwd jacknich Since the -p parameter was omitted, the command prompts you to enter and confirm a password in interactive mode. The following example removes the network and monitoring roles from the jacknich user and adds the user role: bin/elasticsearch-users roles jacknich -r network,monitoring -a user The following example deletes the jacknich user: bin/elasticsearch-users userdel jacknich", + "title": "elasticsearch-users | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/users-command", + "meta_description": "If you use file-based user authentication, the elasticsearch-users command enables you to add and remove users, assign user roles, and manage passwords..."
+ }, + { + "text": "
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
applications NuGet packages Entity Framework Core Entity Framework 6 Elasticsearch gRPC SqlClient StackExchange.Redis Azure Cosmos DB Azure Service Bus Azure Storage MongoDB Supported technologies Configuration Configuration on ASP.NET Core Configuration for Windows Services Configuration on ASP.NET Core configuration options Reporter configuration options HTTP configuration options Messaging configuration options Stacktrace configuration options Supportability configuration options All options summary Public API OpenTelemetry bridge Metrics Logs Serilog NLog Manual log correlation Performance tuning Upgrading APM Go agent Set up the APM Go Agent Built-in instrumentation modules Custom instrumentation Context propagation Supported technologies Configuration API documentation Metrics Logs Log correlation OpenTelemetry API OpenTracing API Contributing Upgrading APM iOS agent Supported technologies Set up the APM iOS Agent Configuration Instrumentation APM Java agent Set up the APM Java Agent Manual setup with -javaagent flag Automatic setup with apm-agent-attach-cli.jar Programmatic API setup to self-attach SSL/TLS communication with APM Server Monitoring AWS Lambda Java Functions Supported technologies Configuration Circuit-Breaker Core Datastore HTTP Huge Traces JAX-RS JMX Logging Messaging Metrics Profiling Reporter Serverless Stacktrace Property file reference Tracing APIs Public API OpenTelemetry bridge OpenTracing bridge Plugin API Metrics Logs How to find slow methods Sampling-based profiler API/Code Annotations Configuration-based Overhead and performance tuning Frequently asked questions Community plugins Upgrading APM Node.js agent Set up the Agent Monitoring AWS Lambda Node.js Functions Monitoring Node.js Azure Functions Get started with Express Get started with Fastify Get started with hapi Get started with Koa Get started with Next.js Get started with Restify Get started with TypeScript Get started with a custom Node.js stack Starting the agent Supported technologies Configuration Configuring the agent Configuration options Custom transactions Custom spans API Reference Agent API Transaction API Span API Metrics Logs OpenTelemetry bridge OpenTracing bridge Source map support ECMAScript module support Distributed tracing Message queues Performance Tuning Upgrading Upgrade to v4.x Upgrade to v3.x Upgrade to v2.x Upgrade to v1.x APM PHP agent Set up the APM PHP Agent Supported technologies Configuration Configuration reference Public API APM Python agent Set up the APM Python Agent Django support Flask support Aiohttp Server support Tornado Support Starlette/FastAPI Support Sanic Support Monitoring AWS Lambda Python Functions Monitoring Azure Functions Wrapper Support ASGI Middleware Supported technologies Configuration Advanced topics Instrumenting custom code Sanitizing data How the Agent works Run Tests Locally API reference Metrics OpenTelemetry API Bridge Logs Performance tuning Upgrading Upgrading to version 6 of the agent Upgrading to version 5 of the agent Upgrading to version 4 of the agent APM Ruby agent Set up the APM Ruby agent Getting started with Rails Getting started with Rack Supported technologies Configuration Advanced topics Adding additional context Custom instrumentation API reference Metrics Logs OpenTracing API GraphQL Performance tuning Upgrading APM RUM JavaScript agent Set up the APM Real User Monitoring JavaScript Agent Install the Agent Configure CORS Supported technologies Configuration API reference Agent API Transaction API Span API Source maps 
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Secrets keystore Command reference Repositories for APT and YUM Run Filebeat on Docker Run Filebeat on Kubernetes Run Filebeat on Cloud Foundry Filebeat and systemd Start Filebeat Stop Filebeat Upgrade How Filebeat works Configure Inputs Multiline messages AWS CloudWatch AWS S3 Azure Event Hub Azure Blob Storage Benchmark CEL Cloud Foundry CometD Container Entity Analytics ETW filestream GCP Pub/Sub Google Cloud Storage HTTP Endpoint HTTP JSON journald Kafka Log MQTT NetFlow Office 365 Management Activity API Redis Salesforce Stdin Streaming Syslog TCP UDP Unified Logs Unix winlog Modules Override input settings General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append cache community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags filebeat.reference.yml How to guides Override configuration settings Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Enrich events with geoIP information Deduplicate data Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Migrate log or container input configurations to filestream How to choose file identity for filestream Migrating from a Deprecated Filebeat Module Modules Modules ActiveMQ module Apache module Auditd module AWS module AWS Fargate module Azure module CEF module Check Point module Cisco module CoreDNS module CrowdStrike module Cyberark PAS module Elasticsearch module Envoyproxy Module Fortinet module Google Cloud module Google Workspace module HAproxy module IBM MQ module Icinga module IIS module Iptables module Juniper module Kafka module Kibana module Logstash module Microsoft module MISP module MongoDB module MSSQL module MySQL module MySQL Enterprise module NATS module NetFlow module Nginx module Office 365 module Okta module Oracle module Osquery module Palo Alto Networks module pensando module PostgreSQL module RabbitMQ module Redis module Salesforce module Set up the OAuth App in the Salesforce Santa module Snyk module Sophos module Suricata module System module Threat Intel module Traefik module Zeek (Bro) Module ZooKeeper module Zoom module Exported fields ActiveMQ fields Apache fields Auditd fields AWS fields AWS CloudWatch fields AWS Fargate fields Azure fields Beat fields Decode CEF processor fields fields CEF fields Checkpoint fields Cisco fields Cloud provider metadata fields Coredns fields Crowdstrike fields CyberArk PAS fields Docker fields ECS fields Elasticsearch fields Envoyproxy fields Fortinet fields 
Google Cloud Platform (GCP) fields google_workspace fields HAProxy fields Host fields ibmmq fields Icinga fields IIS fields iptables fields Jolokia Discovery autodiscover provider fields Juniper JUNOS fields Kafka fields kibana fields Kubernetes fields Log file content fields logstash fields Lumberjack fields Microsoft fields MISP fields mongodb fields mssql fields MySQL fields MySQL Enterprise fields NATS fields NetFlow fields Nginx fields Office 365 fields Okta fields Oracle fields Osquery fields panw fields Pensando fields PostgreSQL fields Process fields RabbitMQ fields Redis fields s3 fields Salesforce fields Google Santa fields Snyk fields sophos fields Suricata fields System fields threatintel fields Traefik fields Windows ETW fields Zeek fields ZooKeeper fields Zoom fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Error extracting container id while using Kubernetes metadata Can't read log files from network volumes Filebeat isn't collecting lines from a file Too many open file handlers Registry file is too large Inode reuse causes Filebeat to skip lines Log rotation results in lost or duplicate events Open file handlers cause issues with Windows file rotation Filebeat is using too much CPU Dashboard in Kibana is breaking up data fields incorrectly Fields are not indexed or usable in Kibana visualizations Filebeat isn't shipping the last line of a file Filebeat keeps open file handlers of deleted files for a long time Filebeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Heartbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Heartbeat on Docker Running Heartbeat on Kubernetes Heartbeat and systemd Stop Heartbeat Configure Monitors Common monitor options ICMP options TCP options HTTP options Task scheduler General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based 
autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags heartbeat.reference.yml How to guides Add observer and geo metadata Load the Elasticsearch index template Change the index name Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields Beat fields Synthetics browser metrics fields Cloud provider metadata fields Common heartbeat monitor fields Docker fields ECS fields Host fields HTTP monitor fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields Host lookup fields APM Service fields SOCKS5 proxy fields Monitor state fields Monitor summary fields Synthetics types fields TCP layer fields TLS encryption layer fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Heartbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected High RSS memory usage due to MADV settings Contribute Metricbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Metricbeat on Docker Run Metricbeat on Kubernetes Run Metricbeat on Cloud Foundry Metricbeat and systemd Start Metricbeat Stop Metricbeat Upgrade Metricbeat How Metricbeat works Event structure Error event structure Key metricbeat features Configure Modules General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags metricbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules ActiveMQ module ActiveMQ broker metricset ActiveMQ queue 
metricset ActiveMQ topic metricset Aerospike module Aerospike namespace metricset Airflow module Airflow statsd metricset Apache module Apache status metricset AWS module AWS awshealth metricset AWS billing metricset AWS cloudwatch metricset AWS dynamodb metricset AWS ebs metricset AWS ec2 metricset AWS elb metricset AWS kinesis metricset AWS lambda metricset AWS natgateway metricset AWS rds metricset AWS s3_daily_storage metricset AWS s3_request metricset AWS sns metricset AWS sqs metricset AWS transitgateway metricset AWS usage metricset AWS vpn metricset AWS Fargate module AWS Fargate task_stats metricset Azure module Azure app_insights metricset Azure app_state metricset Azure billing metricset Azure compute_vm metricset Azure compute_vm_scaleset metricset Azure container_instance metricset Azure container_registry metricset Azure container_service metricset Azure database_account metricset Azure monitor metricset Azure storage metricset Beat module Beat state metricset Beat stats metricset Benchmark module Benchmark info metricset Ceph module Ceph cluster_disk metricset Ceph cluster_health metricset Ceph cluster_status metricset Ceph mgr_cluster_disk metricset Ceph mgr_cluster_health metricset Ceph mgr_osd_perf metricset Ceph mgr_osd_pool_stats metricset Ceph mgr_osd_tree metricset Ceph mgr_pool_disk metricset Ceph monitor_health metricset Ceph osd_df metricset Ceph osd_tree metricset Ceph pool_disk metricset Cloudfoundry module Cloudfoundry container metricset Cloudfoundry counter metricset Cloudfoundry value metricset CockroachDB module CockroachDB status metricset Consul module Consul agent metricset Containerd module Containerd blkio metricset Containerd cpu metricset Containerd memory metricset Coredns module Coredns stats metricset Couchbase module Couchbase bucket metricset Couchbase cluster metricset Couchbase node metricset CouchDB module CouchDB server metricset Docker module Docker container metricset Docker cpu metricset Docker diskio metricset Docker event metricset Docker healthcheck metricset Docker image metricset Docker info metricset Docker memory metricset Docker network metricset Docker network_summary metricset Dropwizard module Dropwizard collector metricset Elasticsearch module Elasticsearch ccr metricset Elasticsearch cluster_stats metricset Elasticsearch enrich metricset Elasticsearch index metricset Elasticsearch index_recovery metricset Elasticsearch index_summary metricset Elasticsearch ingest_pipeline metricset Elasticsearch ml_job metricset Elasticsearch node metricset Elasticsearch node_stats metricset Elasticsearch pending_tasks metricset Elasticsearch shard metricset Envoyproxy module Envoyproxy server metricset Etcd module Etcd leader metricset Etcd metrics metricset Etcd self metricset Etcd store metricset Google Cloud Platform module Google Cloud Platform billing metricset Google Cloud Platform carbon metricset Google Cloud Platform compute metricset Google Cloud Platform dataproc metricset Google Cloud Platform firestore metricset Google Cloud Platform gke metricset Google Cloud Platform loadbalancing metricset Google Cloud Platform metrics metricset Google Cloud Platform pubsub metricset Google Cloud Platform storage metricset Golang module Golang expvar metricset Golang heap metricset Graphite module Graphite server metricset HAProxy module HAProxy info metricset HAProxy stat metricset HTTP module HTTP json metricset HTTP server metricset IBM MQ module IBM MQ qmgr metricset IIS module IIS application_pool metricset IIS webserver metricset IIS 
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
The elasticsearch-syskeygen command creates a system key file in the elasticsearch config directory. Synopsis bin/elasticsearch-syskeygen [-E <KeyValuePair>] [-h, --help] ([-s, --silent] | [-v, --verbose]) Description The command generates a system_key file, which you can use to symmetrically encrypt sensitive data. For example, you can use this key to prevent Watcher from returning and storing information that contains clear text credentials. See Encrypting sensitive data in Watcher. Important The system key is a symmetric key, so the same key must be used on every node in the cluster. Parameters -E <KeyValuePair> Configures a setting. For example, if you have a custom installation of Elasticsearch, you can use this parameter to specify the ES_PATH_CONF environment variable. -h, --help Returns all of the command parameters. -s, --silent Shows minimal output. -v, --verbose Shows verbose output.
Examples The following command generates a system_key file in the default $ES_HOME/config directory: bin/elasticsearch-syskeygen", + "title": "elasticsearch-syskeygen | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/syskeygen", + "meta_description": "The elasticsearch-syskeygen command creates a system key file in the elasticsearch config directory. The command generates a system_key file, which you..." + }, + { + "text": "
After a user is authenticated, Elastic Stack needs to determine whether the user
behind an incoming request is allowed to execute the request. The primary method of authorization in a cluster is role-based access control (RBAC), although Elastic Stack also supports Attribute-based access control (ABAC). Tip If you use Elastic Cloud Enterprise or Elastic Cloud Hosted, then you can also implement RBAC at the level of your Elastic Cloud Enterprise orchestrator or Elastic Cloud organization. If you use Elastic Cloud Serverless, then you can only manage RBAC at the Elastic Cloud organization level. You must authenticate users at the same level where you implement RBAC. For example, if you want to use organization-level roles, then you must authenticate your users at the organization level. How role-based access control works Role-based access control (RBAC) enables you to authorize users by assigning privileges to roles and assigning roles to users or groups. This is the primary way of controlling access to resources in Elastic Stack. The authorization process revolves around the following constructs: Secured Resource A resource to which access is restricted. Indices, aliases, documents, fields, users, and the Elasticsearch cluster itself are all examples of secured objects. Privilege A named group of one or more actions that a user may execute against a secured resource. Each secured resource has its own sets of available privileges. For example, read is an index privilege that represents all actions that enable reading the indexed/stored data. For a complete list of available privileges, see Elasticsearch privileges. Permissions A set of one or more privileges against a secured resource. Permissions can easily be described in words; here are a few examples: read privilege on the products data stream or index manage privilege on the cluster run_as privilege on john user read privilege on documents that match query X read privilege on credit_card field Role A named set of permissions User The authenticated user. Group One or more groups to which a user belongs. Groups are not supported in some realms, such as native, file, or PKI realms. A role has a unique name and identifies a set of permissions that translate to privileges on resources. You can associate a user or group with an arbitrary number of roles. When you map roles to groups, the roles of a user in that group are the combination of the roles assigned to that group and the roles assigned to that user. Likewise, the total set of permissions that a user has is defined by the union of the permissions in all its roles. Set up user authorization using RBAC Review these topics to learn how to configure RBAC in your cluster or deployment: Learn about built-in roles Define your own roles Learn about the Elasticsearch and Kibana privileges you can assign to roles Learn how to control access at the document and field level Assign roles to users The way that you assign roles to users depends on your authentication realm: Native realm: Using Elasticsearch API _security endpoints In Kibana, using the Stack Management > Security > Users page File realm: Using a user_roles file In ECK: As part of a basic authentication secret External realms: By mapping users and groups to roles Advanced topics Learn how to delegate authorization to another realm Learn how to build a custom authorization plugin for unsupported systems or advanced applications Learn how to submit requests on behalf of other users Learn about attribute-based access control Tip User roles are also used to control access to Kibana spaces.
Attribute-based access control Attribute-based access control (ABAC) enables you to use attributes to restrict access to documents in search queries and aggregations. For example, you can assign attributes to users and documents, then implement an access policy in a role definition. Users with that role can read a specific document only if they have all the required attributes. For more information, see Document-level attribute-based access control with Elasticsearch.", + "title": "User roles | Elastic Docs", + "url": "https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles", + "meta_description": "After a user is authenticated, Elastic Stack needs to determine whether the user behind an incoming request is allowed to execute the request. The primary..." + }, + { + "text": "
neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 
v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 
v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual 
machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack 
Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq 
Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping 
functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS 
Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page 
Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update 
ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter delete ecctl deployment traffic-filter list ecctl deployment traffic-filter show ecctl deployment traffic-filter update ecctl deployment update ecctl generate ecctl generate completions ecctl generate docs ecctl init ecctl platform ecctl platform allocator ecctl platform allocator list ecctl platform allocator maintenance ecctl platform allocator metadata ecctl platform allocator metadata delete ecctl platform allocator metadata set ecctl platform allocator metadata show ecctl platform allocator search ecctl platform allocator show ecctl platform allocator vacate ecctl platform constructor ecctl platform constructor list ecctl platform constructor maintenance ecctl platform constructor resync ecctl platform constructor show ecctl platform enrollment-token ecctl platform enrollment-token create ecctl platform enrollment-token delete ecctl platform enrollment-token list ecctl platform info ecctl platform instance-configuration ecctl platform instance-configuration create ecctl platform instance-configuration delete ecctl platform instance-configuration list ecctl platform instance-configuration pull ecctl platform instance-configuration show ecctl platform instance-configuration update ecctl platform proxy ecctl platform proxy filtered-group ecctl platform proxy filtered-group create ecctl platform proxy filtered-group delete ecctl platform proxy filtered-group list ecctl platform proxy filtered-group show ecctl platform proxy filtered-group update ecctl platform proxy list ecctl platform proxy settings ecctl platform proxy settings show ecctl platform proxy settings update ecctl platform proxy show ecctl platform repository ecctl platform repository create ecctl platform repository delete ecctl platform repository list ecctl platform repository show ecctl platform role ecctl platform role create ecctl platform role delete ecctl platform role list ecctl platform role show ecctl platform role update ecctl platform runner ecctl platform runner list ecctl platform runner resync ecctl platform runner search ecctl platform runner show ecctl stack ecctl stack delete ecctl stack list ecctl stack show ecctl stack upload ecctl user ecctl user create ecctl user delete ecctl user disable ecctl user enable ecctl user key ecctl user key delete ecctl user key list ecctl user key show ecctl user list ecctl user show ecctl user update ecctl version Contributing Release notes Glossary Loading 
Deprecated in 8.0. The elasticsearch-setup-passwords tool is deprecated and will be removed in a future release. To manually reset the password for the built-in users (including the elastic user), use the elasticsearch-reset-password tool, the Elasticsearch change password API, or the User Management features in Kibana. The elasticsearch-setup-passwords command sets the passwords for the built-in users. Synopsis bin/elasticsearch-setup-passwords auto|interactive [-b, --batch] [-h, --help] [-E ] [-s, --silent] [-u, --url \"\"] [-v, --verbose] Description This command is intended for use only during the initial configuration of the Elasticsearch security features. It uses the elastic bootstrap password to run user management API requests. If your Elasticsearch keystore is password protected, you must enter the keystore password before you can set the passwords for the built-in users. After you set a password for the elastic user, the bootstrap password is no longer active and you cannot use this command. Instead, you can change passwords by using the Management > Users UI in Kibana or the Change Password API. This command uses an HTTP connection to connect to the cluster and run the user management requests. If your cluster uses TLS/SSL on the HTTP layer, the command automatically attempts to establish the connection by using the HTTPS protocol. It configures the connection by using the xpack.security.http.ssl settings in the elasticsearch.yml file. If you do not use the default config directory location, ensure that the ES_PATH_CONF environment variable returns the correct path before you run the elasticsearch-setup-passwords command. You can override settings in your elasticsearch.yml file by using the -E command option. For more information about debugging connection failures, see Setup-passwords command fails due to connection failure. Parameters auto Outputs randomly-generated passwords to the console. -b, --batch If enabled, runs the change password process without prompting the user. -E Configures a standard Elasticsearch or X-Pack setting. -h, --help Shows help information. interactive Prompts you to manually enter passwords. -s, --silent Shows minimal output. -u, --url \"\" Specifies the URL that the tool uses to submit the user management API requests. The default value is determined from the settings in your elasticsearch.yml file. If xpack.security.http.ssl.enabled is set to true, you must specify an HTTPS URL. -v, --verbose Shows verbose output. Examples The following example uses the -u parameter to tell the tool where to submit its user management API requests: bin/elasticsearch-setup-passwords auto -u \"http://localhost:9201\"
", + "title": "elasticsearch-setup-passwords | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/setup-passwords", + "meta_description": "The elasticsearch-setup-passwords command sets the passwords for the built-in users. This command is intended for use only during the initial configuration..." + }, + { + "text": "
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Secrets keystore Command reference Repositories for APT and YUM Run Filebeat on Docker Run Filebeat on Kubernetes Run Filebeat on Cloud Foundry Filebeat and systemd Start Filebeat Stop Filebeat Upgrade How Filebeat works Configure Inputs Multiline messages AWS CloudWatch AWS S3 Azure Event Hub Azure Blob Storage Benchmark CEL Cloud Foundry CometD Container Entity Analytics ETW filestream GCP Pub/Sub Google Cloud Storage HTTP Endpoint HTTP JSON journald Kafka Log MQTT NetFlow Office 365 Management Activity API Redis Salesforce Stdin Streaming Syslog TCP UDP Unified Logs Unix winlog Modules Override input settings General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append cache community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags filebeat.reference.yml How to guides Override configuration settings Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Enrich events with geoIP information Deduplicate data Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Migrate log or container input configurations to filestream How to choose file identity for filestream Migrating from a Deprecated Filebeat Module Modules Modules ActiveMQ module Apache module Auditd module AWS module AWS Fargate module Azure module CEF module Check Point module Cisco module CoreDNS module CrowdStrike module Cyberark PAS module Elasticsearch module Envoyproxy Module Fortinet module Google Cloud module Google Workspace module HAproxy module IBM MQ module Icinga module IIS module Iptables module Juniper module Kafka module Kibana module Logstash module Microsoft module MISP module MongoDB module MSSQL module MySQL module MySQL Enterprise module NATS module NetFlow module Nginx module Office 365 module Okta module Oracle module Osquery module Palo Alto Networks module pensando module PostgreSQL module RabbitMQ module Redis module Salesforce module Set up the OAuth App in the Salesforce Santa module Snyk module Sophos module Suricata module System module Threat Intel module Traefik module Zeek (Bro) Module ZooKeeper module Zoom module Exported fields ActiveMQ fields Apache fields Auditd fields AWS fields AWS CloudWatch fields AWS Fargate fields Azure fields Beat fields Decode CEF processor fields fields CEF fields Checkpoint fields Cisco fields Cloud provider metadata fields Coredns fields Crowdstrike fields CyberArk PAS fields Docker fields ECS fields Elasticsearch fields Envoyproxy fields Fortinet fields 
Google Cloud Platform (GCP) fields google_workspace fields HAProxy fields Host fields ibmmq fields Icinga fields IIS fields iptables fields Jolokia Discovery autodiscover provider fields Juniper JUNOS fields Kafka fields kibana fields Kubernetes fields Log file content fields logstash fields Lumberjack fields Microsoft fields MISP fields mongodb fields mssql fields MySQL fields MySQL Enterprise fields NATS fields NetFlow fields Nginx fields Office 365 fields Okta fields Oracle fields Osquery fields panw fields Pensando fields PostgreSQL fields Process fields RabbitMQ fields Redis fields s3 fields Salesforce fields Google Santa fields Snyk fields sophos fields Suricata fields System fields threatintel fields Traefik fields Windows ETW fields Zeek fields ZooKeeper fields Zoom fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Error extracting container id while using Kubernetes metadata Can't read log files from network volumes Filebeat isn't collecting lines from a file Too many open file handlers Registry file is too large Inode reuse causes Filebeat to skip lines Log rotation results in lost or duplicate events Open file handlers cause issues with Windows file rotation Filebeat is using too much CPU Dashboard in Kibana is breaking up data fields incorrectly Fields are not indexed or usable in Kibana visualizations Filebeat isn't shipping the last line of a file Filebeat keeps open file handlers of deleted files for a long time Filebeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Heartbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Heartbeat on Docker Running Heartbeat on Kubernetes Heartbeat and systemd Stop Heartbeat Configure Monitors Common monitor options ICMP options TCP options HTTP options Task scheduler General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based 
autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags heartbeat.reference.yml How to guides Add observer and geo metadata Load the Elasticsearch index template Change the index name Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields Beat fields Synthetics browser metrics fields Cloud provider metadata fields Common heartbeat monitor fields Docker fields ECS fields Host fields HTTP monitor fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields Host lookup fields APM Service fields SOCKS5 proxy fields Monitor state fields Monitor summary fields Synthetics types fields TCP layer fields TLS encryption layer fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Heartbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected High RSS memory usage due to MADV settings Contribute Metricbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Metricbeat on Docker Run Metricbeat on Kubernetes Run Metricbeat on Cloud Foundry Metricbeat and systemd Start Metricbeat Stop Metricbeat Upgrade Metricbeat How Metricbeat works Event structure Error event structure Key metricbeat features Configure Modules General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags metricbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules ActiveMQ module ActiveMQ broker metricset ActiveMQ queue 
metricset ActiveMQ topic metricset Aerospike module Aerospike namespace metricset Airflow module Airflow statsd metricset Apache module Apache status metricset AWS module AWS awshealth metricset AWS billing metricset AWS cloudwatch metricset AWS dynamodb metricset AWS ebs metricset AWS ec2 metricset AWS elb metricset AWS kinesis metricset AWS lambda metricset AWS natgateway metricset AWS rds metricset AWS s3_daily_storage metricset AWS s3_request metricset AWS sns metricset AWS sqs metricset AWS transitgateway metricset AWS usage metricset AWS vpn metricset AWS Fargate module AWS Fargate task_stats metricset Azure module Azure app_insights metricset Azure app_state metricset Azure billing metricset Azure compute_vm metricset Azure compute_vm_scaleset metricset Azure container_instance metricset Azure container_registry metricset Azure container_service metricset Azure database_account metricset Azure monitor metricset Azure storage metricset Beat module Beat state metricset Beat stats metricset Benchmark module Benchmark info metricset Ceph module Ceph cluster_disk metricset Ceph cluster_health metricset Ceph cluster_status metricset Ceph mgr_cluster_disk metricset Ceph mgr_cluster_health metricset Ceph mgr_osd_perf metricset Ceph mgr_osd_pool_stats metricset Ceph mgr_osd_tree metricset Ceph mgr_pool_disk metricset Ceph monitor_health metricset Ceph osd_df metricset Ceph osd_tree metricset Ceph pool_disk metricset Cloudfoundry module Cloudfoundry container metricset Cloudfoundry counter metricset Cloudfoundry value metricset CockroachDB module CockroachDB status metricset Consul module Consul agent metricset Containerd module Containerd blkio metricset Containerd cpu metricset Containerd memory metricset Coredns module Coredns stats metricset Couchbase module Couchbase bucket metricset Couchbase cluster metricset Couchbase node metricset CouchDB module CouchDB server metricset Docker module Docker container metricset Docker cpu metricset Docker diskio metricset Docker event metricset Docker healthcheck metricset Docker image metricset Docker info metricset Docker memory metricset Docker network metricset Docker network_summary metricset Dropwizard module Dropwizard collector metricset Elasticsearch module Elasticsearch ccr metricset Elasticsearch cluster_stats metricset Elasticsearch enrich metricset Elasticsearch index metricset Elasticsearch index_recovery metricset Elasticsearch index_summary metricset Elasticsearch ingest_pipeline metricset Elasticsearch ml_job metricset Elasticsearch node metricset Elasticsearch node_stats metricset Elasticsearch pending_tasks metricset Elasticsearch shard metricset Envoyproxy module Envoyproxy server metricset Etcd module Etcd leader metricset Etcd metrics metricset Etcd self metricset Etcd store metricset Google Cloud Platform module Google Cloud Platform billing metricset Google Cloud Platform carbon metricset Google Cloud Platform compute metricset Google Cloud Platform dataproc metricset Google Cloud Platform firestore metricset Google Cloud Platform gke metricset Google Cloud Platform loadbalancing metricset Google Cloud Platform metrics metricset Google Cloud Platform pubsub metricset Google Cloud Platform storage metricset Golang module Golang expvar metricset Golang heap metricset Graphite module Graphite server metricset HAProxy module HAProxy info metricset HAProxy stat metricset HTTP module HTTP json metricset HTTP server metricset IBM MQ module IBM MQ qmgr metricset IIS module IIS application_pool metricset IIS webserver metricset IIS 
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
The elasticsearch-saml-metadata command can be used to generate a SAML 2.0 Service Provider Metadata file. Synopsis bin/elasticsearch-saml-metadata [--realm ] [--out ] [--batch] [--attribute ] [--service-name ] [--locale ] [--contacts] ([--organisation-name ] [--organisation-display-name ] [--organisation-url ]) ([--signing-bundle ] | [--signing-cert ][--signing-key ]) [--signing-key-password ] [-E ] [-h, --help] ([-s, --silent] | [-v, --verbose]) Description The SAML 2.0 specification provides a mechanism for Service Providers to describe their capabilities and configuration using a metadata file . The elasticsearch-saml-metadata command generates such a file, based on the configuration of a SAML realm in Elasticsearch. Some SAML Identity Providers will allow you to automatically import a metadata file when you configure the Elastic Stack as a Service Provider. You can optionally select to digitally sign the metadata file in order to ensure its integrity and authenticity before sharing it with the Identity Provider.
The key used for signing the metadata file need not necessarily be the same as the keys already used in the saml realm configuration for SAML message signing. If your Elasticsearch keystore is password protected, you are prompted to enter the password when you run the elasticsearch-saml-metadata command. Parameters --attribute Specifies a SAML attribute that should be included as a element in the metadata. Any attribute configured in the Elasticsearch realm is automatically included and does not need to be specified as a commandline option. --batch Do not prompt for user input. --contacts Specifies that the metadata should include one or more elements. The user will be prompted to enter the details for each person. -E Configures an Elasticsearch setting. -h, --help Returns all of the command parameters. --locale Specifies the locale to use for metadata elements such as . Defaults to the JVM’s default system locale. --organisation-display-name element. Only valid if --organisation-name is also specified. --organisation-name Specifies that an element should be included in the metadata and provides the value for the . If this is specified, then --organisation-url must also be specified. --organisation-url Specifies the value of the element. This is required if --organisation-name is specified. --out Specifies a path for the output files. Defaults to saml-elasticsearch-metadata.xml --service-name Specifies the value for the element in the metadata. Defaults to elasticsearch . --signing-bundle Specifies the path to an existing key pair (in PKCS#12 format). The private key of that key pair will be used to sign the metadata file. --signing-cert Specifies the path to an existing certificate (in PEM format) to be used for signing of the metadata file. You must also specify the --signing-key parameter. This parameter cannot be used with the --signing-bundle parameter. --signing-key Specifies the path to an existing key (in PEM format) to be used for signing of the metadata file. You must also specify the --signing-cert parameter. This parameter cannot be used with the --signing-bundle parameter. --signing-key-password Specifies the password for the signing key. It can be used with either the --signing-key or the --signing-bundle parameters. --realm Specifies the name of the realm for which the metadata should be generated. This parameter is required if there is more than 1 saml realm in your Elasticsearch configuration. -s, --silent Shows minimal output. -v, --verbose Shows verbose output. Examples The following command generates a default metadata file for the saml1 realm: bin/elasticsearch-saml-metadata --realm saml1 The file will be written to saml-elasticsearch-metadata.xml . You may be prompted to provide the \"friendlyName\" value for any attributes that are used by the realm. The following command generates a metadata file for the saml2 realm, with a of kibana-finance , a locale of en-GB and includes elements and an element: bin/elasticsearch-saml-metadata --realm saml2 \\ --service-name kibana-finance \\ --locale en-GB \\ --contacts \\ --organisation-name \"Mega Corp. Finance Team\" \\ --organisation-url \"http://mega.example.com/finance/\"
", + "title": "elasticsearch-saml-metadata | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/saml-metadata", + "meta_description": "The elasticsearch-saml-metadata command can be used to generate a SAML 2.0 Service Provider Metadata file. The SAML 2.0 specification provides a mechanism..." + }, + { + "text": "Elasticsearch realms exist primarily to support user authentication . Some realms authenticate users with a password (such as the native and ldap realms), and other realms use more complex authentication protocols (such as the saml and oidc realms). In each case, the primary purpose of the realm is to establish the identity of the user who has made a request to the Elasticsearch API.
However, some Elasticsearch features need to look up a user without using their credentials. The run_as feature executes requests on behalf of another user. An authenticated user with run_as privileges can perform requests on behalf of another unauthenticated user. The delegated authorization feature links two realms together so that a user who authenticates against one realm can have the roles and metadata associated with a user from a different realm. In each of these cases, a user must first authenticate to one realm and then Elasticsearch will query the second realm to find another user. The authenticated user credentials are used to authenticate in the first realm only; the user in the second realm is retrieved by username, without needing credentials. When Elasticsearch resolves a user using their credentials (as performed in the first realm), it is known as user authentication . When Elasticsearch resolves a user using the username only (as performed in the second realm), it is known as user lookup . See the run_as and delegated authorization documentation to learn more about these features, including which realms and authentication methods support run_as or delegated authorization. In both cases, only the following realms can be used for the user lookup: The reserved, native and file realms always support user lookup. The ldap realm supports user lookup when the realm is configured in user search mode . User lookup is not supported when the realm is configured with user_dn_templates . User lookup support in the active_directory realm requires that the realm be configured with a bind_dn and a bind password. The pki , saml , oidc , kerberos and jwt realms do not support user lookup. Note If you want to use a realm only for user lookup and prevent users from authenticating against that realm, you can configure the realm and set authentication.enabled to false . The user lookup feature is an internal capability that is used to implement the run_as and delegated authorization features - there are no APIs for user lookup. If you want to test your user lookup configuration, then you can do this with run_as . Use the Authenticate API, authenticate as a superuser (e.g. the built-in elastic user) and specify the es-security-runas-user request header . Note The Get users API and User profiles feature are alternative ways to retrieve information about an Elastic Stack user. Those APIs are not related to the user lookup feature.", + "title": "Looking up users without authentication | Elastic Docs", + "url": "https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/looking-up-users-without-authentication", + "meta_description": "Elasticsearch realms exist primarily to support user authentication. Some realms authenticate users with a password (such as the native and ldap realms),..."
+ }, + { + "text": "
Cloud Hosted Kubernetes Hosts and VMs Docker Reference Architecture Kubernetes environments Hosts / VMs environments Use cases Kubernetes observability Prerequisites and compatibility Components description Deployment Instrumenting Applications Upgrade Customization LLM observability Compatibility and support Features Collector distributions SDK Distributions Limitations Nomenclature EDOT Collector Download Configuration Default config (Standalone) Default config (Kubernetes) Configure Logs Collection Configure Metrics Collection Customization Components Custom Collector Troubleshooting EDOT SDKs EDOT .NET Setup ASP.NET Console applications .NET worker services Zero-code instrumentation Opinionated defaults Configuration Supported technologies Troubleshooting Migration EDOT Java Setup Kubernetes Setup Runtime attach Setup Configuration Features Supported Technologies Troubleshooting Migration Performance overhead EDOT Node.js Setup Kubernetes Configuration Supported Technologies Metrics Troubleshooting Migration EDOT PHP Setup Limitations Configuration Supported Technologies Troubleshooting Migration Performance overhead EDOT Python Setup Kubernetes Manual Instrumentation Configuration Supported Technologies Troubleshooting Migration Performance Overhead Ingestion tools Fleet and Elastic Agent Restrictions for Elastic Cloud Serverless Migrate from Beats to Elastic Agent Migrate from Auditbeat to Elastic Agent Deployment models What is Fleet Server? Deploy on Elastic Cloud Deploy on-premises and self-managed Deploy Fleet Server on-premises and Elasticsearch on Cloud Deploy Fleet Server on Kubernetes Fleet Server scalability Fleet Server Secrets Secret files guide Monitor a self-managed Fleet Server Install Elastic Agents Install Fleet-managed Elastic Agents Install standalone Elastic Agents Upgrade standalone Elastic Agents Install Elastic Agents in a containerized environment Run Elastic Agent in a container Run Elastic Agent on Kubernetes managed by Fleet Install Elastic Agent on Kubernetes using Helm Example: Install standalone Elastic Agent on Kubernetes using Helm Example: Install Fleet-managed Elastic Agent on Kubernetes using Helm Advanced Elastic Agent configuration managed by Fleet Configuring Kubernetes metadata enrichment on Elastic Agent Run Elastic Agent on GKE managed by Fleet Configure Elastic Agent Add-On on Amazon EKS Run Elastic Agent on Azure AKS managed by Fleet Run Elastic Agent Standalone on Kubernetes Scaling Elastic Agent on Kubernetes Using a custom ingest pipeline with the Kubernetes Integration Environment variables Run Elastic Agent as an EDOT Collector Transform an installed Elastic Agent to run as an EDOT Collector Run Elastic Agent without administrative privileges Install Elastic Agent from an MSI package Installation layout Air-gapped environments Using a proxy server with Elastic Agent and Fleet When to configure proxy settings Proxy Server connectivity using default host variables Fleet managed Elastic Agent connectivity using a proxy server Standalone Elastic Agent connectivity using a proxy server Set the proxy URL of the Elastic Package Registry Uninstall Elastic Agents from edge hosts Start and stop Elastic Agents on edge hosts Elastic Agent configuration encryption Secure connections Configure SSL/TLS for self-managed Fleet Servers Rotate SSL/TLS CA certificates Elastic Agent deployment models with mutual TLS One-way and mutual TLS certifications flow Configure SSL/TLS for the Logstash output Manage Elastic Agents in Fleet Fleet settings Elasticsearch 
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
applications NuGet packages Entity Framework Core Entity Framework 6 Elasticsearch gRPC SqlClient StackExchange.Redis Azure Cosmos DB Azure Service Bus Azure Storage MongoDB Supported technologies Configuration Configuration on ASP.NET Core Configuration for Windows Services Configuration on ASP.NET Core configuration options Reporter configuration options HTTP configuration options Messaging configuration options Stacktrace configuration options Supportability configuration options All options summary Public API OpenTelemetry bridge Metrics Logs Serilog NLog Manual log correlation Performance tuning Upgrading APM Go agent Set up the APM Go Agent Built-in instrumentation modules Custom instrumentation Context propagation Supported technologies Configuration API documentation Metrics Logs Log correlation OpenTelemetry API OpenTracing API Contributing Upgrading APM iOS agent Supported technologies Set up the APM iOS Agent Configuration Instrumentation APM Java agent Set up the APM Java Agent Manual setup with -javaagent flag Automatic setup with apm-agent-attach-cli.jar Programmatic API setup to self-attach SSL/TLS communication with APM Server Monitoring AWS Lambda Java Functions Supported technologies Configuration Circuit-Breaker Core Datastore HTTP Huge Traces JAX-RS JMX Logging Messaging Metrics Profiling Reporter Serverless Stacktrace Property file reference Tracing APIs Public API OpenTelemetry bridge OpenTracing bridge Plugin API Metrics Logs How to find slow methods Sampling-based profiler API/Code Annotations Configuration-based Overhead and performance tuning Frequently asked questions Community plugins Upgrading APM Node.js agent Set up the Agent Monitoring AWS Lambda Node.js Functions Monitoring Node.js Azure Functions Get started with Express Get started with Fastify Get started with hapi Get started with Koa Get started with Next.js Get started with Restify Get started with TypeScript Get started with a custom Node.js stack Starting the agent Supported technologies Configuration Configuring the agent Configuration options Custom transactions Custom spans API Reference Agent API Transaction API Span API Metrics Logs OpenTelemetry bridge OpenTracing bridge Source map support ECMAScript module support Distributed tracing Message queues Performance Tuning Upgrading Upgrade to v4.x Upgrade to v3.x Upgrade to v2.x Upgrade to v1.x APM PHP agent Set up the APM PHP Agent Supported technologies Configuration Configuration reference Public API APM Python agent Set up the APM Python Agent Django support Flask support Aiohttp Server support Tornado Support Starlette/FastAPI Support Sanic Support Monitoring AWS Lambda Python Functions Monitoring Azure Functions Wrapper Support ASGI Middleware Supported technologies Configuration Advanced topics Instrumenting custom code Sanitizing data How the Agent works Run Tests Locally API reference Metrics OpenTelemetry API Bridge Logs Performance tuning Upgrading Upgrading to version 6 of the agent Upgrading to version 5 of the agent Upgrading to version 4 of the agent APM Ruby agent Set up the APM Ruby agent Getting started with Rails Getting started with Rack Supported technologies Configuration Advanced topics Adding additional context Custom instrumentation API reference Metrics Logs OpenTracing API GraphQL Performance tuning Upgrading APM RUM JavaScript agent Set up the APM Real User Monitoring JavaScript Agent Install the Agent Configure CORS Supported technologies Configuration API reference Agent API Transaction API Span API Source maps 
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Secrets keystore Command reference Repositories for APT and YUM Run Filebeat on Docker Run Filebeat on Kubernetes Run Filebeat on Cloud Foundry Filebeat and systemd Start Filebeat Stop Filebeat Upgrade How Filebeat works Configure Inputs Multiline messages AWS CloudWatch AWS S3 Azure Event Hub Azure Blob Storage Benchmark CEL Cloud Foundry CometD Container Entity Analytics ETW filestream GCP Pub/Sub Google Cloud Storage HTTP Endpoint HTTP JSON journald Kafka Log MQTT NetFlow Office 365 Management Activity API Redis Salesforce Stdin Streaming Syslog TCP UDP Unified Logs Unix winlog Modules Override input settings General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append cache community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags filebeat.reference.yml How to guides Override configuration settings Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Enrich events with geoIP information Deduplicate data Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Migrate log or container input configurations to filestream How to choose file identity for filestream Migrating from a Deprecated Filebeat Module Modules Modules ActiveMQ module Apache module Auditd module AWS module AWS Fargate module Azure module CEF module Check Point module Cisco module CoreDNS module CrowdStrike module Cyberark PAS module Elasticsearch module Envoyproxy Module Fortinet module Google Cloud module Google Workspace module HAproxy module IBM MQ module Icinga module IIS module Iptables module Juniper module Kafka module Kibana module Logstash module Microsoft module MISP module MongoDB module MSSQL module MySQL module MySQL Enterprise module NATS module NetFlow module Nginx module Office 365 module Okta module Oracle module Osquery module Palo Alto Networks module pensando module PostgreSQL module RabbitMQ module Redis module Salesforce module Set up the OAuth App in the Salesforce Santa module Snyk module Sophos module Suricata module System module Threat Intel module Traefik module Zeek (Bro) Module ZooKeeper module Zoom module Exported fields ActiveMQ fields Apache fields Auditd fields AWS fields AWS CloudWatch fields AWS Fargate fields Azure fields Beat fields Decode CEF processor fields fields CEF fields Checkpoint fields Cisco fields Cloud provider metadata fields Coredns fields Crowdstrike fields CyberArk PAS fields Docker fields ECS fields Elasticsearch fields Envoyproxy fields Fortinet fields 
Google Cloud Platform (GCP) fields google_workspace fields HAProxy fields Host fields ibmmq fields Icinga fields IIS fields iptables fields Jolokia Discovery autodiscover provider fields Juniper JUNOS fields Kafka fields kibana fields Kubernetes fields Log file content fields logstash fields Lumberjack fields Microsoft fields MISP fields mongodb fields mssql fields MySQL fields MySQL Enterprise fields NATS fields NetFlow fields Nginx fields Office 365 fields Okta fields Oracle fields Osquery fields panw fields Pensando fields PostgreSQL fields Process fields RabbitMQ fields Redis fields s3 fields Salesforce fields Google Santa fields Snyk fields sophos fields Suricata fields System fields threatintel fields Traefik fields Windows ETW fields Zeek fields ZooKeeper fields Zoom fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Error extracting container id while using Kubernetes metadata Can't read log files from network volumes Filebeat isn't collecting lines from a file Too many open file handlers Registry file is too large Inode reuse causes Filebeat to skip lines Log rotation results in lost or duplicate events Open file handlers cause issues with Windows file rotation Filebeat is using too much CPU Dashboard in Kibana is breaking up data fields incorrectly Fields are not indexed or usable in Kibana visualizations Filebeat isn't shipping the last line of a file Filebeat keeps open file handlers of deleted files for a long time Filebeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Heartbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Heartbeat on Docker Running Heartbeat on Kubernetes Heartbeat and systemd Stop Heartbeat Configure Monitors Common monitor options ICMP options TCP options HTTP options Task scheduler General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based 
autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags heartbeat.reference.yml How to guides Add observer and geo metadata Load the Elasticsearch index template Change the index name Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields Beat fields Synthetics browser metrics fields Cloud provider metadata fields Common heartbeat monitor fields Docker fields ECS fields Host fields HTTP monitor fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields Host lookup fields APM Service fields SOCKS5 proxy fields Monitor state fields Monitor summary fields Synthetics types fields TCP layer fields TLS encryption layer fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Heartbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected High RSS memory usage due to MADV settings Contribute Metricbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Metricbeat on Docker Run Metricbeat on Kubernetes Run Metricbeat on Cloud Foundry Metricbeat and systemd Start Metricbeat Stop Metricbeat Upgrade Metricbeat How Metricbeat works Event structure Error event structure Key metricbeat features Configure Modules General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags metricbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules ActiveMQ module ActiveMQ broker metricset ActiveMQ queue 
metricset ActiveMQ topic metricset Aerospike module Aerospike namespace metricset Airflow module Airflow statsd metricset Apache module Apache status metricset AWS module AWS awshealth metricset AWS billing metricset AWS cloudwatch metricset AWS dynamodb metricset AWS ebs metricset AWS ec2 metricset AWS elb metricset AWS kinesis metricset AWS lambda metricset AWS natgateway metricset AWS rds metricset AWS s3_daily_storage metricset AWS s3_request metricset AWS sns metricset AWS sqs metricset AWS transitgateway metricset AWS usage metricset AWS vpn metricset AWS Fargate module AWS Fargate task_stats metricset Azure module Azure app_insights metricset Azure app_state metricset Azure billing metricset Azure compute_vm metricset Azure compute_vm_scaleset metricset Azure container_instance metricset Azure container_registry metricset Azure container_service metricset Azure database_account metricset Azure monitor metricset Azure storage metricset Beat module Beat state metricset Beat stats metricset Benchmark module Benchmark info metricset Ceph module Ceph cluster_disk metricset Ceph cluster_health metricset Ceph cluster_status metricset Ceph mgr_cluster_disk metricset Ceph mgr_cluster_health metricset Ceph mgr_osd_perf metricset Ceph mgr_osd_pool_stats metricset Ceph mgr_osd_tree metricset Ceph mgr_pool_disk metricset Ceph monitor_health metricset Ceph osd_df metricset Ceph osd_tree metricset Ceph pool_disk metricset Cloudfoundry module Cloudfoundry container metricset Cloudfoundry counter metricset Cloudfoundry value metricset CockroachDB module CockroachDB status metricset Consul module Consul agent metricset Containerd module Containerd blkio metricset Containerd cpu metricset Containerd memory metricset Coredns module Coredns stats metricset Couchbase module Couchbase bucket metricset Couchbase cluster metricset Couchbase node metricset CouchDB module CouchDB server metricset Docker module Docker container metricset Docker cpu metricset Docker diskio metricset Docker event metricset Docker healthcheck metricset Docker image metricset Docker info metricset Docker memory metricset Docker network metricset Docker network_summary metricset Dropwizard module Dropwizard collector metricset Elasticsearch module Elasticsearch ccr metricset Elasticsearch cluster_stats metricset Elasticsearch enrich metricset Elasticsearch index metricset Elasticsearch index_recovery metricset Elasticsearch index_summary metricset Elasticsearch ingest_pipeline metricset Elasticsearch ml_job metricset Elasticsearch node metricset Elasticsearch node_stats metricset Elasticsearch pending_tasks metricset Elasticsearch shard metricset Envoyproxy module Envoyproxy server metricset Etcd module Etcd leader metricset Etcd metrics metricset Etcd self metricset Etcd store metricset Google Cloud Platform module Google Cloud Platform billing metricset Google Cloud Platform carbon metricset Google Cloud Platform compute metricset Google Cloud Platform dataproc metricset Google Cloud Platform firestore metricset Google Cloud Platform gke metricset Google Cloud Platform loadbalancing metricset Google Cloud Platform metrics metricset Google Cloud Platform pubsub metricset Google Cloud Platform storage metricset Golang module Golang expvar metricset Golang heap metricset Graphite module Graphite server metricset HAProxy module HAProxy info metricset HAProxy stat metricset HTTP module HTTP json metricset HTTP server metricset IBM MQ module IBM MQ qmgr metricset IIS module IIS application_pool metricset IIS webserver metricset IIS 
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
Docs / Reference / Elasticsearch and index management / Command line tools / elasticsearch-reset-password The elasticsearch-reset-password command resets the passwords of users in the native realm and built-in users. Synopsis bin/elasticsearch-reset-password [-a, --auto] [-b, --batch] [-E <KeyValuePair>] [-f, --force] [-h, --help] [-i, --interactive] [-s, --silent] [-u, --username] [--url] [-v, --verbose] Parameters -E <KeyValuePair> Configures a standard Elasticsearch or X-Pack setting. -f, --force Forces the command to run against an unhealthy cluster. -h, --help Returns all of the command parameters. -i, --interactive Prompts for the password of the specified user. Use this option to explicitly set a password. -s, --silent Shows minimal output in the console. -u, --username The username of the native realm user or built-in user. --url Specifies the base URL (hostname and port of the local node) that the tool uses to submit API requests to Elasticsearch. The default value is determined from the settings in your elasticsearch.yml file. If xpack.security.http.ssl.enabled is set to true, you must specify an HTTPS URL. -v, --verbose Shows verbose output in the console.
Examples The following example resets the password of the elastic user to an auto-generated value and prints the new password in the console: bin/elasticsearch-reset-password -u elastic The following example resets the password of a native user with username user1 after prompting in the terminal for the desired password: bin/elasticsearch-reset-password --username user1 -i The following example resets the password of a native user with username user2 after prompting in the terminal for the desired password. The specified URL indicates where the elasticsearch-reset-password tool attempts to reach the local Elasticsearch node: bin/elasticsearch-reset-password --url \"https://172.0.0.3:9200\" --username user2 -i", + "title": "elasticsearch-reset-password | Elastic Documentation", + "url": "https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/reset-password", + "meta_description": "The elasticsearch-reset-password command resets the passwords of users in the native realm and built-in users. Use this command to reset the password..."
+ }, + { + "text": "
Operator-only functionality ECE ECK Elastic Cloud Hosted Indirect use only This feature is designed for indirect use by Elastic Cloud Hosted, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported. Operator privileges provide protection for APIs and dynamic cluster settings. Any API or cluster setting that is protected by operator privileges is known as operator-only functionality . When the operator privileges feature is enabled, operator-only APIs can be executed only by operator users. Likewise, operator-only settings can be updated only by operator users. The list of operator-only APIs and dynamic cluster settings are pre-determined in the codebase. The list may evolve in future releases but it is otherwise fixed in a given Elasticsearch version.
Operator-only APIs Voting configuration exclusions Delete license Update license Create or update autoscaling policy Delete autoscaling policy Create or update desired nodes Get desired nodes Delete desired nodes Get desired balance Reset desired balance Operator-only dynamic cluster settings All IP filtering settings The following dynamic machine learning settings : xpack.ml.node_concurrent_job_allocations xpack.ml.max_machine_memory_percent xpack.ml.use_auto_machine_memory_percent xpack.ml.max_lazy_ml_nodes xpack.ml.process_connect_timeout xpack.ml.nightly_maintenance_requests_per_second xpack.ml.max_ml_node_size xpack.ml.enable_config_migration xpack.ml.persist_results_max_retries The cluster.routing.allocation.disk.threshold_enabled setting The following recovery settings for managed services : node.bandwidth.recovery.operator.factor node.bandwidth.recovery.operator.factor.read node.bandwidth.recovery.operator.factor.write node.bandwidth.recovery.operator.factor.max_overcommit", + "title": "Operator-only functionality | Elastic Docs", + "url": "https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/operator-only-functionality", + "meta_description": "Operator privileges provide protection for APIs and dynamic cluster settings. Any API or cluster setting that is protected by operator privileges is known..."
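To make the effect of this list concrete, here is a minimal sketch (an addition, not part of the original page) of what happens when a non-operator user tries to change one of the operator-only dynamic settings listed above; the endpoint, credentials, and chosen setting are assumptions, and it presumes the elasticsearch-py client against a cluster with operator privileges enabled:

```python
from elasticsearch import Elasticsearch, AuthorizationException

# Assumed connection details; replace with your own endpoint and credentials.
es = Elasticsearch("https://localhost:9200", basic_auth=("jdoe", "password"))

try:
    # xpack.ml.max_lazy_ml_nodes is one of the operator-only dynamic settings above,
    # so this update is expected to be rejected unless the caller is an operator user.
    es.cluster.put_settings(persistent={"xpack.ml.max_lazy_ml_nodes": 2})
except AuthorizationException as err:
    print(f"Rejected for a non-operator user: {err}")
```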
+ }, + { + "text": "
Controlling the user cache ECE ECK Elastic Cloud Hosted Self Managed User credentials are cached in memory on each node to avoid connecting to a remote authentication service or hitting the disk for every incoming request. You can configure characteristics of the user cache with the cache.ttl , cache.max_users , and cache.hash_algo realm settings. Note JWT realms use jwt.cache.ttl and jwt.cache.size realm settings. Note PKI and JWT realms do not cache user credentials, but do cache the resolved user object to avoid unnecessarily needing to perform role mapping on each request. The cached user credentials are hashed in memory. By default, the Elasticsearch security features use a salted sha-256 hash algorithm. You can use a different hashing algorithm by setting the cache.hash_algo realm settings. See User cache and password hash algorithms . Evicting users from the cache You can use the clear cache API to force the eviction of cached users .
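The curl requests that follow also have Python-client equivalents; this is a small sketch (an addition, not part of the original page) assuming the elasticsearch-py client and the same example realm and user names:

```python
from elasticsearch import Elasticsearch

# Assumed connection details; replace with your own endpoint and credentials.
es = Elasticsearch("https://localhost:9200", basic_auth=("elastic", "password"))

# Evict all cached users from the ad1 realm.
es.security.clear_cached_realms(realms="ad1")

# Clear the cache for multiple realms at once.
es.security.clear_cached_realms(realms="ad1,ad2")

# Evict only specific users from the ad1 realm.
es.security.clear_cached_realms(realms="ad1", usernames=["rdeniro", "alpacino"])
```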
For example, the following request evicts all users from the ad1 realm: $ curl -XPOST 'http://localhost:9200/_security/realm/ad1/_clear_cache' To clear the cache for multiple realms, specify the realms as a comma-separated list: $ curl -XPOST 'http://localhost:9200/_security/realm/ad1,ad2/_clear_cache' You can also evict specific users: $ curl -XPOST 'http://localhost:9200/_security/realm/ad1/_clear_cache?usernames=rdeniro,alpacino' Previous Looking up users without authentication Next Manage authentication for multiple clusters Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Evicting users from the cache Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .", + "title": "Controlling the user cache | Elastic Docs", + "url": "https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-user-cache", + "meta_description": "User credentials are cached in memory on each node to avoid connecting to a remote authentication service or hitting the disk for every incoming request..." + } +] \ No newline at end of file diff --git a/supporting-blog-content/why-rag-still-matters/dataset.ndjson b/supporting-blog-content/why-rag-still-matters/dataset.ndjson new file mode 100644 index 00000000..527450a6 --- /dev/null +++ b/supporting-blog-content/why-rag-still-matters/dataset.ndjson @@ -0,0 +1,303 @@ +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it Lucene BT AL By: Benjamin Trent and Ao Li On February 7, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Yep, another bug fixing blog. But this one has a twist, an open-source hero swoops in and saves the day. Debugging concurrency bugs is no picnic, but we're going to get into it. Enter Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, that turns flaky failures into reliably reproducible ones. Thanks to Fray’s clever shadow lock design and precise thread control, we tracked down a tricky Lucene bug and finally squashed it. This post explores how open-source heroes and tools are making concurrency debugging less painful—and the software world a whole lot better. Software Engineer's Bane Concurrency bugs are the worst. Not only are they difficult to fix, simply getting them to fail reliably is the hardest part. Take this test failure, TestIDVersionPostingsFormat#testGlobalVersions , as an example. It spawns multiple document writing and updating threads, challenging Lucene’s optimistic concurrency model. This test exposed a race condition in the optimistic concurrency control. Meaning, a document operation may falsely claim to be the latest in a sequence of operations 😱. 
Meaning, in certain conditions, an update or delete operation might actually succeed when it should have failed given optimistic concurrency constraints. Apologies for those who hate Java stack traces. Note, delete doesn’t necessarily mean “delete”. It can also indicate a document “update”, as Lucene’s segments are read-only. Apache Lucene manages each thread that is writing documents through the DocumentsWriter class. This class will create or reuse threads for document writing and each write action controls its information within the DocumentsWriterPerThread (DWPT) class. Additionally, the writer keeps track of what documents are deleted in the DocumentsWriterDeleteQueue (DWDQ). These structures keep all document mutation actions in memory and will periodically flush, freeing up in-memory resources and persisting structures to disk. In an effort to prevent blocking threads and ensuring high throughput in concurrent systems, Apache Lucene tries to only synchronize in very critical sections. While this can be good in practice, like in any concurrent systems, there are dragons. A False Hope My initial investigation pointed me to a couple of critical sections that were not appropriately synchronized. All interactions to a given DocumentsWriterDeleteQueue are controlled by its enclosing DocumentsWriter . So while individual methods may not be appropriately synchronized in the DocumentsWriterDeleteQueue , their access to the world is (or should be). (Let’s not delve into how this muddles ownership and access—it’s a long-lived project written by many contributors. Cut it some slack.) However, I found one place during a flush that was not synchronized. These actions aren’t synchronized into a single atomic operation. Meaning, between newQueue being created, and calling getMaxSeqNo , other code could have executed incrementing the sequence number in the documentsWriter class. I found the bug! If only it were that easy. But, as with most complex bugs, finding the root cause wasn't simple. That's when a hero stepped in. A hero in the fray Enter our hero: Ao Li and his colleagues at the PASTA Lab. I will let him explain how they saved the day with Fray. Fray is a deterministic concurrency testing framework developed by researchers at the PASTA Lab , Carnegie Mellon University. The motivation behind building Fray stems from a noticeable gap between academia and industry: while deterministic concurrency testing has been extensively studied in academic research for over 20 years, practitioners continue to rely on stress testing—a method widely acknowledged as unreliable and flaky—to test their concurrent programs. Thus, we wanted to design and implement a deterministic concurrency testing framework with generality and practical applicability as the primary goal. The Core Idea At its heart, Fray leverages a straightforward yet powerful principle: sequential execution. Java’s concurrency model provides a key property —if a program is free of data races, all executions will appear sequentially consistent. This means the program’s behavior can be represented as a sequence of program statements. Fray operates by running the target program in a sequential manner: at each step, it pauses all threads except one, allowing Fray to precisely control thread scheduling. Threads are selected randomly to simulate concurrency, but the choices are recorded for subsequent deterministic replay. 
To optimize execution, Fray only performs context-switches when a thread is about to execute a synchronizing instruction such as locking or atomic/volatile access. A nice property about data-race freedom is that this limited context switching is sufficient to explore all observable behaviors due to any thread interleaving ( our paper has a proof sketch). The Challenge: Controlling Thread Scheduling While the core idea seems simple, implementing Fray presented significant challenges. To control thread scheduling, Fray must manage the execution of each application thread. At first glance, this might seem straightforward—replacing concurrency primitives with customized implementations. However, concurrency control in the JVM is intricate, involving a mix of bytecode instructions , high-level libraries , and native methods . This turned out to be a rabbit hole: For example, every MONITORENTER instruction must have a corresponding MONITOREXIT in the same method. If Fray replaces MONITORENTER with a method call to a stub/mock, it also needs to replace MONITOREXIT . In code that makes use of object.wait/notify , If MONITORENTER is replaced, the corresponding object.wait must also be replaced. This replacement chain extends to object.notify and beyond. JVM invokes certain concurrency-related methods (e.g., object.notify when a thread ends) within native code. Replacing these operations would require modifying the JVM itself. JVM functions, such as class loaders and garbage collection (GC) threads, also use concurrency primitives. Modifying these primitives can create mismatches with those JVM functions. Replacing concurrency primitives in the JDK often results in JVM crashes during its initialization phase. These challenges made it clear that a comprehensive replacement of concurrency primitives was not feasible. Our Solution: Shadow Lock Design To address these challenges, Fray uses a novel shadow lock mechanism to orchestrate thread execution without replacing concurrency primitives. Shadow locks act as intermediaries that guide thread execution. For example, before acquiring a lock, an application thread must interact with its corresponding shadow lock. The shadow lock determines whether the thread can acquire the lock. If the thread cannot proceed, the shadow lock blocks it and allows other threads to execute, avoiding deadlocks and allowing controlled concurrency. This design enables Fray to control thread interleaving transparently while preserving the correctness of concurrency semantics. Each concurrency primitive is carefully modeled within the shadow lock framework to ensure soundness and completeness. More technical details can be found in our paper. Moreover, this design is intended to be future-proof. By requiring only the instrumentation of shadow locks around concurrency primitives, it ensures compatibility with newer versions of JVM. This is feasible because the interfaces of concurrency primitives in the JVM are relatively stable and have remained unchanged for years. Testing Fray After building Fray, the next step was evaluation. Fortunately, many applications, such as Apache Lucene, already include concurrency tests. Such concurrency tests are regular JUnit tests that spawn multiple threads, do some work, then (usually) wait for those threads to finish, and then assert some property. Most of the time, these tests pass because they exercise only one interleaving. 
Worse yet, some tests only fail occasionally in the CI/CD environment, as described earlier, making these failures extremely difficult to debug. When we executed the same tests with Fray, we uncovered numerous bugs. Notably, Fray rediscovered previously reported bugs that had remained unfixed due to the lack of a reliable reproduction, including this blog’s focus: TestIDVersionPostingsFormat.testGlobalVersions . Luckily, with Fray, we can deterministically replay them and provide developers with detailed information, enabling them to reliably reproduce and fix the issue. Next Steps for Fray We are thrilled to hear from developers at Elastic that Fray has been helpful in debugging concurrency bugs. We will continue to work on Fray to make it available to more developers. Our short-term goals include enhancing Fray’s ability to deterministically replay the schedule, even in the presence of other non-deterministic operations such as a random-value generator or the use of object.hashcode . We also aim to improve the usability of Fray, enabling developers to analyze and debug existing concurrency tests without any manual intervention. Most importantly, if you are facing challenges debugging or testing concurrency issues in your program, we’d love to hear from you. Please don’t hesitate to create an issue in the Fray Github repository . Time to fix the danged thing Thanks to Ao Li and the PASTA lab, we now have a reliably failing instance of this test! We can finally fix this thing. The key issue resided in how DocumentsWriterPerThreadPool allowed for thread and resource reuse. Here we can see each thread being created, referencing the initial delete queue at generation 0. Then the queue advance will occur on flush, correctly seeing the previous 7 actions in the queue. But, before all the threads can finish flushing, two are reused for an additional document: These will then increment the seqNo above the assumed maximum, which was calculated during the flush as 7. Note the additional numDocsInRAM for segments _3 and _0 Thus causing Lucene to incorrectly account for the sequence of document actions during a flush and tripping this test failure. Like all good bug fixes, the actual fix is about 10 lines of code . But took two engineers multiple days to actually figure out: Some lines of code take longer to write than others. And even require the help of some new friends. Not all heroes wear capes Yes, it's cliche – but it's true. Concurrent program debugging is incredibly important. These tricky concurrency bugs take an inordinate amount of time to debug and work through. While new languages like Rust have built in mechanisms to help prevent race conditions like this, the majority of software in the world is already written, and written in something other than Rust . Java, even after all these years, is still one of the most used languages. Improving debugging on JVM based languages makes the software engineering world better. And given how some folks think that code will be written by Large Language Models, maybe our jobs as engineers will eventually just be debugging bad LLM code instead of just our own bad code. But, no matter the future of software engineering, concurrent program debugging will remain critical for maintaining and building software. Thank you Ao Li and his colleagues from the PASTA Lab for making it that much better. 
Report an issue Related content Lucene December 27, 2024 Lucene bug adventures: Fixing a corrupted index exception Sometimes, a single line of code takes days to write. Here, we get a glimpse of an engineer's pain and debugging over multiple days to fix a potential Apache Lucene index corruption. BT By: Benjamin Trent Lucene January 3, 2025 Lucene Wrapped 2024 2024 has been another major year for Apache Lucene. In this blog, we’ll explore the key highlights. CH By: Chris Hegarty Vector Database Lucene June 26, 2024 Elasticsearch vs. OpenSearch: Vector Search Performance Comparison Elasticsearch is out-of-the-box 2x–12x faster than OpenSearch for vector search US By: Ugo Sangiorgi Lucene ML Research November 11, 2023 Understanding scalar quantization in Lucene Explore how Elastic introduced scalar quantization into Lucene, including automatic byte quantization, quantization per segment & performance insights. BT By: Benjamin Trent Vector Database February 6, 2025 A quick introduction to vector search This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. VC By: Valentin Crettaz Jump to Software Engineer's Bane A False Hope A hero in the fray The Core Idea The Challenge: Controlling Thread Scheduling Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Concurrency bugs in Lucene: How to fix optimistic concurrency failures - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/optimistic-concurrency-lucene-debugging","meta_description":"Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Elastic Cloud Serverless Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. 
AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Elastic Cloud Serverless September 6, 2024 Stateless: Data safety in a stateless world We discuss the data durability guarantees in stateless including how we fence new writes and deletes with a safety check which prevents stale nodes from acknowledging new writes or deletes HA By: Henning Andersen Elastic Cloud Serverless September 3, 2024 Leveraging Kubernetes controller patterns to orchestrate Elastic workloads globally Understand how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless. SG By: Sebastien Guilloux Elastic Cloud Serverless August 8, 2024 Search tier autoscaling in Elasticsearch Serverless Explore search tier autoscaling in Elasticsearch Serverless. Learn how autoscaling works, how the search load is calculated and more. MP JV By: Matteo Piergiovanni and John Verwolf 1 2 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Elastic Cloud Serverless - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/elastic-cloud-serverless","meta_description":"Elastic Cloud Serverless articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch This article will provide a detailed tutorial on implementing a local, reliable agent using LangGraph, combining concepts from Adaptive RAG, Corrective RAG, and Self-RAG papers, and integrating Langchain, Elasticsearch Vector Store, Tavily AI for web search, and LLaMA3 via Ollama. Generative AI Vector Database Agent How To PR By: Pratik Rana On September 2, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this tutorial we are going to see how we can create a reliable agent using LangGraph, LLaMA3 and Elasticsearch Vector Store from scratch. We will be combining ideas from 3 Advanced RAG papers: Adaptive RAG for Routing : Which directs questions to a vector store or web search based on the content Corrective RAG for Fallback : Using this we will introduce a Fallback retrival where if a question isn't relevant to the vector store, we will use a web-search instead. Self RAG for Self Correction : Additonally, we will add self-correction to check generations for hallucinations and relevance, and if they're not suitable, we'll fallback to web search again. Hence what we are aiming to build is a complex RAG flow and demonstrate its reliability and local execution on our system. Background information What is an LLM agent? An LLM-powered agent can be described as a system that leverages a Large Language Model (LLM) to reason through problems, devise plans to solve them, and execute these plans using a set of tools. In essence, these agents possess complex reasoning abilities, memory, and the means to carry out tasks. Building agents with an LLM as the core controller is an exciting concept. Several proof-of-concept demonstrations, such as AutoGPT, GPT-Engineer, and BabyAGI, serve as inspiring examples. The potential of LLMs extends beyond generating well-written text, stories, essays, and programs; they can be framed as powerful general problem solvers. Agent system overview In an LLM-powered autonomous agent system, the LLM functions as the agent’s brain, complemented by several key components: Planning Subgoal and decomposition: The agent breaks down large tasks into smaller, manageable subgoals, enabling efficient handling of complex tasks. Reflection and refinement: The agent engages in self-criticism and self-reflection over past actions, learns from mistakes, and refines future steps, thereby improving the quality of final results. Memory Short-term memory: Serves as a dynamic repository of the agent's current actions and thoughts, akin to its \"train of thought,\" as it endeavors to respond to a user's query in real-time. It allows the agent to maintain a contextual understanding of the ongoing interaction, enabling seamless and coherent communication. Long-term memory: Acts as a comprehensive logbook, chronicling the agent's interactions with users over an extended period, spanning weeks or even months. 
It captures the history of conversations, preserving valuable context and insights gleaned from past exchanges. This repository of accumulated knowledge enhances the agent's ability to provide personalized and informed responses, drawing upon past experiences to enrich its interactions with users. Hybrid memory: It combines the advantages of both STM and LTM to enhance the agent's cognitive abilities. STM ensures that the agent can quickly access and manipulate recent data, maintaining context within a conversation or task. LTM expands the agent's knowledge base by storing past interactions, learned patterns, and domain-specific information, enabling it to provide more informed responses and make better decisions over time. Tool use In the context of LLM (Large Language Model) agents, tools refer to external resources, services, or APIs (Application Programming Interfaces) that the agent can utilize to perform specific tasks or enhance its capabilities. These tools serve as supplementary components that extend the functionality of the LLM agent beyond its inherent language generation capabilities. Tools could also include databases, knowledge bases, and external models. As an illustration, agents can employ a RAG pipeline for producing contextually relevant responses, a code interpreter for addressing programming challenges, an API for conducting internet searches, or even utilize straightforward API services such as those for weather updates or instant messaging applications. Types of LLM agents and use cases Conversational agents : Engage users in natural language dialogues to provide information, answer questions, and assist with tasks. They utilize LLMs to generate human-like responses. Task-oriented agents : Focus on completing specific tasks or objectives by understanding user needs and executing relevant actions. Examples include virtual assistants and automation tools. Creative agents : Generate original content such as artwork, music, or writing. They use LLMs to understand human preferences and artistic styles, producing content that appeals to audiences. Collaborative agents : Work with humans to achieve shared goals by facilitating communication and cooperation. LLMs help these agents assist in decision-making, report generation, and providing insights. Approach: ReAct/Langchain agent vs LangGraph ? Now, let's consider using an agent to build a corrective RAG (Retrieval-Augmented Generation) system, represented by that middle blue component that can bee seen in the diagram above. When people think about agents, they often mention \"ReAct\"—a popular framework for building agents (not to be confused with the React.js framework). The typical flow in a ReAct agent looks like this: The LLM (Language Learning Model) plans by selecting an action, observing the result, reflecting on it, and then choosing the next action. ReAct agents usually leverage memories, such as chat history or a vector store, and can utilize various tools. If we were to implement this flow as a ReAct agent, it would look something like this: The agent would receive a question and perform an action, such as using its vector store to retrieve relevant documents. It would then observe the retrieved documents and decide to grade them. The agent would go back to its action phase and select the grading tool. This process would repeat in a loop, following a defined trajectory until the task is complete. This is how ReAct-based agents typically function. 
However, this approach can be quite complex and involve a lot of decision-making. Instead, we’ll use a different method to implement this system. Rather than having the agent make decisions at every step in the loop, we’ll define a \"control flow\" in advance. As engineers, we can lay out the exact sequence of steps we want our agent to follow each time it runs, effectively taking the planning responsibility away from the LLM. This predefined control flow allows the LLM to focus on specific tasks within each step. In terms of memory, we can use what’s called a \"graph state\" to persist information across the control flow, making it relevant to the RAG process (e.g., documents and questions). For tool usage, each graph node can utilize a different tool: the Vectorstore Retrieval node (depicted in grey) will use a retriever tool, the Grade Documents node (depicted in blue) will use a grading tool, and the Web Search node (depicted in red) will use a web search tool: This method simplifies decision-making for the LLM, making the system more reliable, especially when using smaller LLMs. Prerequisites Before diving into the code, we need to set up the necessary tools: Elasticsearch : In this tutorial, we’ll use Elasticsearch as our data store because it offers more than just a vector database for a superior search experience. Elasticsearch provides a complete vector database, multiple retrieval methods (text, sparse and dense vector, hybrid), and the flexibility to choose your machine learning model architectures. There’s a reason it’s the world’s most downloaded database! To follow along, you’ll need to deploy an Elasticsearch Cluster, which can be done in under 3 minutes as part of our 14-day free trial (no credit card required). Get started by clicking here . Ollama : Ollama is a platform that simplifies local development with open-source large language models (LLMs). It packages everything you need to run an LLM—model weights and configurations—into a single Modelfile, similar to how Docker works for containers. You can download Ollama for your machine by clicking here . Just one small thing here to note is that the llama3 comes with a particular prompt format , that one needs to pay attention to. After installation, verify it by running the following command: Next, install the llama3 model, which will serve as our local LLM for this tutorial: Tavily search : Tavily's Search API is a specialized search engine designed for AI agents (LLMs), providing real-time, accurate, and factual results with impressive speed. To use this API in your tutorial, you'll need to sign up on the Tavily platform and obtain an API key. The good news is that this powerful tool is free to use. You can get started by clicking here . Great!! So now that your environment is ready, we can move on to the fun part—writing our Python code! Python code 1. Install required packages : To begin, install all the necessary packages by running the following command: 2. Set up the local LLM and the Tavily search API : After the installation is complete, set the variable local_llm to \"llama3\" . This will define the local LLM you’ll be using in this tutorial. Feel free to change this parameter later if you want to experiment with other local LLMs on your system, and also define the Tavily Search API key obtained in the Prerequisites in your environment variable like below: 1. Indexing First we will need to load, process, and index our targetted data into our Vector Store. 
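The install and setup commands referenced above did not survive the page capture; the following is a rough reconstruction of that step, in which the package list and the placeholder API key are assumptions rather than the original commands:

```python
# Packages inferred from the libraries used later in the tutorial; the original
# install command may differ.
#   pip install langchain langgraph langchain-community langchain-nomic \
#       langchain-elasticsearch tiktoken tavily-python

import os

# Local LLM served by Ollama (pulled in the prerequisites step).
local_llm = "llama3"

# Tavily Search API key obtained during sign-up (placeholder value).
os.environ["TAVILY_API_KEY"] = "tvly-..."
```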
In this tutorial we will be indexing documents from these respective Blog posts: \" https://lilianweng.github.io/posts/2023-06-23-agent/ \", \" https://lilianweng.github.io/posts/2023-03-15-prompt-engineering/ \", \" https://lilianweng.github.io/posts/2023-10-25-adv-attack-llm/ \", Into our vector store, which will then add as a data source for our RAG implementation, as the index is the key component of our RAG flow without which we won't be able to retrive the documents. Code description: A list of URLs is defined, pointing to three different blog posts on Lilian Weng's website. The content from each URL is loaded using WebBaseLoader , and the result is stored in the docs list. The loaded documents are stored as a list of lists (each containing one or more documents). These lists are flattened into a single list using a list comprehension. The RecursiveCharacterTextSplitter is initialized with a specific chunk size (250 characters) and no overlap. This is used to split the documents into smaller chunks. The split chunks are stored in the documents variable. An instance of NomicEmbeddings is created to generate embeddings for the document chunks. The model used is specified as \"nomic-embed-text-v1.5\" , and inference is done locally. The documents, along with their embeddings, are stored in an Elasticsearch database. The connection details (URL, username, password) and the index name are provided. Finally, a retriever object is created from the Elasticsearch database, which can be used to query and retrieve documents based on their embeddings. 2. Retrieval grader Once we index our respective documents into the data store will need to create a grader that evaluates the relevance of our retrieved document to a given user question. Now this is where llama3 comes in, I set my local_llm to llama3 and llama has \"json\" mode which confirm the output from LLM is also json, so my prompt basically says grade a document and return a json with score yes / no Code description: LLM initialization: The ChatOllama model is instantiated with a specific configuration. The model is set to output responses in JSON format with a temperature of 0, meaning the output is deterministic (no randomness). Prompt template: A PromptTemplate is defined, which sets up the instructions that will be sent to the LLM. This prompt instructs the LLM to act as a grader that assesses whether a retrieved document is relevant to a user’s question. The grader’s task is simple: if the document contains keywords related to the user question, it should return a binary score ( yes or no ) indicating relevance. The response is expected in a JSON format with a single key score . Retrieval grader pipeline: The retrieval_grader is created by chaining the prompt , llm , and JsonOutputParser together. This forms a pipeline where the user’s question and the document are first formatted by the PromptTemplate , then processed by the LLM, and finally, the output is parsed by JsonOutputParser . Example usage: A sample question (\"agent memory\") is defined. The retriever.invoke(question) method is used to fetch documents related to the question. The content of the second retrieved document ( docs[1] ) is extracted. The retrieval_grader pipeline is then invoked with the question and document as inputs. The output is the JSON-formatted binary score indicating whether the document is relevant. 3. Generator Moving on we need to script a code that can generate a concise answer to user's question using context from retrieved documents. 
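The generator code block itself was stripped from this capture; below is a sketch reconstructed from the code description that follows. The prompt wording is paraphrased, and `retriever` and `local_llm` are assumed to come from the earlier indexing and setup steps:

```python
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama

# Prompt: answer from the retrieved context, in three sentences or fewer.
prompt = PromptTemplate(
    template="""You are an assistant for question-answering tasks.
Use the following context to answer the question. If you don't know the answer,
just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:""",
    input_variables=["question", "context"],
)

llm = ChatOllama(model=local_llm, temperature=0)

def format_docs(docs):
    # Join retrieved documents into one context string, separated by blank lines.
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = prompt | llm | StrOutputParser()

question = "agent memory"
docs = retriever.invoke(question)
generation = rag_chain.invoke({"context": format_docs(docs), "question": question})
print(generation)
```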
Code description: prompt : This is a PromptTemplate object that defines the structure of the prompt sent to the language model (LLM). The prompt instructs the LLM to act as an assistant for answering questions. The LLM is provided with a question and context (retrieved documents) and is instructed to generate a concise answer in three sentences or fewer. If the LLM doesn't know the answer, it is told to simply say that it doesn't know. llm : This initializes the LLM using the ChatOllama model with a temperature of 0, which ensures that the output is more deterministic and less random. format_docs(docs) : This function takes a list of document objects and concatenates their content ( page_content ) into a single string, with each document's content separated by a double newline ( \\n\\n ). This formatted string is then used as the context in the prompt. rag_chain : This creates a processing chain that combines the prompt , the LLM ( llm ), and the StrOutputParser . The prompt is filled with the question and context , sent to the LLM for processing, and the output is parsed into a string using StrOutputParser . Running the chain : question : The user's question, in this case, \"agent memory.\" docs : A list of documents retrieved using the retriever.invoke(question) function, which retrieves documents relevant to the question . format_docs(docs) : Formats the retrieved documents into a single string of context, separated by double newlines. rag_chain.invoke({\"context\": format_docs(docs), \"question\": question}) : This line executes the chain. It passes the formatted context and question into the rag_chain , which processes the input through the LLM and returns the generated answer. print(generation) : Outputs the generated answer to the console. 4. Hallucination grader and answer grader This code snippet defines two separate graders—one for assessing hallucination in a generated answer and another for evaluating the usefulness of the answer in resolving a question. Both graders use a language model (LLM) to provide binary scores (\"yes\" or \"no\") based on specific criteria Hallucination grader Code description: LLM Initialization: llm : Initializes the ChatOllama language model with a JSON output format and a temperature of 0, making the model's output deterministic. Prompt Creation: prompt : A PromptTemplate is created to define the structure of the prompt sent to the LLM. The prompt instructs the LLM to assess whether a given answer ( generation ) is grounded in or supported by a set of facts ( documents ). The model is instructed to output a binary score ( \"yes\" or \"no\" ) in JSON format, indicating whether the answer is factual according to the provided documents. Hallucination grader setup: hallucination_grader : This is a pipeline combining the prompt , the LLM, and the JsonOutputParser . The prompt is filled with the input variables ( generation and documents ), processed by the LLM, and the output is parsed into a JSON format by JsonOutputParser . Running the hallucination grader: hallucination_grader.invoke(...) : Executes the hallucination grader by passing in the documents (facts) and the generation (the answer being assessed). The LLM then evaluates whether the answer is grounded in the provided facts and returns a binary score in JSON format. Answer grader Code description: LLM initialization: llm : Similar to the hallucination grader, this initializes the ChatOllama model with the same settings for deterministic output. 
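Before finishing the answer-grader walkthrough, here is a compact sketch of the generator chain and the hallucination grader described above. It reuses the retriever from the indexing sketch, and both prompt texts are approximations of what the description outlines:

```python
from langchain_community.chat_models import ChatOllama
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser

# --- Generator --------------------------------------------------------------
gen_prompt = PromptTemplate(
    template="""You are an assistant for question-answering tasks.
Use the following context to answer the question in three sentences or fewer.
If you don't know the answer, just say that you don't know.
Question: {question}
Context: {context}
Answer:""",
    input_variables=["question", "context"],
)
llm = ChatOllama(model="llama3", temperature=0)

def format_docs(docs):
    # Concatenate each document's page_content, separated by a blank line
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = gen_prompt | llm | StrOutputParser()

question = "agent memory"
docs = retriever.invoke(question)  # retriever from the indexing sketch
generation = rag_chain.invoke({"context": format_docs(docs), "question": question})
print(generation)

# --- Hallucination grader ---------------------------------------------------
json_llm = ChatOllama(model="llama3", format="json", temperature=0)
hallucination_prompt = PromptTemplate(
    template="""You are a grader assessing whether an answer is grounded in a set of facts.
Return a JSON with a single key 'score' and value 'yes' or 'no'.
Facts: {documents}
Answer: {generation}""",
    input_variables=["generation", "documents"],
)
hallucination_grader = hallucination_prompt | json_llm | JsonOutputParser()
print(hallucination_grader.invoke({"documents": format_docs(docs), "generation": generation}))
```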
Prompt creation: prompt : A PromptTemplate is created for evaluating the usefulness of an answer. This prompt instructs the LLM to assess whether a given answer ( generation ) is useful in resolving a specific question ( question ). Again, the LLM outputs a binary score ( \"yes\" or \"no\" ) in JSON format, indicating whether the answer is useful. Answer grader setup: answer_grader : This pipeline combines the prompt , the LLM, and the JsonOutputParser , similar to the hallucination grader. Running the answer grader: answer_grader.invoke(...) : Executes the answer grader by passing in the question and generation (the answer being evaluated). The LLM assesses whether the answer is useful in resolving the question and returns a binary score in JSON format. 5. Router This code snippet defines a \"Router\" system designed to determine whether a user’s question should be directed to a vectorstore or a web search for further information retrieval. Here’s a detailed explanation of each par: Code description: LLM initialization: llm : Initializes the ChatOllama language model with a JSON output format and a temperature of 0, ensuring deterministic (non-random) results from the model. Prompt creation: prompt : A PromptTemplate is created to define the structure of the input prompt sent to the LLM. This prompt instructs the LLM to act as an expert in routing user questions to the appropriate datasource: either a vectorstore or a web search. The decision is based on the content of the question: If the question relates to topics like \"LLM agents,\" \"prompt engineering,\" or \"adversarial attacks,\" it should be routed to a vectorstore. Otherwise, the question should be routed to a web search. The LLM is instructed to return a binary choice: either \"vectorstore\" or \"web_search\" . The response should be in JSON format with a single key \"datasource\" . Router setup: question_router : This is a processing chain that combines the prompt , the LLM, and the JsonOutputParser . The prompt is populated with the question, processed by the LLM to make the routing decision, and the output is parsed into JSON format by the JsonOutputParser . Running the Router: question : The user's query, in this case, \"llm agent memory.\" docs : A list of documents retrieved using the retriever.get_relevant_documents(question) function, which fetches documents relevant to the question. This part of the code appears to retrieve documents but is not directly involved in the routing decision. question_router.invoke({\"question\": question}) : This line executes the router. The question is passed to the question_router , which processes it through the LLM and returns a JSON object with a key \"datasource\" indicating whether the question should be routed to a \"vectorstore\" or \"web_search\" . print(question_router.invoke(...)) : Outputs the routing decision (either \"vectorstore\" or \"web_search\" ) to the console. 6. Web search The code sets up a web search tool that can be used to query the web and retrieve a limited number of search results (in this case, 3). This is useful in scenarios where you want to integrate external web search capabilities into a system, enabling it to fetch information from the internet and use that information for further processing or decision-making. Code description: Imports: TavilySearchResults : This is a class imported from the langchain_community.tools.tavily_search module. It is used to perform web searches and retrieve search results. 
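The answer grader and the router follow the same prompt-plus-JSON-parser pattern; the wording of both prompts below is an approximation of what the description above calls for, before the web search walkthrough continues:

```python
from langchain_community.chat_models import ChatOllama
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import JsonOutputParser

json_llm = ChatOllama(model="llama3", format="json", temperature=0)

# --- Answer grader ----------------------------------------------------------
answer_prompt = PromptTemplate(
    template="""You are a grader assessing whether an answer is useful to resolve a question.
Return a JSON with a single key 'score' and value 'yes' or 'no'.
Answer: {generation}
Question: {question}""",
    input_variables=["generation", "question"],
)
answer_grader = answer_prompt | json_llm | JsonOutputParser()

# --- Router -----------------------------------------------------------------
router_prompt = PromptTemplate(
    template="""You are an expert at routing a user question to a vectorstore or web search.
Use the vectorstore for questions about LLM agents, prompt engineering, and adversarial attacks.
Otherwise, use web search. Return a JSON with a single key 'datasource'
and value 'vectorstore' or 'web_search'.
Question: {question}""",
    input_variables=["question"],
)
question_router = router_prompt | json_llm | JsonOutputParser()

print(question_router.invoke({"question": "llm agent memory"}))  # expected: {'datasource': 'vectorstore'}
```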
Web Search Tool Initialization: web_search_tool : This variable is an instance of the TavilySearchResults class. It represents a tool configured to perform web searches. k=3 : This parameter specifies that the tool should return the top 3 search results for any given query. The k value determines how many results are fetched and processed by the search tool. 7. Control flow This code defines a stateful, graph-based workflow for processing user queries. It retrieves documents, generates answers, grades relevance, and routes the process based on the current state. This system is highly modular, allowing each step in the process to be independently defined and controlled, making it flexible and scalable for various use cases involving document retrieval, question answering, and ensuring the quality and relevance of generated content. State definition GraphState : A TypedDict that defines the structure of the state that the graph will manage. It includes: question : The user's query. generation : The answer generated by the LLM. web_search : A flag indicating whether a web search should be added. documents : A list of documents retrieved during the process. Node functions Each of the following functions represents a node in the graph, performing a specific task within the workflow. retrieve(state) Purpose : Retrieves documents from a vectorstore based on the user's question. Returns : Updates the state with the retrieved documents. generate(state) Purpose : Generates an answer using a Retrieval-Augmented Generation (RAG) model on the retrieved documents. Returns : Updates the state with the generated answer. grade_documents(state) Purpose : Grades the relevance of each retrieved document to the question and filters out irrelevant documents. If any document is irrelevant, it sets a flag to indicate that a web search is needed. Returns : Updates the state with the filtered documents and the web search flag. web_search(state) Purpose : Conducts a web search based on the user's question and appends the results to the list of documents. Returns : Updates the state with the web search results. Conditional edges These functions determine the next step in the workflow based on the current state. route_question(state) Purpose : Routes the question to either a web search or vectorstore retrieval based on its content. Returns : The next node to execute, either \"websearch\" or \"vectorstore\" . decide_to_generate(state) Purpose : Decides whether to generate an answer or perform a web search based on the relevance of the graded documents. Returns : The next node to execute, either \"websearch\" or \"generate\" . grade_generation_v_documents_and_question(state) Purpose : Grades the generated answer for hallucinations (whether it is grounded in the provided documents) and checks if the answer addresses the user's question. Returns : The next node to execute, based on whether the answer is grounded and useful. Workflow definition StateGraph : Initializes a graph that will manage the state transitions. add_node : Adds the nodes (functions) to the graph, associating each node with a name that can be used to call it in the workflow. 8. Build graph This code builds the logic and flow of the stateful workflow using a state graph. It determines how the process should move from one node (operation) to the next based on the conditions and results at each step. The workflow starts by deciding whether to retrieve documents from a vectorstore or perform a web search based on the user's question. 
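Before continuing with the build-graph walkthrough, here is a sketch of the web search tool, the graph state, and the node and edge functions described in the control-flow section above. It reuses the retriever and chains from the earlier sketches, and the exact return shapes are assumptions consistent with those descriptions:

```python
from typing import List
from typing_extensions import TypedDict
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.documents import Document

# Web search tool returning the top 3 results
web_search_tool = TavilySearchResults(k=3)

class GraphState(TypedDict):
    # State carried across the graph
    question: str
    generation: str
    web_search: str
    documents: List[Document]

def retrieve(state):
    # Fetch documents from the vectorstore for the user's question
    question = state["question"]
    documents = retriever.invoke(question)  # retriever from the indexing sketch
    return {"documents": documents, "question": question}

def generate(state):
    # Produce an answer from the current documents with the RAG chain
    question, documents = state["question"], state["documents"]
    generation = rag_chain.invoke({"context": format_docs(documents), "question": question})
    return {"documents": documents, "question": question, "generation": generation}

def grade_documents(state):
    # Keep only relevant documents; flag a web search if anything is irrelevant
    question, documents = state["question"], state["documents"]
    filtered, web_search = [], "No"
    for d in documents:
        grade = retrieval_grader.invoke({"question": question, "document": d.page_content})
        if grade["score"] == "yes":
            filtered.append(d)
        else:
            web_search = "Yes"
    return {"documents": filtered, "question": question, "web_search": web_search}

def web_search(state):
    # Append Tavily results to the document list
    question = state["question"]
    documents = state.get("documents", [])
    results = web_search_tool.invoke({"query": question})
    documents.append(Document(page_content="\n".join(r["content"] for r in results)))
    return {"documents": documents, "question": question}

def route_question(state):
    source = question_router.invoke({"question": state["question"]})
    return "websearch" if source["datasource"] == "web_search" else "vectorstore"

def decide_to_generate(state):
    return "websearch" if state["web_search"] == "Yes" else "generate"

def grade_generation_v_documents_and_question(state):
    grounded = hallucination_grader.invoke(
        {"documents": format_docs(state["documents"]), "generation": state["generation"]}
    )
    if grounded["score"] != "yes":
        return "not supported"
    useful = answer_grader.invoke(
        {"question": state["question"], "generation": state["generation"]}
    )
    return "useful" if useful["score"] == "yes" else "not useful"
```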
Continuing with the build-graph flow: the workflow then assesses the relevance of the retrieved documents, deciding whether to generate an answer or conduct further web searches if the documents aren't relevant. Finally, it generates an answer and checks whether it is well-supported and useful, repeating steps or ending the workflow based on the outcome. This structure ensures that the workflow is dynamic, able to adjust based on the results at each stage, and ultimately aims to produce a well-supported and relevant answer to the user's question. Code description: Set the conditional entry point set_conditional_entry_point : This method sets the starting point of the workflow based on a conditional decision. route_question : The function that determines whether the question should be routed to a web search or a vectorstore retrieval. \"websearch\": \"websearch\" : If route_question decides that the question should be routed to a web search, the workflow starts with the websearch node. \"vectorstore\": \"retrieve\" : If route_question decides that the question should be routed to the vectorstore, the workflow starts with the retrieve node. Add an edge between nodes add_edge : This method creates a direct transition from one node to another in the workflow. \"retrieve\" -> \"grade_documents\" : After the documents are retrieved in the retrieve node, the workflow moves to the grade_documents node, where the retrieved documents are assessed for relevance. Add conditional edges add_conditional_edges : This method creates conditional transitions between nodes based on the result of a decision function. \"grade_documents\" : The node where the relevance of retrieved documents is assessed. decide_to_generate : The function that decides the next step based on the relevance of the documents. \"websearch\": \"websearch\" : If decide_to_generate determines that a web search is necessary (because the documents are not relevant), the workflow transitions to the websearch node. \"generate\": \"generate\" : If the documents are relevant, the workflow transitions to the generate node, where an answer is generated using the documents. Add an edge between nodes \"websearch\" -> \"generate\" : After performing a web search, the workflow moves to the generate node to generate an answer using the results from the web search. Add conditional edges for final decision \"generate\" : The node where an answer is generated using the documents (retrieved or from the web search). grade_generation_v_documents_and_question : The function that checks whether the generated answer is grounded in the documents and relevant to the question. \"not supported\": \"generate\" : If the generated answer is not well-supported by the documents, the workflow loops back to the generate node to attempt generating a better answer. \"useful\": END : If the generated answer is both grounded in the documents and addresses the question, the workflow ends ( END ). \"not useful\": \"websearch\" : If the generated answer is grounded in the documents but does not address the question adequately, the workflow transitions back to the websearch node to gather more information and try again. All done! Now that our implementation is complete, let's test the graph by compiling and executing it end to end; a nice side effect is that this will also print out each step as it runs (a sketch of this wiring and a sample run follows below). Test 1 : Let's start with a question that is relevant to the blog posts we indexed into the data store. 
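A sketch of the graph wiring and a sample run for Test 1, using the nodes and decision functions from the previous sketch; the test question matches the one discussed in the next section:

```python
from langgraph.graph import StateGraph, END

workflow = StateGraph(GraphState)

# Register the nodes defined earlier
workflow.add_node("websearch", web_search)
workflow.add_node("retrieve", retrieve)
workflow.add_node("grade_documents", grade_documents)
workflow.add_node("generate", generate)

# Entry point: route the question to web search or the vectorstore
workflow.set_conditional_entry_point(
    route_question,
    {"websearch": "websearch", "vectorstore": "retrieve"},
)
workflow.add_edge("retrieve", "grade_documents")
workflow.add_conditional_edges(
    "grade_documents",
    decide_to_generate,
    {"websearch": "websearch", "generate": "generate"},
)
workflow.add_edge("websearch", "generate")
workflow.add_conditional_edges(
    "generate",
    grade_generation_v_documents_and_question,
    {"not supported": "generate", "useful": END, "not useful": "websearch"},
)

app = workflow.compile()

# Test 1: a question covered by the indexed blog posts
inputs = {"question": "What is agent memory?"}
for output in app.stream(inputs):
    for key, value in output.items():
        print(f"Finished running: {key}:")
print(value["generation"])
```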
Test 2 : Lets write another question related to current affairs i.e completely out of context to the data that we indexed from the blog posts ? What do you see in the output of both these tests? For Test 1 The output shows the step-by-step execution of the workflow and the decisions made at each stage: Routing the question : Output : ---ROUTE QUESTION--- Question : \"What is agent memory?\" Decision : The workflow determines that the question should be routed to the vectorstore based on the question's content. Result : {'datasource': 'vectorstore'} and ---ROUTE QUESTION TO RAG--- . Retrieving documents : Output : ---RETRIEVE--- The workflow retrieves documents related to the question from the vectorstore. Grading document relevance : Output : ---CHECK DOCUMENT RELEVANCE TO QUESTION--- The workflow grades each retrieved document to determine if it is relevant to the question. Results : All retrieved documents are graded as relevant ( ---GRADE: DOCUMENT RELEVANT--- repeated four times). Deciding to generate an answer : Output : ---ASSESS GRADED DOCUMENTS--- Since the documents are relevant, the workflow decides to proceed with generating an answer ( ---DECISION: GENERATE--- ). Generating the answer : Output : ---GENERATE--- The workflow generates an answer using the relevant documents. Checking for hallucinations : Output : ---CHECK HALLUCINATIONS--- The workflow checks if the generated answer is grounded in the documents. Result : The answer is grounded ( ---DECISION: GENERATION IS GROUNDED IN DOCUMENTS--- ). Grading the answer against the question : Output : ---GRADE GENERATION vs QUESTION--- The workflow evaluates whether the generated answer addresses the question. Result : The answer is useful ( {'score': 'yes'} and ---DECISION: GENERATION ADDRESSES QUESTION--- ). Final output : Output : 'Finished running: generate:' Generated answer : For Test 2 This output follows the same workflow as the previous example but with a different question related to the NBA draft and the LA Lakers. Here's a breakdown of what happened during this run: Routing the question : Output : ---ROUTE QUESTION--- Question : \"Who are the LA Lakers expected to draft first in the NBA draft?\" Decision : The workflow determines that the question should be routed to a web search ( 'datasource': 'web_search' ), as it likely requires up-to-date information that isn't stored in the vectorstore. Result : web_search and ---ROUTE QUESTION TO WEB SEARCH--- . Web search : Output : ---WEB SEARCH--- The workflow performs a web search to gather the most current and relevant information regarding the Laker's draft picks. Result : 'Finished running: websearch:' indicates that the web search step is complete. Generating the answer : Output : ---GENERATE--- Using the information retrieved from the web search, the workflow generates an answer to the question. Checking for hallucinations : Output : ---CHECK HALLUCINATIONS--- The workflow checks if the generated answer is grounded in the retrieved web search documents. Result : The answer is well-supported ( ---DECISION: GENERATION IS GROUNDED IN DOCUMENTS--- ). Grading the answer against the question : Output : ---GRADE GENERATION vs QUESTION--- The workflow evaluates whether the generated answer directly addresses the question. Result : The answer is useful and relevant ( {'score': 'yes'} and ---DECISION: GENERATION ADDRESSES QUESTION--- ). 
Final output : Output : 'Finished running: generate:' Generated Answer : Key points of the workflow for test 1 vs test 2 Routing to web search : The workflow correctly identified that the question needed current information, so it directed the query to a web search rather than a vectorstore. Answer generation : The workflow successfully used the latest information from the web to generate a coherent and relevant response about the Lakers' expected draft pick. Grounded and useful nswer : The workflow validated that the generated answer was both grounded in the search results and directly addressed the question. Conclusion In a relatively short amount of time, we've managed to build a sophisticated Retrieval-Augmented Generation (RAG) workflow that includes routing, retrieval, grading, and various decision points such as fallback to web search and dual-criteria grading of generated content. What’s particularly impressive is that this complex RAG flow, incorporating concepts from multiple research papers, can run reliably on a local machine. The key to achieving this lies in the well-defined control flow, which ensures that the local agent operates smoothly and effectively. We encourage you to experiment with different queries and implementations, as this approach provides a powerful foundation for creating more advanced RAG agents. Hopefully, this serves as a useful guide for developing your own RAG workflows. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to Background information What is an LLM agent? Agent system overview Planning Memory Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/local-rag-agent-elasticsearch-langgraph-llama3","meta_description":"Create a reliable agent using LangGraph, LLaMA3 & Elasticsearch. Follow this LangGraph tutorial to implement agents combining concepts from RAG."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. Vector Database Generative AI Elastic Cloud Serverless Python How To QP By: Quentin Pradet On October 4, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. This blog will show you how to use Eland to import machine learning models to Elasticsearch Serverless, and then how to explore Elasticsearch using a Pandas-like API. NLP in Elasticsearch Serverless & Eland Since Elasticsearch 8.0 , it is possible to use NLP machine learning models directly from Elasticsearch. While some models such as ELSER (for English data) or E5 (for multilingual data) can be deployed directly from Kibana, all other compatible PyTorch models need to be uploaded using Eland. Since Eland 8.14.0, eland_import_hub_model fully supports Serverless. To get the connection details, open your Serverless project in Kibana, select the \"cURL\" client, create an API key, and export the environment variables: You can then use those variables when running eland_import_hub_model : Next, search for \"Trained Models\" in Kibana, which will offer to synchronize your trained models. Once done, you will get the option to deploy your model: Less than a minute later, your model should be deployed and you'll be able to test it directly from Kibana. In this test sentence, the model successfully identified Joe as \"Person\" and \"Reunion Island\" as a location, with high probability. For more details on using Eland for machine learning models (including scikit-learn, XGBoost and LightGBM, not covered here), consider reading the detailed Accessing machine learning models in Elastic blog post and referring to the Eland documentation . Data frames in Eland The other main functionality of Eland is exploring Elasticsearch data using a Pandas-like API. Ingesting test data Let's first index some test data to Elasticsearch. We'll use a fake flights dataset. While uploading using the Python Elasticsearch client is possible, in this post we'll use Kibana's file upload functionality instead, which is enough for quick tests. First, download the dataset https://github.com/elastic/eland/blob/main/tests/flights.json.gz and decompress it ( gunzip flights.json.gz ). Next, type \"File Upload\" in Kibana's search bar and import the flights.json file. Kibana will show you the resulting fields, with \"Cancelled\" detected as a boolean, for example. Click on \"Import\". On the next screen, choose \"flights\" for the index name and click \"Import\" again. As in the screenshot below, you should see that the 13059 documents were successfully ingested in the \"flights\" index. Connecting to Elasticsearch Now that we have data to search, let's setup the Elasticsearch Serverless Python client. 
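For reference, a minimal sketch of that Serverless Python client setup together with the Eland DataFrame queries used later in this post; the endpoint, API key, and install line below are placeholders to adapt to your project:

```python
# Placeholder install command:
#   python -m pip install elasticsearch-serverless "eland>=8.14"

from elasticsearch_serverless import Elasticsearch
import eland as ed

client = Elasticsearch(
    "https://<project-id>.es.<region>.elastic.cloud:443",  # your project endpoint
    api_key="<your-api-key>",
)

# Wrap the "flights" index in a Pandas-like DataFrame; the data stays in Elasticsearch
df = ed.DataFrame(es_client=client, es_index_pattern="flights")
print(df.head())

# Aggregations are pushed down to Elasticsearch rather than computed locally
print(df[["DistanceKilometers", "AvgTicketPrice"]].agg(["sum", "min", "std"]))
```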
(While we could use the main client, the Serverless Elasticsearch Python client is usually easier to use, as it only supports Elasticsearch Serverless features and APIs.) From the Kibana home page, you can select Python which will explain how to install the Elasticsearch Serverless Python client, create an API key, and use it in your code. You should end up with this code: Searching data with Eland Finally, assuming that the above code worked, we can start using Eland. After having installed it with python -m pip install eland>=8.14 , we can start exploring our flights dataset. If you run this code in a notebook, the result will be the following table: AvgTicketPrice Cancelled Carrier Dest DestAirportID DestCityName DestCountry DestLocation.lat DestLocation.lon DestRegion ... Origin OriginAirportID OriginCityName OriginCountry OriginLocation.lat OriginLocation.lon OriginRegion OriginWeather dayOfWeek timestamp 882.982662 False Logstash Airways Venice Marco Polo Airport VE05 Venice IT 45.505299 12.3519 IT-34 ... Cape Town International Airport CPT Cape Town ZA -33.96480179 18.60169983 SE-BD Clear 0 2018-01-01T18:27:00 730.041778 False Kibana Airlines Xi'an Xianyang International Airport XIY Xi'an CN 34.447102 108.751999 SE-BD ... Licenciado Benito Juarez International Airport AICM Mexico City MX 19.4363 -99.072098 MX-DIF Damaging Wind 0 2018-01-01T05:13:00 841.265642 False Kibana Airlines Sydney Kingsford Smith International Airport SYD Sydney AU -33.94609833 151.177002 SE-BD ... Frankfurt am Main Airport FRA Frankfurt am Main DE 50.033333 8.570556 DE-HE Sunny 0 2018-01-01T00:00:00 181.694216 True Kibana Airlines Treviso-Sant'Angelo Airport TV01 Treviso IT 45.648399 12.1944 IT-34 ... Naples International Airport NA01 Naples IT 40.886002 14.2908 IT-72 Thunder & Lightning 0 2018-01-01T10:33:28 552.917371 False Logstash Airways Luis Munoz Marin International Airport SJU San Juan PR 18.43939972 -66.00180054 PR-U-A ... Ciampino___G. B. Pastine International Airport RM12 Rome IT 41.7994 12.5949 IT-62 Cloudy 0 2018-01-01T17:42:53 You can also run more complex queries such as aggregations: which outputs the following: DistanceKilometers AvgTicketPrice sum 9.261629e+07 8.204365e+06 min 0.000000e+00 1.000205e+02 std 4.578614e+03 2.664071e+02 The demo notebook in the documentation has many more examples that use the same dataset and the reference documentation lists all supported operations. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to NLP in Elasticsearch Serverless & Eland Data frames in Eland Ingesting test data Connecting to Elasticsearch Searching data with Eland Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using Eland on Elasticsearch Serverless - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/eland-elasticsearch-serverless","meta_description":"Learn how to use Eland on Elasticsearch Serverless: import machine learning models using eland_import_hub_model and easily search data with Eland."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog LangChain and Elasticsearch: Building LangGraph retrieval agent template Elasticsearch and LangChain collaborate on a new retrieval agent template for LangGraph for agentic apps Generative AI Agent Integrations JM AT SC By: Joe McElroy , Aditya Tripathi and Serena Chou On September 20, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The new LangGraph retrieval agent template is designed to simplify the development of Generative AI (GenAI) agentic applications that require agents to use Elasticsearch for agentic retrieval. This template comes pre-configured to use Elasticsearch, allowing developers to build agents with LangChain and Elasticsearch quickly. To get started right away, access the project on Github: https://github.com/langchain-ai/retrieval-agent-template What is LangGraph? LangGraph helps developers build stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. There are a few new concepts to learn, like cycles, branching, and persistence – these allow developers to implement loops, conditions, and error handling mechanisms in applications. This makes LangGraph a great choice for creating complex workflows, where agents can pause for user input or correction. For more details you can check the Intro to LangGraph course on LangChain Academy. The new Retrieval Agent Template focuses on question-answering tasks by leveraging knowledge retrieval with Elasticsearch. Users can set up agents capable of retrieving relevant information based on natural language queries. 
The template provides an easy, configurable interface to Elasticsearch, making it a great starting point for developers looking to build search retrieval-based agents​. About LangGraph’s default Elasticsearch template Elasticsearch Vector Database Capabilities: The template leverages Elasticsearch’s Vector Storage and Search capabilities to enable more precise and relevant knowledge retrieval. Retrieval Agent Capability: This enables an agent to use Retrieval-Augmented Generation (RAG), helping Large Language Models (LLMs) provide more accurate and context-rich answers by retrieving the most relevant information from data stored within Elasticsearch. Integration with LangGraph Studio : With LangGraph Studio, developers can better understand and build complex agentic applications. It provides intuitive visualization and debugging tools in a user-friendly interface, making it easier to develop, optimize, and troubleshoot AI applications. Start building with LangGraph retrieval agent template Elastic and LangChain are excited to give developers a headstart building the next generation of intelligent, knowledge-driven AI agents using this template. Access the retrieval agent template on GitHub , or visit Search Labs for cookbooks using Elasticsearch and LangChain. Happy searching agenting! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Jump to What is LangGraph? About LangGraph’s default Elasticsearch template Start building with LangGraph retrieval agent template Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"LangChain and Elasticsearch: Building LangGraph retrieval agent template - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/langchain-langgraph-retrieval-agent-template","meta_description":"Explore LangGraph retrieval agent template, which simplifies the development of GenAI agentic apps that require agents to use Elasticsearch for agentic retrieval."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. Elastic Cloud Serverless Ingestion VR MR By: Vishal Raj and Marc Lopez Rubio On September 20, 2024 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Sending Observability data to Elastic Cloud has become increasingly easier with the introduction of Serverless. You can create an Observability serverless project in Elastic Cloud Serverless , and send observability signals directly to Elasticsearch or through what we’ve been calling the Managed Intake Service, a fully-managed solution akin to the Elastic APM Server. Gone are the days where you needed to worry about sizing and configuring different aspects of the APM Server. Instead, you can send data directly to the Managed Intake Service and let our platform do the hard scaling work for you. In this post, we’ll share how we iterated over the existing APM Server architecture to create the Managed Intake Service. APM Server primer APM Server was designed as a single-tenant process that receives and enriches raw observability signals (traces, logs and metrics) and indexes them to Elasticsearch. It also produces multi-interval (1m, 10m and 60m intervals) aggregated metrics (roll ups) based on the received signals to efficiently power the Observability APM UI. More on why aggregations are needed can be perused here . While the current APM Server offering in Elastic Cloud works well for use cases with predictable or known load, it has a few limitations that we aimed to improve in the first iteration of the next-generation ingest service: APM Server doesn’t automatically scale with the incoming load, requiring users to make upfront decisions based on estimated or actual maximum ingestion load. Each APM Server instance should ideally generate 1 aggregated metric document for each unique set of aggregated dimensions. However, with the current model the number of metrics are directly correlated with the number of replicas for APM Server, which impacts the effectiveness for aggregation. Aggregations are performed in-memory, requiring increasing amounts of memory as the number of unique combinations increases. Ingestion of observability data in APM Server and indexing of the data to Elasticsearch are tightly coupled. 
Any buffer between these 2 processes are memory based, thus, if Elasticsearch is in the process of scaling up or under-provisioned, the push-back could eventually cause ingestion to grind to a halt, returning errors to clients. Keep these points in mind as we iterate through the different implementations we tried. APM Server as a managed product When we started, we asked ourselves, what is the simplest approach we could take? How well would that work? Does it satisfy the previous requirements? That simplest approach would be to provision a single APM Server for each project and use horizontal autoscaling on each project's APM Server. However, that results in an enormous amount of wasted compute resources for observability projects that do not ingest observability signals through the APM Server. Additionally, it didn’t address any of the initial limitations, or improve the overall APM Server experience in Elastic Cloud. It became clear we wanted (and needed) to experiment with a multi-tenant approach. Again, we looked at the simplest approach we could take to shorten the feedback loop as much as possible. Multi Tenant APM Server Our logical next step was to extend the current APM Server codebase. One of the requirements for the multi-tenant APM Server was to be able to distinguish between different tenants and route to the appropriate Elasticsearch instance for that tenant. We came up with a consistent hash ring load balancer to route to the same APM Server for the same tenant. This would satisfy our bounded aggregation requirement (2) to avoid generating multiple aggregation documents for the same unique set of dimensions for an event. However, as we continued designing the remainder of the multi-tenant APM Server, it looked a lot like multiple services rolled up in one box (or a distributed monolith, something we’d want to avoid). Additionally, the memory requirements looked quite large to avoid running out of memory, and perform reasonably well. After the initial prototype for feasibility, it became clear that iterating on the existing APM Server would not meet our expectations. We went back to the drawing board with the goal to design a multi-tenant, distributed system from the groud up in order to achieve the expected level of reliability, scalability, and availability. Break down the monolith Back to the drawing board! This time, we decided to break the APM Server into smaller services according to the bounded context they belonged to. Using this classification, the 3 main APM Server responsibilities are pretty obvious: Ingest signals and enrich them. Generate aggregated metric representations for known time periods Index the raw signals and the aggregated metrics once their time period has ended. Another advantage of breaking apart the APM Server into smaller services is that it allows independent scaling of each service, based on the specific demand(s) of the service. This translates in better resource utilization, simplified reasoning and maintenance of the service. Ingestion Service As the name indicates, the purpose of the ingestion service is to ingest the signals from various agents, like Elastic or OTel clients. The ingestion service also performs simple data enrichment to get the most out of the telemetry signals. The scalability requirement of the ingestion service is directly dependent on the number of clients sending data and the data volume. In addition to ingestion, the service performs distributed rate limiting for each customer sending data to the service. 
Rate limiting prevents sudden bursts from overwhelming the data processing pipeline. Aggregation Service Aggregations, or data roll-ups, are an essential component of the Elastic observability stack. Rolling-up metrics allows Kibana UIs to display telemetry signals for services more efficiently, allowing you to change that time range from minutes to days or years without incurring significant query performance degradation. In essence, it reduces the total number of documents that Elasticsearch has to aggregate, not dissimilar to materialized views in SQL/Database-land. Traditional APM Server performed in-memory aggregations, however, a memory based approach would be insufficient for a multi-project service with auto-scaling capabilities. Also, in-memory aggregation limits didn’t behave optimally since each aggregation type had individual limits to avoid out of memory issues. Since we wanted to solve both these problems at the same time (and after some initial experimenting with persistence implementations in the aggregation flow), we settled on a Log-Structured Merge(LSM)-tree approach using a key-value database, Pebble . This effort eventually materialized in apm-aggregation , a library to perform aggregations that are mostly constrained by disk size, with much smaller memory requirements. LSM-based aggregations were also released in APM Server from 8.10 onwards. We intentionally kept the library open to share the same improvements for self-managed and hosted APM Server. Index Service The indexing process buffers tens, hundreds, or thousands of events, and sends these in batches to Elasticsearch using the _bulk API. While inherently simple, there are some complexities to the process and it required a lot of engineering effort to get right: Data must be reliably indexed into Elasticsearch. There are two major scenarios where retries are necessary to avoid data loss: a. Temporary _bulk request rejections (the entire _bulk request was rejected by Elasticsearch because it can’t service it). b. Temporary individual document rejections (the _bulk requests succeeded, but one or more documents have not been indexed). Indexing must be fair but also proportional to the data volume for different tenants. The first sub-point (1a) was correctly addressed by the go-elasticsearch client’s built-in request retries on specific HTTP status codes . Retrying individual document rejections required a bit more engineering effort and led us to implement document-level retries in the shared indexing component ( go-docappender ) for both APM Server and the index process in the Managed Intake Service. The second point is addressed by the fundamental sharded architecture and the transport that is used between the Managed Intake Service components. In short, each project has a dedicated number of indexing clients to ensure a baseline quality of service. Glue it back together to create the Managed Intake Service While breaking down the service got us closer to our goal, we still had to decide how services were going to communicate and how data was going to flow from one service to another. Traditionally, most microservices communicate using simple HTTP/RPC-based request/response schemes. 
However, that requires services to always be up, or assume temporary unavailability, where unhealthy status codes are retried by the application, or using something like a service mesh to route requests to the appropriate application instance and potentially rely on status codes to allow the service mesh to transparently handle retries. While we considered this approach, it seemed unnecessarily complicated and brittle once you start considering different failure modes. We researched event processing systems and, unsurprisingly, we started considering the idea of using an event bus or queue as means of communication. Using an event bus instead of a synchronous RPC-based communication system has a lot of advantages for our use case. The main advantage is that it decouples producers and consumers (producers generate data, while consumers receive it and process it). This decoupling is incredibly advantageous for reliability and resilience, and allows asymmetric processing for a time until auto scaling comes into effect. We spent a significant amount of time vetting different event bus technologies and unsurprisingly decided that the strongest contender in many areas was… Kafka ! Using Kafka Since the tried and tested Kafka would be the glue between services, it gave us a high degree of confidence in being able to offer high availability and reliability. The data persistence offered by the event bus allows us to absorb consuming (and producing traffic spikes) delays and push-back from the persistence layer while keeping external clients happy on the ingest path. The next step was making the data pipeline come together. Our initial attempt resulted in Kafka topics for each signal type. Each topic received specific signal types for multiple projects – undoubtedly the most cost efficient approach with the given stack. Initial testing and closed-beta launches saw good performance; however, pagers started to ring, literally, once the number of projects (and data volume) grew. We were seeing alerts for delayed time-to-glass as well as poor indexing performance. While investigating the issue, we quickly discovered that our multi-tenant topics were creating hotspots and noisy neighbor issues. In addition, the indexing service was struggling to meet our SLOs consistently due to Head-Of-Line blocking issues. Taking a step back, we realized that a single tenant model of Elasticsearch requires a higher level of data isolation to guarantee performance, prevent noisy neighbors and eliminate Head-Of-Line blocking issues. We changed topics from multi-project per-event (per-signal type) topic, to per-project multi-event (multi-signal) i.e. each project would get its own topic. The per-project topics provided improved isolation while Elasticsearch autoscaled without affecting other projects. Additionally, given how Kafka partitioning works, it also allows partition scaling to meet increasing data volumes when single consumers are unable to cope with the load. Observing the system A simpler system is always easy to observe and while splitting our services was driven by necessity it also introduced observability challenges. More services may also translate into more (potential) points of failure. To alleviate the issue, we decided to observe our system based on customer impact. To this end, we decided to monitor our services using Service Level Objectives (SLOs) . SLOs gave us the required framework to objectively reason about the performance of our service, but we didn’t stop here. 
Since our goal was measuring customer impact, we drew out the critical user journeys and designed our SLOs to cover these. The next challenge was implementing SLOs. Fortunately for us, the broader Observability team was working on launching Service Level Objectives (SLOs) and we became one of the first users. To power the Service Level Indicators (SLIs) that power the SLOs, we carefully instrumented our services using a combination of metrics, logs and traces (surprise!). The majority of our SLOs are powered by metrics, since our services are fairly high-throughput services but lower-throughput SLOs are also powered by log sources. Since we focused on the customer’s impact and the different areas where things could go wrong, we had a very broad (and deep) initial instrumentation from the beginning. It greatly facilitated investigating pages and recurring issues in our new ingestion platform. Today, all our user journeys are monitored using SLOs end-to-end allowing us to proactively detect and resolve any issues before it bothers you, our customers. Level up your observability stack with the Managed Intake Service Managed Intake Service aims to level-up Elastic's observability offering by providing a seamless interface for our users to ingest their telemetry signals without thinking about the scale of data or spending business hours computing the infrastructure requirements to reliably host their current and near-future data. The service is live on Elastic Cloud and available to you when you create an Observability project in our serverless offering. Redesigning ingest for our observability platform has been a lot of fun for us and we hope it will help you level up your observability stack. While this blog post covered high-level architecture of Managed Intake Services, there is still much more to talk about. Keep an eye out for future posts where we will delve deeper into individual components. Report an issue Related content Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Jump to APM Server primer APM Server as a managed product Multi Tenant APM Server Break down the monolith Ingestion Service Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Architecting the next-generation of Managed Intake Service - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/architecting-managed-intake-service","meta_description":"Learn how Elastic iterated over the existing APM Server architecture to create the Managed Intake Service. Explore improvements made over the APM Server."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Leveraging Kubernetes controller patterns to orchestrate Elastic workloads globally Understand how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless. Elastic Cloud Serverless SG By: Sebastien Guilloux On September 3, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Let's dive into how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless. As part of engineering our Elastic Cloud Serverless offering, we have been through a major redesign of our Cloud Architecture . The new architecture allows us to leverage the Kubernetes ecosystem for a resilient and scalable offering across 3 Cloud Services Providers, in many regions across the globe. Early on we decided to embrace Kubernetes, as the orchestration backend to run millions of containers for Serverless projects, but also as a composable platform that can be easily extended by our engineers to support various use cases. For example, provisioning new projects on demand, configuring Elasticsearch with object storage accounts, auto-scaling Elasticsearch and Kibana in the most performant and cost-effective way, provisioning fast local SSD volumes on-demand, metering resource usage, authenticating requests, scaling the fleet with more regions and clusters, and many more. Many of the Kubernetes design principles have guided us in that process. We naturally ended up building a set of APIs and controllers that follow Kubernetes conventions, but we also adapted the controller paradigm at a higher level, as the way we orchestrate resources globally. What does it look like in practice? Let's dive deep into some important aspects. Global control plane, regional data plane At a high-level, we can distinguish two high-level components: The global control plane services, responsible for global orchestration of projects across regions and clusters. It also acts as a scheduler and decides which regional Kubernetes cluster will host which Elastic Cloud Serverless project. The regional data plane services, responsible for orchestrating and running workloads at the scope of a single Kubernetes cluster. Each region can be made of several Kubernetes clusters, which we treat as disposable: they can be reconstructed from scratch at any time. This includes resources stored in the Kubernetes API Server, derived from the global control plane state. 
Both components include a variety of services, though many of them are in practice implemented as APIs and controllers: At the regional data plane level, they materialize as Custom Resource Definitions (CRDs) to declaratively specify entities manipulated by the system. They are stored as Custom Resources in the Kubernetes API Server, continuously reconciled by our custom controllers. At the global control plane level, they materialize as HTTP REST APIs that persist their data in a globally-available, scalable and resilient datastore, alongside controllers that continuously reconcile those resources. Note that, while they look and feel like “normal” Kubernetes controllers, global controllers take their inputs from Global APIs and a global datastore, rather than from the Kubernetes API Server and etcd ! Kubernetes controllers Kubernetes controllers behave in a simple, repeatable and predictable way. They are programmed to watch a source of events. On any event (creation, update, deletion), they always trigger the same reconciliation function. That function is responsible for comparing the desired state of a resource (for example, a 3-nodes Elasticsearch cluster) with the actual state of that resource (the Elasticsearch cluster currently has 2 nodes), and take action to reconcile the actual state towards the desired state (increase the number of replicas of that Elasticsearch deployment). The controller pattern is convenient to work with for various reasons: The level-triggered paradigm is simple to reason about. Asynchronous flows in the system are always encoded in terms of moving towards a declarative desired state, no matter what was the event that led to that desired state. This contrasts with an edge-triggered flow that considers every single variation of state (a configuration change, a version upgrade, a scale-up, a resource creation, etc.), and sometimes leads to modeling complex state machines. It is resilient to failures and naturally leads to a design where resources get self-repaired and self-healed automatically. Missing an intermediate event does not matter, as long as we can guarantee the latest desired state will get processed. Part of the Kubernetes ecosystem, the controller-runtime library comes with batteries included to easily build new controllers, as it abstracts away a number of important technical considerations: interacting with custom resources, watching through the API Server efficiently, caching objects for cheap reads, and enqueueing reconciliations automatically through a workqueue. The workqueue itself holds some interesting properties: Items are de-duplicated. If an item needs to be reconciled twice, due to two consecutive watch events, we just need to ensure the latest version of the object has been processed at least once. Failed reconciliations can easily be retried by appending the item again to the same workqueue. Those retries are automatically rate-limited, with an exponential backoff. The workqueue is populated automatically with all existing resources at controller startup. This ensures all items have been reconciled at least once, and covers for any missed event while the controller was unavailable. Global controller and global datastore The controller pattern and its internal workqueue fit very nicely with the needs of our regional data plane controllers. Conceptually, it would also apply quite well to our global control plane controllers! 
However, things get a bit more complicated there: As an important design principle for our backend, Kubernetes clusters and their state stored in etcd should be disposable. We want the operational ability to easily recreate and repopulate a Kubernetes cluster from scratch, with no important data or metadata loss. This led us to a strong requirement for our global state datastore: in case of regional failure, we want a multi-region failover with Recovery Point Objective (RPO) of 0, to ensure no data loss for our customers. The apiserver and etcd do not guarantee this by default. Additionally, while the regional data plane Kubernetes clusters have a strict scalability upper bound (they can’t grow larger than we want them to !), data in the global control plane datastore is conceptually unbounded. We want to support running millions of Serverless projects on the platform, and therefore require the datastore to scale along with the persisted data and query load, for a total amount of data that can be much larger than the amount of RAM on a machine. Finally, it is convenient for Global API services to work with the exact same data that global controllers are watching, and be able to serve arbitrarily complex SQL-like queries to fetch that data. The Kubernetes apiserver and etcd, as a key-value store, are not primarily designed for this use case. With these requirements in mind, we decided to not persist the global control plane data in the Kubernetes API, but rather in an external strongly consistent, highly-available and scalable general-purpose database. Fortunately, controller-runtime allows us to easily customize the source stream of events that trigger reconciliations. In just a few lines of code we were able to pipe our own logic of watching events from the global datastore into the controller. With this, our global controllers code largely look like regular Kubernetes controllers, while interacting with a completely different datastore. Workqueue optimizations at large scale What happens once we have 200,000 items in the global datastore that need to be reconciled at startup of a global controller? We can control the concurrency of reconciliations ( MaxConcurrentReconciles=N) , to consume the workqueue in parallel, with uniqueness guarantees that avoid concurrent processing of the same item. The degree of parallelism needs to be carefully thought through. If set too low, processing all items will take a very long time. For example, 200,000 items with an average of 1 second reconciliation duration and MaxConcurrentReconciles=1 means all items will only be processed after 2.3 days. Worse, if a new item gets created during that time, it may only be processed for the first time 2.3 days after creation! On the other hand, if MaxConcurrentReconciles is set too high, processing a very large number of items concurrently will dramatically increase CPU and IO usage, which generally also means increasing infrastructure costs. Can the global datastore handle 200,000 concurrent requests? How much would it then cost? To better address the trade-off, we decided to categorize reconciliations into 3 buckets: Items that need to be reconciled as soon as possible. Because they have been recently created/updated/deleted, and never successfully reconciled since then. These fit in a “high-priority reconciliations” bucket. Items that have already been reconciled successfully at least once with their current specification. 
The main reason why those need to be reconciled again is to ensure any code change in the controller will eventually be reflected through a reconciliation on existing resources. These can be processed reasonably slowly over time, since there should be no customer impact from delaying their reconciliation. These fit in a “low-priority reconciliations” bucket. Items that we know need to be reconciled at a particular point in time in the future. For example, to respect a soft-deletion period of 30 days, or to ensure their credentials are rotated every 24h. Those fit in a “scheduled reconciliations” bucket. This can be implemented through a similar mechanism as Generation and ObservedGeneration in some Kubernetes resources. On any change in the specification of a project, we persist the revision of that specification (a monotonically increasing integer). At the end of a successful reconciliation, we persist the revision that was successfully reconciled. To know whether an item deserves to be reconciled immediately, hence be put in the high-priority bucket, we can compare both revisions values. In case the reconciliation failed, it is enqueued again for being retried. We can then plug the controller reconciliation event handling logic to append the item in the appropriate workqueue. A separate low-priority workqueue is consumed asynchronously at a fixed rate (for example, 300 items per minute). Those consumed items then get appended to the main high-priority workqueue for immediate processing by the available controller workers. The controller then in practice works with two different workqueues. Since both rely on regular controller-runtime workqueues implementations, we benefit from built-in Prometheus metrics. Those allow monitoring the depth of the low-priority workqueue, for example, as a good signal of how many controller startup reconciliations are still pending. And the additions rate to the high priority workqueue, which indicates how much “urgent” work we're asking from the controller. Conclusion Kubernetes is a fascinating project. We have taken a lot of inspiration from its design principles and extension mechanisms. For some of them, extending their scope beyond the ability to manage resources in a single Kubernetes cluster, towards the ability to work with a highly-scalable datastore, with controllers able to reconcile resources in thousands of Kubernetes clusters across Cloud Service Providers regions. It has proven to be a fundamental part of our internal platform at Elastic, and allows us to develop and deliver new services and features to Elastic users every day. Stay tuned for more technical details in future posts. You can also check out talks by Elastic engineers at the last Kubecon + CloudNativeCon 2024: Building a Large Scale Multi-Cloud Multi-Region Saas Platform with Kubernetes Controllers Platform Engineering with the Argo Ecosystem: the Elastic story . Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. 
AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Global control plane, regional data plane Kubernetes controllers Global controller and global datastore Workqueue optimizations at large scale Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Leveraging Kubernetes controller patterns to orchestrate Elastic workloads globally - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/kubernetes-controller-power-elastic-serverless-workloads","meta_description":"Understand how Kubernetes controller primitives are used at very large scale to power Elastic Cloud Serverless."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. Search Relevance DW By: Daniel Wrigley On May 26, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The creation of judgement lists is a crucial step in optimizing search result quality, but it can be a complicated and difficult task. A judgement list is a curated set of search queries paired with relevance ratings for their corresponding results, also known as a test collection. Metrics computed using this list act as a benchmark for measuring how well a search engine performs. To help streamline the process of creating judgement lists, the OpenSource Connections team developed Quepid . Judgement can either be explicit or based on implicit feedback from users. This blog will guide you through setting up a collaborative environment in Quepid to effectively enable human raters to do explicit judgements, which is the foundation of every judgement list. 
Quepid supports search teams in the search quality evaluation process: Build query sets Create judgement lists Calculate search quality metrics Compare different search algorithms/rankers based on calculated search quality metrics For our blog, let's assume that we are running a movie rental store and have the goal of improving our search result quality. Prerequisites This blog uses the data and the mappings from the es-tmdb repository . The data is from The Movie Database . To follow along, set up an index called tmdb with the mappings and index the data. It doesn’t matter if you set up a local instance or use an Elastic Cloud deployment for this - either works fine. We assume an Elastic Cloud deployment for this blog. You can find information about how to index the data in the README of the es-tmdb repository . Do a simple match query on the title field for rocky to confirm you have data to search in: You should see 8 results. Log into Quepid Quepid is a tool that enables users to measure search result quality and run offline experiments to improve it. You can use Quepid in two ways: either use the free, publicly available hosted version at https://app.quepid.com , or set up Quepid on a machine you have access to. This post assumes you are using the free hosted version. If you want to set up a Quepid instance in your environment, follow the Installation Guide . Whichever setup you choose, you’ll need to create an account if you don’t already have one. Set up a Quepid Case Quepid is organized around \"Cases.\" A Case stores queries together with relevance tuning settings and how to establish a connection to your search engine. For first-time users, select Create Your First Relevancy Case . Returning users can select Relevancy Cases from the top-level menu and click + Create a case . Name your case descriptively, e.g., \"Movie Search Baseline,\" as we want to start measuring and improving our baseline search. Confirm the name by selecting Continue . Next, we establish a connection from Quepid to the search engine. Quepid can connect to a variety of search engines, including Elasticsearch. The configuration will differ depending on your Elasticsearch and Quepid setup. To connect Quepid to an Elastic Cloud deployment, we need to enable and configure CORS for our Elastic Cloud deployment and have an API key ready. Detailed instructions are in the corresponding how-to on the Quepid docs . Enter your Elasticsearch endpoint information ( https://YOUR_ES_HOST:PORT/tmdb/_search ) and any additional information necessary to connect (the API key in case of an Elastic Cloud deployment in the Advanced configuration options), test the connection by clicking on ping it and select Continue to move to the next step. Now we define which fields we want to see displayed in the case. Select all that help our human raters later assess the relevance of a document for a given query. Set title as the Title Field , leave _id as the ID Field , and add overview, tagline, cast, vote_average, thumb:poster_path as Additional Display Fields . The last entry displays small thumbnail images for the movies in our results to visually guide us and the human raters. Confirm the display settings by selecting the Continue button. The last step is adding search queries to the case. Add the three queries star wars , harrison ford , and best action movie one by one via the input field and Continue . Ideally, a case contains queries that represent real user queries and illustrate different types of queries. 
For now, we can imagine star wars being a query representing all queries for movie titles, harrison ford a query representing all queries for cast members, and best action movie a query representing all queries that search for movies in a specific genre. This is typically called a query set. In a production scenario, we would sample queries from event tracking data by applying statistical techniques like Probability-Proportional-to-Size sampling and import these sampled queries into Quepid to include queries from the head (frequent queries) and tail (infrequent queries) relative to their frequency, which means we bias towards more frequent queries without excluding rare ones. Finally, select Finish and you will be forwarded to the case interface where you see the three defined queries. Queries and Information Needs To arrive at our overarching goal of a judgement list, human raters will need to judge a search result (typically a document) for a given query. This is called a query/document pair. Sometimes, it seems easy to know what a user wanted when looking at the query. The intention behind the query harrison ford is to find movies starring Harrison Ford, the actor. What about the query action ? I know I’d be tempted to say the user’s intention is to find movies belonging to the action genre. But which ones? The most recent ones, the most popular ones, the best ones according to user ratings? Or does the user maybe want to find all movies that are called “Action”? There are at least 12 (!) movies called “Action” in The Movie Database and their names mainly differ in the number of exclamation marks in the title. Two human raters may differ in interpreting a query where the intention is unclear. Enter the Information Need: An Information Need is a conscious or unconscious desire for information. Defining an information need helps human raters judge documents for a query, so they play an important role in the process of building judgement lists. Expert users or subject matter experts are good candidates for specifying information needs. It is good practice to define information needs from the perspective of the user, as it's their need the search results should fulfill. Information needs for the queries of our “Movies Search Baseline” case: star wars : The user wants to find movies or shows from the Star Wars franchise. Potentially relevant are documentaries about Star Wars. harrison ford : The user wants to find movies starring the actor Harrison Ford. Potentially relevant are movies where Harrison Ford has a different role, like narrator. best action movie : The user wants to find action movies, preferably the ones with high average user votes. Define Information Needs in Quepid To define an information need in Quepid, access the case interface: 1. Open a query (for example star wars ) and select Toggle Notes. 2. Enter the Information Need in the first field and any additional notes in the second field: 3. Click Save . For a handful of queries, this process is fine. However, when you expand your case from three to 100 queries (Quepid cases are often in the range of 50 to 100 queries) you may want to define information needs outside of Quepid (for example, in a spreadsheet) and then upload them via Import and select Information Needs . Create a Team in Quepid and Share your Case Collaborative judgements enhance the quality of relevance assessments. To set up a team: 1. Navigate to Teams in the top-level menu. 2.
Click + Add New , enter a team name (for example, \"Search Relevance Raters\"), and click Create . 3. Add members by typing their email addresses and clicking Add User . 4. In the case interface, select Share Case . 5. Choose the appropriate team and confirm. Create a Book of Judgements A Book in Quepid allows multiple raters to evaluate query/document pairs systematically. To create one: 1. Go to Judgements in the case interface and click + Create a Book . 2. Configure the book with a descriptive name, assign it to your team, select a scoring method (for example, DCG@10), and set the selection strategy (single or multiple raters). Use the following settings for the Book: Name : “Movies Search 0-3 Scale” Teams to Share this Book With : Check the box with the Team you created Scorer : DCG@10 3. Click Create Book. The name is descriptive and contains information about what is searched in (“Movies”) and also the scale of the judgements (“0-3”). The selected Scorer DCG@10 defines the way the search metric will be calculated. “DCG” is short for Discounted Cumulative Gain and“@10” is the number of results from the top taken into consideration when the metric is calculated. In this case, we are using a metric that measures the information gain and combines it with positional weighting. There may be other search metrics that are more suitable for your use case and choosing the right one is a challenge in itself . Populate the Book with Query/Document Pairs In order to add query/document pairs for relevance assessment, follow these steps: 1. In the case interface, navigate to \"Judgements.\" 2. Select your created book. 3. Click \"Populate Book\" and confirm by selecting \"Refresh Query/Doc Pairs for Book.\" This action generates pairs based on the top search results for each query, ready for evaluation by your team. Let your Team of Human Raters Judge So far, the completed steps were fairly technical and administrative. Now that this necessary preparation is done, we can let our team of judges do their work. In essence, the judge’s job is to rate the relevance of one particular document for a given query. The result of this process is the judgement list that contains all relevance labels for the judged query document pairs. Next, this process and the interface for it are explained in further detail. Overview of the Human Rating Interface Quepid's Human Rating Interface is designed for efficient assessments: Query: Displays the search term. Information Need: Shows the user's intent. Scoring Guidelines: Provides instructions for consistent evaluations. Document Metadata: Presents relevant details about the document. Rating Buttons: Allows raters to assign judgements with corresponding keyboard shortcuts. Using the Human Rating Interface As a human rater, I access the interface via the book overview: 1. Navigate to the case interface and click Judgements . 2. Click on More Judgements are Needed! . The system will present a query/document pair that has not been rated yet and that requires additional judgements. This is determined by the Book’s selection strategy: Single Rater : A single judgement per query/doc pair. Multiple Raters : Up to three judgements per query/doc pair. Rating Query/Doc Pairs Let’s walk through a couple of examples. When you are following this guide, you will most likely be presented with different movies. However, the rating principles stay the same. 
Our first example is the movie “Heroes” for the query harrison ford : We first look at the query, followed by the information need and then judge the movie based on the metadata given. This movie is a relevant result for our query, since Harrison Ford is in its cast. We may regard more recent movies as more relevant subjectively but this is not part of our information need. So we rate this document with “Perfect” which is a 3 in our graded scale. Our next example is the movie “Ford v Ferrari” for the query harrison ford : Following the same practice, we judge this query/doc by looking at the query, the information need and then how well the document’s metadata matches the information need. This is a poor result. We probably see this result because one of our query terms, “ford”, matches in the title. But Harrison Ford does not appear in this movie in any role. So we rate this document “Poor” which is a 0 in our graded scale. Our third example is the movie “Action Jackson” for the query best action movie : This looks like an action movie, so the information need is at least partially met. However, the vote average is 5.4 out of 10. And that makes this movie probably not the best action movie in our collection. This would lead me as a judge to rate this document “Fair,” which is a 1 in our graded scale. These examples illustrate the process of rating query/doc pairs with Quepid in particular, on a high level and also in general. Best Practices Human Raters The shown examples might make it seem straightforward to get to explicit judgements. But setting up a reliable human rating program is no easy feat. It’s a process filled with challenges that can easily compromise the quality of your data: Human raters can become fatigued from repetitive tasks. Personal preferences may skew judgements. Levels of domain expertise vary from judge to judge. Raters often juggle multiple responsibilities. The perceived relevance of a document may not match its true relevance to a query. These factors can result in inconsistent, low-quality judgements. But don’t worry - there are proven best practices that can help you minimize these issues and build a more robust and reliable evaluation process: Consistent Evaluation: Review the query, information need, and document metadata in order. Refer to Guidelines: Use scoring guidelines to maintain consistency. Scoring guidelines can contain examples of when to apply which grade, which illustrate the process of judging. Having a check-in with human raters after the first batch of judgements proved to be a good practice to learn about challenging edge cases and where additional support is needed. Utilize Options: If uncertain, use \"I Will Judge Later\" or \"I Can’t Tell,\" providing explanations when necessary. Take Breaks: Regular breaks help maintain judgement quality. Quepid helps with regular breaks by popping confetti whenever a human rater finishes a batch of judgements. By following these steps, you establish a structured and collaborative approach to creating judgement lists in Quepid, enhancing the effectiveness of your search relevance optimization efforts. Next Steps Where to go from here? Judgement lists are but one foundational step towards improving search result quality. Here are the next steps: Calculate Metrics and Start Experimenting Once judgement lists are available, leveraging the judgements and calculating search quality metrics is a natural progression.
Quepid automatically calculates the configured metric for the current case when judgements are available. Metrics are implemented as “Scorers” and you can provide your own when the supported ones do not include your favorite! Go to the case interface, navigate to Select Scorer , choose DCG@10 and confirm by clicking on Select Scorer . Quepid will now calculate DCG@10 per query and also average it over all queries to quantify the search result quality for your case. Now that your search result quality is quantified, you can run your first experiments. Experimentation starts with generating hypotheses. Looking at the three queries in the screenshot after doing some rating makes it obvious that the three queries perform very differently in terms of their search quality metric: star wars performs pretty well, harrison ford looks alright but the greatest potential lies in best action movie . Expanding this query we see its results and can dive into the nitty-gritty details and explore why documents matched and what influences their scores: By clicking on “Explain Query” and entering the “Parsing” tab we see that the query is a DisjunctionMaxQuery searching across three fields: cast , overview and title : Typically, as search engineers we know some domain-specifics about our search platform. In this case, we may know that we have a genres field. Let’s add that to the query and see if search quality is improved. We use the Query Sandbox that opens when selecting Tune Relevance in the case interface. Go ahead and explore this by adding the genres field you search in: Click Rerun My Searches! and view the results. Have they changed? Unfortunately not. We now have a lot of options to explore, basically all query options Elasticsearch offers: We could increase the field weight on the genres field. We could add a function that boosts documents by their vote average. We could create a more complex query that only boosts documents by their vote average if there is a strong genres match. … The best thing about having all these options and exploring them in Quepid is that we have a way of quantifying the effects not only on the one query we try to improve but all queries we have in our case. That prevents us from improving one underperforming query by sacrificing search result quality for others. We can iterate quickly and cheaply and validate the value of our hypothesis without any risk, making offline experimentation a fundamental capability of all search teams. Measure Inter-Rater Reliability Even with task descriptions, information needs, and a human rater interface like the one Quepid provides, human raters can disagree. Disagreement per se is no bad thing, quite the contrary: measuring disagreement can surface issues that you may want to tackle. Relevance can be subjective, queries can be ambiguous, and data can be incomplete or incorrect. Fleiss’ Kappa is a statistical measure for the agreement among raters and there is an example notebook in Quepid you can use. To find it, select Notebooks in the top-level navigation and select the notebook Fleiss Kappa.ipynb in the examples folder. Conclusion Quepid empowers you to tackle even the most complex search relevance challenges and continues to evolve: as of version 8 Quepid supports AI-generated judgements , which is particularly useful for teams who want to scale their judgement generation process. Quepid workflows enable you to efficiently create judgement lists that are scalable, which ultimately results in search results that truly meet user needs.
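As a side note on the DCG@10 scorer used above, here is a small worked sketch of what such a metric computes. The grades are invented for illustration, and Quepid's own scorer may use a slightly different gain or discount variant:

```python
import math


def dcg_at_k(relevance_grades, k=10):
    """Discounted Cumulative Gain: graded relevance discounted by rank position.
    Uses the common (2^rel - 1) gain with a log2 positional discount."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)   # rank 0 -> discount log2(2) = 1
        for rank, rel in enumerate(relevance_grades[:k])
    )


# Hypothetical 0-3 judgements for the top four results of one query:
print(round(dcg_at_k([3, 1, 0, 2]), 3))        # ≈ 8.923
```

A highly relevant document at rank 1 dominates the score, which is exactly the positional weighting described above.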
With judgement lists established, you have a robust foundation for measuring search relevance, iterating on improvements, and driving better user experiences. As you move forward, remember that relevancy tuning is an ongoing process. Judgement lists allow you to systematically evaluate your progress, but they are most powerful when paired with experimentation, metric analysis, and iterative improvements. Further Reading Quepid docs: Relevancy is a Team Sport Quepid for Human Raters How to Connect Quepid to Elastic Cloud Quepid Github repository Meet Pete, a blog series on improving e-commerce search Relevance Slack : join the #quepid channel Partner with Open Source Connections to transform your search and AI capabilities and empower your team to continuously evolve them. Our proven track record spans the globe, with clients consistently achieving dramatic improvements in search quality, team capability, and business performance. Contact us today to learn more. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Prerequisites Log into Quepid Set up a Quepid Case Queries and Information Needs Define Information Needs in Quepid Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Creating Judgement Lists with Quepid - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/quepid-judgement-lists","meta_description":"Creating judgement lists in Quepid with a collaborative human rater process."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Excluding Elasticsearch fields from indexing Explaining how to configure Elasticsearch to exclude fields, why you might want to do this, and best practices to follow. How To KB By: Kofi Bartlett On May 12, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Elasticsearch, indexing refers to the process of storing and organizing data in a way that makes it easily searchable. While indexing all fields in a document can be useful in some cases, there are situations where you might want to exclude certain fields from being indexed. This can help improve performance, reduce storage costs, and minimize the overall size of your Elasticsearch index. In this article, we will discuss the reasons for excluding fields from indexing, how to configure Elasticsearch to exclude specific fields, and some best practices to follow when doing so. Reasons for excluding fields from indexing Performance: Indexing all fields in a document can lead to increased indexing time and slower search performance. By excluding fields that are not required for search or aggregation, you can improve the overall performance of your Elasticsearch cluster. Storage: Indexing fields consumes storage space. Excluding fields that are not needed for search or aggregation can help reduce the storage requirements of your Elasticsearch cluster. Index size: The size of an Elasticsearch index is directly related to the number of fields indexed. By excluding unnecessary fields, you can minimize the size of your index, which can lead to faster search and indexing performance. Configuring Elasticsearch to exclude fields To exclude a field from being indexed in Elasticsearch, you can use the “index” property in the field’s mapping. By setting the “index” property to “false”, Elasticsearch will not index the field, and it will not be searchable or available for aggregations. Here’s an example of how to exclude a field from indexing using the Elasticsearch mapping: In this example, we’re creating a new index called “my_index” with a single field called “field_to_exclude”. By setting the “index” property to “false”, we’re telling Elasticsearch not to index this field. The field will still be available in the source document, though. Best practices for excluding fields from indexing Analyze your data: Before excluding fields from indexing, it’s essential to analyze your data and understand which fields are necessary for search and aggregation. This will help you make informed decisions about which fields to exclude. Test your changes: When excluding fields from indexing, it’s crucial to test your changes to ensure that your search and aggregation functionality still work as expected. This can help you avoid any unexpected issues or performance problems. Monitor performance: After excluding fields from indexing, monitor the performance of your Elasticsearch cluster to ensure that your changes have had the desired effect. 
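As a rough sketch of the mapping example described earlier in this article — an index called "my_index" with a single field "field_to_exclude" whose "index" property is false — here is what it might look like with the Elasticsearch Python client (8.x-style API; the endpoint is a placeholder):

```python
from elasticsearch import Elasticsearch

# Placeholder endpoint/credentials - use your own connection details.
es = Elasticsearch("http://localhost:9200")

# "my_index" with a field whose "index" property is false: the value stays
# in _source but is neither searchable nor usable in aggregations.
es.indices.create(
    index="my_index",
    mappings={
        "properties": {
            "field_to_exclude": {"type": "text", "index": False}
        }
    },
)
```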
This can help you identify any additional optimizations that may be required. Use source filtering: If you need to store a field in Elasticsearch but don’t want it to be searchable or available for aggregations, consider using source filtering. This allows you to store the field in the _source field but exclude it from the index. Conclusion Excluding fields from indexing in Elasticsearch can help improve performance, reduce storage costs, and minimize the overall size of your index. By carefully analyzing your data and understanding which fields are necessary for search and aggregation, you can make informed decisions about which fields to exclude. Always test your changes and monitor the performance of your Elasticsearch cluster to ensure that your optimizations have the desired effect. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Reasons for excluding fields from indexing Configuring Elasticsearch to exclude fields Best practices for excluding fields from indexing Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Excluding Elasticsearch fields from indexing - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/excluding-elasticsearch-fields-from-indexing","meta_description":"Explaining how to configure Elasticsearch to exclude fields, why you might want to do this, and best practices to follow."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. 
We explore its performance under real-world conditions and the results from extensive stress testing. Elastic Cloud Serverless DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more On December 2, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. The advent of Elastic Cloud Serverless has reshaped how businesses can harness the power of Elasticsearch without the need to manage clusters, nodes, or resource scaling. A key innovation within Elastic Cloud Serverless is its autoscaling feature, which adapts to changes in workload and traffic in real-time. This post explores the technicalities behind autoscaling, the performance of Elastic Cloud Serverless under load, and the results from extensive stress testing. What is Elastic Cloud Serverless? Elastic Cloud Serverless offers an automated, managed version of Elasticsearch that scales based on demand. Unlike traditional Elasticsearch deployments, where users must provision and manage hardware or cloud instances, Elastic Cloud Serverless manages infrastructure scaling and resource allocation. This is particularly beneficial for organizations with variable workloads, where scaling infrastructure up and down manually can be cumbersome and error-prone. The system’s built-in autoscaling feature accommodates heavy ingestion tasks, search queries, and other operations without manual intervention. Elastic Cloud Serverless operates with two distinct tiers, the search and indexing tiers, each optimized for specific workloads. The search tier is dedicated to handling query execution, ensuring fast and efficient responses for search requests. Meanwhile, the indexing tier is responsible for ingesting and processing data, managing write operations, and ensuring data is properly stored and searchable. By decoupling these concerns, Elastic Cloud Serverless allows each tier to scale independently based on workload demands. This separation improves resource efficiency, as compute and storage needs for indexing (e.g., handling high-throughput ingestion) do not interfere with query performance during search operations. Similarly, search tier resources can be scaled up to handle complex queries or spikes in traffic without impacting the ingestion process. This architecture ensures optimal performance, cost-efficiency, and resilience, allowing Elastic Cloud Serverless to adapt dynamically to fluctuating workloads while maintaining consistent user experiences. You can read more about the architecture of Elastic Cloud Serverless in the following blog post . Stress testing Elastic Cloud Serverless Comprehensive stress tests assessed Elastic Cloud Serverless’s capability to handle large, fluctuating workloads. These tests were designed to measure the system’s ability to ingest data, handle search queries, and maintain performance under extreme conditions. It should be noted that the system can perform beyond what we present here, depending on factors such as client count and bulk index sizes. Here, we’ll walk through the approach and findings of these tests. Testing scope and approach The primary objective of our stress testing was to answer key questions: How well does Elastic Cloud Serverless handle large-scale ingestion and search queries with a high number of concurrent clients? Can it scale dynamically to accommodate sudden spikes in workload? Does the system maintain stability over extended periods? 
Stress testing a search use case In Elastic Cloud Serverless, you can choose from three project types: Elasticsearch, Observability, and Security. We began our stress test journey on search use cases for Elasticsearch, using a Github Archive dataset and simulating likely ingest and search behaviors. Before testing, we prepared the system by ingesting a base corpus of 186GB / 43 million documents. We then gradually added clients over ten minutes to allow Elasticsearch the time to scale appropriately. The data was ingested using Datastreams via the Bulk APIs. Stress testing the indexing tier. Firstly, let's talk about indexing data (ingest). Ingest autoscaling in Elastic Cloud Serverless dynamically adjusts resources to match data ingestion demands, ensuring optimal performance and cost-efficiency. The system continuously monitors metrics such as ingestion throughput, resource utilization (CPU, memory, and network), and response latencies. When these metrics exceed predefined thresholds, the autoscaler provisions additional capacity proportionally to handle current and anticipated demand while maintaining a buffer for unexpected spikes. The complexity of data pipelines and system-imposed resource limits also influences scaling decisions. By dynamically adding or removing capacity, ingest autoscaling ensures seamless scaling without manual intervention. In autoscaled systems like Elastic Cloud Serverless, where resource efficiency is optimized, there may be situations where a sudden, massive increase in workload exceeds the capacity of the system to scale immediately. In such cases, clients may receive HTTP 429 status codes, indicating that the system is overwhelmed. To handle these situations, clients should implement an exponential backoff strategy, retrying requests at progressively longer intervals. During stress testing, we actively track 429 responses to assess how the system reacts under high demand, providing valuable insights into autoscaling effectiveness.You can read a more in-depth blog post on how we autoscale indexing here . Now, let’s look at some of the results we encountered in our stress testing of the indexing tier. Indexing while scaling up: Corpus Bulk Size Actual Volume Indexing Period (minutes) Volume / hr Median Throughput (docs/s) 90th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 1TB 2500 1117.43 GB 63 1064.22 GB 70,256.96 7.095 0.05% 2TB 2500 2162.02 GB 122 1063.29 GB 68,365.23 8.148 0.05% 5TB 2500 5254.84 GB 272 1159.16 GB 74,770.27 7.46 0 For initial tests with 1TB and 2TB corpus , we achieved a throughput of 1064 GB/hr and 1063 GB/hr , respectively. For 5TB we achieved higher at 1160 GB / hr ingest , as we observed the ingest tier continued to scale up, providing a better throughput. Indexing while fully scaled: Clients Bulk Size Actual Volume Duration Volume / hr Median Throughput (docs/s) 99th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 3,000 2,000 1 TB 8 minutes 7.5 TB 499,000 33.5 0.0% When working with a maximally scaled indexing tier, ECS ingested 1TB of data in 8 minutes , at a rate of ~499K docs/s indexed per second. This equates to an extrapolated capacity of 180TB daily . Indexing from minimal scale to maximum scale: Clients Bulk Size Actual Volume Duration Volume / hr Median Throughput (docs/s) 99th PCT Indexing latency (seconds) Avg. 
% Error Rate (429s, other) 2,048 1,000 13 TB 6 hours 2.1 TB 146,478 55.5 1.55% During tests with 2TB of data , we gradually scaled up to 2048 clients and managed to ingest data at a rate of 146K docs/s , completing 2TB of data in 1 hour . Extrapolated, this would result in 48TB per day . 72-Hour Stability Test: Clients Bulk Size Actual Volume Indexing Period (hours) Volume / hr Median Throughput (docs/s) 99th PCT Indexing latency (seconds) Avg. % Error Rate (429s, other) 128 500 61 TB 72 ~868.6 GB 51,700 7.7 <0.05% In a 72-hour stability test , we ingested 60TB of data with 128 clients . Elasticsearch maintained an impressive 870GB/hr throughput with minimal error rates while scaling the indexing and search tiers. This demonstrated Elasticsearch’s ability to sustain high throughput over extended periods with low failure rates. Stress testing the search tier. Search tier autoscaling in Elastic Cloud Serverless dynamically adjusts resources based on dataset size and search load to maintain optimal performance. The system classifies data into two categories: boosted and non-boosted. Boosted data includes time-based documents (documents with an @timestamp field) within a user-defined boost window and all non-time-based documents, while non-boosted data falls outside this window. Users can set a boost window to define the time range for boosted data and select a search power level—On-demand, Performant, or High-throughput—to control resource allocation. You can read more about configuring Search Power & Search Boost Windows here . The autoscaler monitors metrics such as query latency, resource utilization (CPU and memory), and query queue lengths. When these metrics indicate increased demand, the system scales resources accordingly. This scaling is performed on a per-project basis and is transparent to the end user. Search stability under load: Corpus Actual Volume (from corpus tab) Duration Average Search Rate (req/s) Max Search Rate (req/s) Response Time (P50) Response Time (P99) 5TB 5254.84 GB 120 minutes 891 3,158 36 ms 316 ms With 5TB of data , we tested a set of 8 searches running over 2 hours, including complex queries, aggregations & ES|QL. Clients were ramped up from 4 to 64 clients per search. In total there were between 32 and 512 clients performing searches. Performance remained stable as the number of clients increased from 32 to 512. When running with 512 clients, we observed a search request rate of 3,158 queries per second with a P50 response time of 36ms . Throughout the test we observed the search tier scaling as expected to meet demand. 24-hour search stability test: Corpus Actual Volume Duration Average Search Rate (req/s) Max Search Rate (req/s) Response Time (P50) Response Time (P99) 40TB 60 TB 24 hours 183 250 192 ms 520 ms A set of 7 searches, aggregations, and an ES|QL query were used to query 40TB of (mainly) boosted data. The number of clients was ramped up from 1 to 12 per search, totaling 7 to 84 search clients. With Search Power set to balanced, we observed 192ms (P50) response time. You can read more about configuring Search Power & Search Boost Windows here . Concurrent index and search In tests that ran simultaneous indexing and searching , we aimed to ingest 5TB in 6 “chunks.” We ramped up from 24 to 480 clients ingesting data with a bulk size of 2500 documents. For search, clients were ramped up from 2 to 40 per search. In total, between 16 and 320 clients performed searches. 
We observed both tiers autoscaling and saw search latencies consistently around 24ms (p50) and 1359ms (p99). The system’s ability to index and search concurrently while maintaining performance is critical for many use cases. Conclusion The stress tests discussed above focused on a search use case in an Elasticsearch project designed with a specific configuration of field types, number of fields, clients, and bulk sizes. These parameters were tailored to evaluate Elastic Cloud Serverless under well-defined conditions relevant to the use case, providing valuable insights into its performance. However, it's important to note that the results may not directly reflect your workload, as performance depends on various factors such as query complexity, data structure, and indexing strategies. These benchmarks serve as a baseline, but real-world outcomes will vary depending on your unique use case and requirements. It should also be noted that these results do not represent an upper performance bound. The key takeaway from our stress testing is that Elastic Cloud Serverless demonstrates remarkable robustness. It can ingest hundreds of terabytes of data daily while maintaining strong search performance. This makes it a powerful solution for large-scale search workloads, ensuring reliability and efficiency at high data volumes. In upcoming posts, we will expand our exploration into stress testing Elastic Cloud Serverless for observability and security use cases, highlighting its versatility across different application domains and providing deeper insights into its capabilities. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Jump to What is Elastic Cloud Serverless? Stress testing Elastic Cloud Serverless Testing scope and approach Stress testing a search use case Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-serverless-performance-stress-testing","meta_description":"Dive into Elasticsearch Cloud Serverless, explore its performance under real-world conditions and see the results from extensive stress testing."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Inside Elastic Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Inside Elastic August 9, 2024 GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans This series gives you an inside look at how we’re using generative AI in customer support. Join us as we share our journey in designing a GenAI chatbot interface. 
IM By: Ian Moersen Inside Elastic July 22, 2024 GenAI for Customer Support — Part 2: Building a Knowledge Library This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! CM By: Cory Mangini Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Inside Elastic - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/inside-elastic","meta_description":"Inside Elastic articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Search Relevance Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance How To March 27, 2025 How to automate synonyms and upload using our Synonyms API Discover how LLMs can be used to identify and generate synonyms automatically, allowing terms to be programmatically loaded into the Elasticsearch synonym API. 
AL By: Andre Luiz Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Search Relevance Vector Database +1 March 18, 2025 Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. PS BT By: Peter Straßer and Benjamin Trent 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Search Relevance - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/search-relevance","meta_description":"Search Relevance articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. How To KB By: Kofi Bartlett On May 16, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Disk management is important in any database, and Elasticsearch is no exception. If you don’t have enough disk space available, Elasticsearch will stop allocating shards to the node. This will eventually prevent you from being able to write data to the cluster, with the potential risk of data loss in your application. On the other hand, if you have too much disk space, then you are paying for more resources than you need. Background on watermarks There are various “watermark” thresholds on your Elasticsearch cluster which help you track the available disk space. As the disk fills up on a node, the first threshold to be crossed will be the “low disk watermark”. The second threshold will then be the “high disk watermark threshold”. Finally, the “disk flood stage” will be reached. Once this threshold is passed, the cluster will then block writing to ALL indices that have one shard (primary or replica) on the node which has passed the watermark. Reads (searches) will still be possible. 
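The watermark thresholds described above are regular cluster settings, so they can be inspected and adjusted programmatically. Below is a minimal sketch using the official Python client (8.x); the endpoint and API key are placeholders, and the values shown happen to match Elasticsearch's defaults.

```python
# Sketch: inspecting and setting the disk watermark thresholds described above.
# Assumes the `elasticsearch` Python client (8.x) and a reachable cluster;
# the URL and API key are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Show the current (possibly default) watermark settings.
settings = es.cluster.get_settings(include_defaults=True, flat_settings=True)
for key, value in settings.get("defaults", {}).items():
    if "watermark" in key:
        print(key, "=", value)

# Set explicit thresholds (these match the defaults; percentages or absolute
# byte sizes such as "50gb" are both accepted).
es.cluster.put_settings(
    persistent={
        "cluster.routing.allocation.disk.watermark.low": "85%",
        "cluster.routing.allocation.disk.watermark.high": "90%",
        "cluster.routing.allocation.disk.watermark.flood_stage": "95%",
    }
)
```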
How to prevent and handle cases when disk is too full (over utilization) There are various methods for handling cases when your Elasticsearch disk is too full: Delete old data: Usually, data should not be kept indefinitely. One way to prevent and solve disk being too full is by ensuring that when data reaches a certain age, it gets reliably archived and deleted. One way to do this is to use ILM . Add storage capacity: If you cannot delete the data, you might want to add more data nodes or increase the disk sizes in order to retain all the data without negatively affecting performance. If you need to add storage capacity to the cluster, you should consider whether you need to add just storage capacity alone, or both storage capacity and also RAM and CPU resources in proportion (see section on ratio of disk size, RAM and CPU below). How to add storage capacity to your Elasticsearch cluster Increase the number of data nodes: Remember that the new nodes should be of the same size as existing nodes, and of the same Elasticsearch version. Increase the size of existing nodes: In cloud-based environments, it is usually easy to increase disk size and RAM/CPU on existing nodes. Increase only the disk size: In cloud-based environments, it is often relatively easy to increase disk size. Snapshot and restore : If you are willing to allow old data to be retrieved upon request in an automated process from backups, you can snapshot old indices, delete them and restore data temporarily upon request from the snapshots. Reduce replicas per shard: Another option to reduce data is to reduce the number of replicas of each shard. For high availability, you would like to have one replica per shard, but when data grows older, you might be able to work without replicas. This could usually work if the data is persistent, or you have a backup to restore if needed. Create alerts: In order to prevent disks from filling up in the future and act proactively, you should create alerts based on disk usage that will notify you when the disk starts filling up. How to prevent and handle cases when the disk capacity is underutilized If your disk capacity is underutilized, there are various options to reduce the storage volume on your cluster. How to reduce the storage volume on an Elasticsearch cluster There are various methods for how to reduce the storage volume of a cluster. 1. Reduce the number of data nodes If you want to reduce data storage and also reduce RAM and CPU resources in the same proportion, then this is the easiest strategy. Decommissioning unnecessary nodes is likely to provide the greatest cost savings. Before decommissioning the node, you should: Ensure that the node to be decommissioned is not necessary as a MASTER node. You should always have at least three nodes with the MASTER node role. Migrate the data shards away from the node to be decommissioned. 2. Replace existing nodes with smaller nodes If you cannot further reduce the number of nodes (usually 3 would be a minimum configuration), then you may want to downsize existing nodes. Remember that it is advisable to ensure that all data nodes are of the same RAM memory and disk size, since the shards balance on the basis of number of shards per node. The process would be: Add new, smaller nodes to the cluster Migrate the shards away from the nodes to be decommissioned Shut down the old nodes 3. 
Reduce disk size on nodes If you ONLY want to reduce disk size on the nodes without changing the cluster’s overall RAM or CPU, then you can reduce the disk size for each node. Reducing disk size on an Elasticsearch node is not a trivial process. The easiest way to do so would usually be to: Migrate shards from the node Stop the node Mount a new data volume to the node with appropriate size Copy all data from old disk volume to new volume Detach old volume A Start node and migrate shards back to node This requires that you have sufficient capacity on the other nodes to temporarily store the extra shards from the node during this process. In many cases, the cost of managing this process may exceed the potential savings in disk usage. For this reason, it may be simpler to replace the node altogether with a new node with the desired disk size (see “Replace existing nodes with smaller nodes” above). When paying for unnecessary resources, cost can obviously be reduced by optimizing your resource utilization. The relationship between disk size, RAM and CPU The ideal ratio of disk capacity to RAM in your cluster will depend on your particular use case. For this reason, when considering changes to your storage capacity, you also should consider whether your current Disk/RAM/CPU ratios are suitably balanced and whether as a consequence you also need to add/reduce RAM/CPU in the same proportion. RAM and CPU requirements depend on the volume of indexing activity, the number and type of queries, and also the amount of data that is being searched and aggregated. This is often in proportion to the amount of data being stored on the cluster, and therefore should also be related to disk size. The ratio between the disk capacity and the RAM can change based on the use case. See a few examples here: Index activity Retention Search activity Disk capacity RAM Enterprise search app Moderate log ingestion Long Light 2TB 32GB App monitoring Intensive log ingestion Short Light 1TB 32GB E-commerce Light data indexing Indefinite Heavy 500GB 32GB Remember that modifying the configuration of node machines must be done with care, since it may involve node downtime and you need to ensure that shards do not start to migrate to your other already over-stretched nodes. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 14, 2025 Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. 
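The disk-management article above suggests ILM as one way to reliably delete aging data. As a rough illustration only, here is a sketch of an ILM policy with a delete phase attached to new indices via an index template; the policy, template, and index-pattern names are made up for the example, and your retention period will differ.

```python
# Sketch: an ILM policy that deletes indices after 30 days, as mentioned in the
# "Delete old data" option above. Names and the retention period are illustrative.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# Delete phase only: indices are removed once they are 30 days old.
es.ilm.put_lifecycle(
    name="logs-30d-retention",
    policy={
        "phases": {
            "delete": {"min_age": "30d", "actions": {"delete": {}}}
        }
    },
)

# Attach the policy to newly created indices matching a pattern.
es.indices.put_index_template(
    name="logs-template",
    index_patterns=["logs-*"],
    template={"settings": {"index.lifecycle.name": "logs-30d-retention"}},
)
```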
KB By: Kofi Bartlett Jump to Background on watermarks How to prevent and handle cases when disk is too full (over utilization) How to add storage capacity to your Elasticsearch cluster How to prevent and handle cases when the disk capacity is underutilized How to reduce the storage volume on an Elasticsearch cluster Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to optimize Elasticsearch disk space and usage - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/optimize-elasticsearch-disk-space-and-usage","meta_description":"Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. Integrations EZ FB By: Enrico Zimuel and Florian Bernd On May 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In collaboration with the Microsoft Semantic Kernel team, we are announcing the availability of hybrid search capabilities in the .NET Elasticsearch Semantic Kernel connector – the first vector database to implement this capability. Microsoft Semantic Kernel recently announced support of hybrid search use cases, which opened the door for customers to use Elasticsearch for a broader set of applications. Elasticsearch has supported hybrid search since version 8.8.0, and in this article we will walk through how to use hybrid search with Elasticsearch and Semantic Kernel. You can find the latest version of the Elasticsearch Semantic Kernel connector with Hybrid Search support here . If you are not familiar with the Elasticsearch integration in Semantic Kernel for .NET, we suggest reading this article that we previously published. What is Hybrid Search? Hybrid Search is a powerful information retrieval strategy that combines two or more search techniques into a search algorithm. A typical use case is the combination of lexical search (i.e. BM25 ) combined with semantic search (i.e. kNN ). By running these two strategies in parallel customers can get the most significant results, which feeds into better answer quality overall (Figure 1). Figure 1: Hybrid search as the intersection between lexical and semantic search In order to combine the results we can use different strategies. 
Each result in Elasticsearch produces a list of relevant documents, ordered by a score value. A score is a floating point number that represents the relevance of a document. Higher numbers mean better relevance. If we have two lists of results, one coming from lexical and another from semantic how can we combine them? One strategy is to use the Reciprocal Rank Fusion (RRF) algorithm. This algorithm rearranges the score of each document using the following algorithm: Where: k is a ranking constant q is a query in the set of queries (e.g. lexical and semantic) d is a document in the result set of q result(q) is the result set of q rank(result(q), d) is the position (ranking) of document d in the results of query q For instance, imagine we run a Hybrid Search query to get the top-3 significant documents. We use lexical and semantic queries and we set k=1. The results for lexical query are (in order): Doc4 Doc3 Doc2 Doc1 That means the most relevant document is Doc4 followed by Doc3, Doc2 and Doc1. The results for semantic query are (in order): Doc3 Doc2 Doc1 Doc5 We can then calculate the RRF scores using the previous algorithm. In the following table, we calculated the scores for the lexical and semantic results, and then summed the two values to obtain the final RRF score. Documents Lexical Semantic RRF Doc1 1/(1+4) 1/(1+3) ⅕ + ¼ = 0.4500 Doc2 1/(1+3) 1/(1+2) ¼ + ⅓ = 0.5833 Doc3 1/(1+2) 1/(1+1) ⅓ + ½ = 0.8333 Doc4 1/(1+1) 0 ½ = 0.5 Doc5 0 1/(1+4) ⅕ = 0.2 Sorting the RRF scores gives us the following results: Doc3 Doc2 Doc4 Doc1 Doc5 Finally, the top-3 results are: Doc3, Doc2 and Doc4. The RRF algorithm is used by default with the hybrid search Elasticsearch integration for Semantic Kernel. The Hybrid Search integration in Semantic Kernel The latest version of the Elasticsearch Semantic Kernel connector implements the brand new IHybridSearch interface in the ElasticsearchVectorStoreRecordCollection type. This interface extends the existing functionality with a new method that looks like this: Where: vector is the TVector for the semantic search (using kNN); keywords contain a collection of strings to be used in the lexical search terms query of Elasticsearch (the terms in the collection are treated as OR conditions); top indicates the maximum number of documents to return; options options like e.g. the vector property/field to use for the vector search operation, the property/field to use for the lexical search operation, or an additional pre-filter specified in .NET expression tree syntax; cancellationToken the CancellationToken used to cancel the asynchronous operation; For instance, imagine we reuse the hotel dataset introduced in the previous article How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development . We can execute an Hybrid Search query to retrieve the top-5 hotels containing the keywords “downtown” or “luxury” combined with a semantic search using the vector {1, 2, 3}: If we want to apply a filter before executing the Hybrid Search, we can do that by using the HybridSearchOptions . For instance, imagine we want to consider only the hotels that are beachfront, we can add a filter using the expression Filter = x => x.Description.Contains(\"beachfront\") as follows: In this way, the search will consider only the beachfront hotels and then apply the previous Hybrid Search criteria (hint: expression tree-based filtering is also available for the regular vector search in Semantic Kernel). 
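The RRF calculation above is easy to reproduce. The short Python sketch below recomputes the worked example (k=1, the lexical and semantic result lists shown earlier) and yields the same scores and final ordering as the table.

```python
# Reproducing the worked RRF example above: k=1, two ranked result lists.
from collections import defaultdict

def rrf(result_lists, k=1):
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over every list a doc appears in."""
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc in enumerate(results, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

lexical = ["Doc4", "Doc3", "Doc2", "Doc1"]
semantic = ["Doc3", "Doc2", "Doc1", "Doc5"]

for doc, score in rrf([lexical, semantic], k=1):
    print(f"{doc}: {score:.4f}")
# Doc3: 0.8333, Doc2: 0.5833, Doc4: 0.5000, Doc1: 0.4500, Doc5: 0.2000
```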
The support for expression tree-based filtering in recent versions of Semantic Kernel is a nice improvement over the previous filtering API. Right now, the Elasticsearch Semantic Kernel connector only supports comparison (=, !=, <, <=, >, >=) and boolean (!, &&, ||) operators. More operations like collection.Contains() will be implemented soon. Hybrid search for .NET apps, with Elasticsearch and Semantic Kernel In this article, we showed how to use Semantic Kernel’s Hybrid Search features with Elasticsearch integration. We illustrated how to combine lexical and semantic search to improve the retrieval results. This technique can be used for improving information retrieval systems, such as Retrieval-augmented generation (RAG). Moreover, we also looked at applying pre-filtering using the HybridSearchOptions object. The filtering condition can be expressed using the .NET expression tree syntax. While Reciprocal Rank Fusion provides a robust default for combining lexical and semantic scores in hybrid search—as we saw in this blog with Semantic Kernel, Elasticsearch also more broadly supports other retriever styles . This includes options like the Linear Retriever , providing simple customization of combination strategies beyond the RRF default, enabling users to fine-tune search relevance with hybrid approaches. In the future, we will continue to expand support for Semantic Kernel with the latest features within Elasticsearch. Happy (hybrid) searching! Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Jump to What is Hybrid Search? The Hybrid Search integration in Semantic Kernel Hybrid search for .NET apps, with Elasticsearch and Semantic Kernel Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"First to hybrid search: with Elasticsearch and Semantic Kernel - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/hybrid-search-support-elasticsearch-vector-database-semantic-kernel","meta_description":"Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Search Analytics Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Analytics January 10, 2025 Filtering in ES|QL using full text search 8.17 included match and qstr functions in ES|QL, that can be used to perform full text filtering. 8.18 removed limitations on their usage. This article describes what they do, how they can be used, the difference with the existing text filtering methods, current limitations and future improvements. CD By: Carlos Delgado Search Analytics How To June 10, 2024 Storage wins for time-series data in Elasticsearch Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency. MG KK By: Martijn Van Groningen and Kostas Krikellas Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Search Analytics - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/search-analytics","meta_description":"Search Analytics articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. How To KB By: Kofi Bartlett On May 14, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch is designed to be a distributed system that can handle a large amount of data and provide high availability. One of the key features that enable this is the concept of index replication, which is controlled by the number_of_replicas setting. This article will delve into the details of this setting, its implications, and how to properly configure it. 
The role of replicas in Elasticsearch In Elasticsearch, an index is a collection of documents that are partitioned across multiple primary shards. Each primary shard is a self-contained Apache Lucene index, and the documents within an index are distributed among all primary shards. To ensure high availability and data redundancy, Elasticsearch allows each shard to have one or more copies, known as replicas. The number_of_replicas setting controls the number of replica shards (copies) that Elasticsearch creates for each primary shard in an index. By default, Elasticsearch creates one replica for each primary shard, but this can be changed according to the requirements of your system. Configuring the number_of_replicas The number_of_replicas setting can be configured at the time of index creation or updated later. Here’s how you can set it during index creation: In this example, Elasticsearch will create two replicas for each primary shard in the my_index index. To update the number_of_replicas setting for an existing index, you can use the _settings API: This command will update the my_index index to have three replicas for each primary shard. Implications of the number_of_replicas setting The number_of_replicas setting has a significant impact on the performance and resilience of your Elasticsearch cluster . Here are some key points to consider: Data Redundancy and Availability: Increasing the number_of_replicas enhances the availability of your data by creating more copies of each shard. If a node fails, Elasticsearch can still serve data from the replica shards on the remaining nodes . Search Performance: Replica shards can serve read requests, so having more replicas can improve search performance by distributing the load across more shards. Write Performance: However, each write operation must be performed on every copy of a shard. Therefore, a higher number_of_replicas can slow down indexing performance as it increases the number of operations that must be performed for each write. Storage Requirements: More replicas mean more storage space. You should ensure that your cluster has enough capacity to store the additional replicas. Resilience to Node Failure: The number_of_replicas should be set considering the number of nodes in your cluster. If the number_of_replicas is equal to or greater than the number of nodes, your cluster can tolerate the failure of multiple nodes without data loss. Best practices for setting number_of_replicas The optimal number_of_replicas setting depends on the specific requirements of your system. However, here are some general best practices: For a single-node cluster, number_of_replicas should be set to 0, as there are no other nodes to hold replicas. For a multi-node cluster, number_of_replicas should be set to at least 1 to ensure data redundancy and high availability. If search performance is a priority, consider increasing the number_of_replicas . However, keep in mind the trade-off with write performance and storage requirements. Always ensure that your cluster has enough capacity to store the additional replicas. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
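The two code snippets the number_of_replicas article refers to ("set it during index creation" and "the _settings API") did not survive page extraction. Here is a minimal sketch of what those calls typically look like with the Python client, reusing the article's my_index example; shard counts are illustrative.

```python
# Sketch of the two number_of_replicas operations described above (Python client 8.x).
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

# At index creation: two replicas per primary shard.
es.indices.create(
    index="my_index",
    settings={"number_of_shards": 3, "number_of_replicas": 2},
)

# Later, update the existing index to three replicas per primary shard.
es.indices.put_settings(
    index="my_index",
    settings={"number_of_replicas": 3},
)
```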
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to The role of replicas in Elasticsearch Configuring the number_of_replicas Implications of the number_of_replicas setting Best practices for setting number_of_replicas Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch Index Number_of_Replicas - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-index-number-of_replicas","meta_description":"Explaining how to configure the number_of_replicas, its implications and best practices.\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Generative AI Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. 
JW By: James Williams Generative AI How To March 17, 2025 How to optimize RAG retrieval in Elastisearch with DeepEval Learn how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval. KV By: Kritin Vongthongsri Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Generative AI How To March 11, 2025 Building a Multimodal RAG system with Elasticsearch: The story of Gotham City Learn how to build a Multimodal Retrieval-Augmented Generation (RAG) system that integrates text, audio, video, and image data to provide richer, contextualized information retrieval. AS By: Alex Salgado Generative AI Search Relevance +1 March 5, 2025 How to build autocomplete feature on search application automatically using LLM generated terms Learn how to enhance your search application with an automated autocomplete feature in Elastic Cloud using LLM-generated terms for smarter, more dynamic suggestions. MS By: Michael Supangkat Generative AI Integrations +1 February 26, 2025 Embeddings and reranking with Alibaba Cloud AI Service Using Alibaba Cloud AI Service features with Elastic. TM By: Tomás Murúa 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Generative AI - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/generative-ai","meta_description":"Generative AI articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Elastic Cloud Hosted Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Vector Database Generative AI +2 May 21, 2024 Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs. 
YG HM By: Yuvraj Gupta and Hemant Malik Vector Database Generative AI +1 May 21, 2024 Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here. SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta Vector Database Generative AI +1 April 25, 2024 Elastic Cloud adds Elasticsearch Vector Database optimized instance to Google Cloud Elasticsearch's vector search optimized profile for GCP is available. Learn more about it and how to use it in this blog. SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Cloud Hosted - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/elastic-cloud-hosted","meta_description":"Elastic Cloud Hosted articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. Search Relevance VB By: Vincent Bosc On April 11, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Sparse vectors are a key component in ELSER , but their usefulness extends far beyond that. In this post, we’ll explore how sparse vectors can enhance search relevance in an e-commerce setting: boosting documents based on search behavior (like clicks) and user preferences. What exactly are sparse vectors? Vector search is a hot topic right now, but most conversations focus on dense vectors: compact numerical representations used in machine learning and neural search. Sparse vectors, on the other hand, take a different path. Unlike dense vectors that pack data tightly, sparse vectors store information in a more interpretable and structured format, often with many zeros. Though less hyped, they can be incredibly powerful in the right context. 💡 Fun fact: Sparse vectors and inverted indexes both leverage sparsity to efficiently represent and retrieve information. In Elasticsearch, you can store sparse vectors using the sparse_vector field type : no surprises there. Querying with sparse vectors Searching with sparse vectors in Elasticsearch feels similar to traditional keyword search, but with a twist. Rather than matching terms directly, sparse vector queries use weighted terms and the dot product to score documents based on how well they align with the query vector. 
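To make the sparse-vector storage and querying described above concrete, here is a rough sketch assuming Elasticsearch 8.15+ (where the sparse_vector query accepts a precomputed query_vector of token weights); the index, field, token names, and weights are all illustrative.

```python
# Sketch: a sparse_vector field holding hand-assigned token weights, queried by
# dot product against a precomputed query vector. Names/weights are illustrative;
# assumes Elasticsearch 8.15+ for the sparse_vector query with `query_vector`.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

es.indices.create(
    index="products",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "boost_terms": {"type": "sparse_vector"},
        }
    },
)

es.index(
    index="products",
    id="1",
    document={
        "title": "PlayStation 5 game console",
        "boost_terms": {"playstation": 3.0, "game_console": 2.0},
    },
)
es.indices.refresh(index="products")

response = es.search(
    index="products",
    query={
        "sparse_vector": {
            "field": "boost_terms",
            "query_vector": {"playstation": 1.0},
        }
    },
)
print([hit["_source"]["title"] for hit in response["hits"]["hits"]])
```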
Use case 1: Signal boosting for better search ranking Signal boosting refers to emphasizing certain features or terms to improve search ranking. This is especially useful when business logic or user behavior suggests that some results should appear higher. Let’s say we’re working with a simple e-commerce index: Now, let’s index two documents only using traditional full text type: A basic search for “playstation” will return the controller first, not because it’s more relevant, but because BM25, the default lexical scoring algorithm, tends to favor shorter fields, causing the controller’s concise title to rank higher. But we want to boost the console result, especially since it has a special offer! One way to do this is by embedding boosting signals directly into the document via sparse vectors: This document now carries extra weight for the search queries “playstation” and “game console”. We can adjust our query to incorporate this sparse vector boost: Thanks to the added score from the sparse vector match, the console now ranks above the controller, which is exactly what we want! This approach offers an alternative to traditional boosting techniques, such as function_score queries or field-level weight tuning. By storing boosting information directly in the document using sparse vectors, you gain more flexibility and transparency in how relevance is adjusted. It also decouples business logic from query logic. However, it’s worth noting the tradeoffs: traditional boosting can be simpler to implement for straightforward use cases and may have performance advantages in some scenarios. Sparse vectors shine when you need fine-grained, multi-dimensional control over boosting. Reminder : The must clause filters and contributes to scoring, while the should clause adds to the score if the condition matches. Use case 2: Personalization using sparse vectors Sparse vectors also enable personalization. You can assign weights to customer traits or personas and use them to surface the most relevant products for individual users. Here’s an example: Let’s say Jim is a customer who prefers healthy, sustainable options: We can tailor the search experience to reflect Jim’s preferences: As a result, the healthier snack bar floats to the top of the search results because that’s what Jim is more likely to buy. This method of personalization via sparse vectors builds on ideas like static segment tags, but makes them more dynamic and expressive. Instead of assigning a user to a single segment like \"tech-savvy\" or \"healthy-conscious\", sparse vectors allow you to represent multiple affinities with varying weights, all in a way that integrates directly into the search ranking process. Using a function_score query to incorporate user preferences is a flexible alternative for personalization, but it can become complex and difficult to maintain as logic grows. Another common approach, collaborative filtering , relies on external systems to compute user-item similarities and typically requires additional infrastructure. Learning to Rank (LTR) can also be applied to personalization , offering powerful ranking capabilities, but it demands a high level of maturity, both in terms of feature engineering and model training. Wrapping up Sparse vectors are a versatile addition to your search toolbox. We’ve covered just two practical examples: boosting search results and personalizing based on user profiles. But the possibilities are broad. 
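For the signal-boosting use case above, the combined query the article describes can be sketched as a bool query: the lexical match sits in the must clause (it filters and contributes to the score), and the sparse-vector boost is added through a should clause. This reuses the illustrative products index and boost_terms field from the earlier sketch and again assumes Elasticsearch 8.15+.

```python
# Sketch of the boosting query pattern from use case 1 above: lexical match in
# `must`, sparse-vector boost in `should`. Index/field/token names are illustrative.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

response = es.search(
    index="products",
    query={
        "bool": {
            "must": [{"match": {"title": "playstation"}}],
            "should": [
                {
                    "sparse_vector": {
                        "field": "boost_terms",
                        "query_vector": {"playstation": 1.0},
                    }
                }
            ],
        }
    },
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```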
By embedding structured, weighted information directly into your documents, you unlock smarter, more relevant search experiences with minimal complexity. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to What exactly are sparse vectors? Querying with sparse vectors Use case 1: Signal boosting for better search ranking Use case 2: Personalization using sparse vectors Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Enhancing relevance with sparse vectors - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-sparse-vector-boosting-personalization","meta_description":"Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here. Vector Database Generative AI Elastic Cloud Hosted SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta On May 21, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . 
To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elastic Cloud Vector Search optimized hardware profile is available for Elastic Cloud on Microsoft Azure users. This hardware profile is optimized for applications that use Elasticsearch as a vector database to store dense or sparse embeddings for search and Generative AI use cases powered by RAG (retrieval augmented generation). Vector Search optimized hardware profile: what you need to know Elastic Cloud users benefit from having Elastic managed infrastructure across all major cloud providers (Azure, GCP and AWS) along with wide region support for Microsoft Azure users. This release follows the previous announcement of a Vector Search optimized hardware profile for GCP . AWS users have had access to the Vector Search optimized profile since November 2023. For more specific details on the instance configuration for this Azure hardware profile, refer to our documentation for instance type: azure.es.datahot.lsv3 Vector Search, HNSW, and Memory Elasticsearch uses the Hierarchical Navigable Small World graph (HNSW) data structure to implement its Approximate Nearest Neighbor search (ANN). Because of its layered approach, HNSW's hierarchical aspect offers excellent query latency. To be most performant, HNSW requires the vectors to be cached in the node's memory. This caching is done automatically and uses the available RAM not taken up by the Elasticsearch JVM. Because of this, memory optimizations are important steps for scalability. Consult our vector search tuning guide to determine the right setup for your vector search embeddings and whether you have adequate memory for your deployment. With this in mind, the Vector Search optimized hardware profile is configured with a smaller than standard Elasticsearch JVM heap setting. This provides more RAM for caching vectors on a node, allowing users to provision fewer nodes for their vector search use cases. If you’re using compression techniques like scalar quantization , the memory requirement is lowered by a factor of 4 . To store quantized embeddings (available in versions Elasticsearch 8.12 and later) simply ensure that you’re storing in the correct element_type: byte . To utilize our automatic quantization of float vectors update your embeddings to use index type: int8_hnsw like in the following mapping example. In upcoming versions, Elasticsearch will provide this as the default mapping, removing the need for users to adjust their mapping. For further reading, we provide an evaluation of scalar quantization in Elasticsearch in this blog . Combining this optimized hardware profile with Elasticsearch’s automatic quantization are two examples where Elastic is focused on vector search and our vector database to be cost-effective while still being extremely performant. Getting started with Elastic Cloud Vector Search optimized hardware profile Start a free trial on Elastic Cloud and simply select the new Vector Search optimized profile to get started. Migrating existing Elastic Cloud deployments Migrating to this new Vector Search optimized hardware profile is a few clicks away. Simply navigate to your Elastic Cloud management UI, click to manage the specific deployment, and edit the hardware profile. In this example, we are migrating from a ‘Storage optimized’ profile to the new ‘Vector Search’ optimized profile. 
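The mapping example the article above points to (automatic quantization via index type int8_hnsw) was lost in extraction. Below is a minimal sketch of a dense_vector mapping that opts in to int8 scalar quantization, assuming Elasticsearch 8.12+; the index name, field names, and 768 dimensions are illustrative.

```python
# Sketch: a dense_vector field with automatic int8 scalar quantization
# (Elasticsearch 8.12+). Names and dimensions are illustrative.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

es.indices.create(
    index="docs-quantized",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "int8_hnsw"},
            },
            # Alternative for embeddings you have already quantized yourself:
            # "embedding_bytes": {"type": "dense_vector", "dims": 768, "element_type": "byte"},
        }
    },
)
```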
When choosing to do so, there is a small reduction to the available storage, but what is gained is the ability to store more vectors per memory with vector search at a lower cost. Migrating to a new hardware profile uses the grow and shrink approach for deployment changes. This approach adds new instances, migrates data from old instances to the new ones, and then shrinks the deployment by removing the old instances. This approach allows for high availability during configuration changes even for single availability zones. The following image shows a typical architecture for a deployment running in Elastic Cloud, where vector search will be the primary use case. This example deployment uses our new Vector Search optimized hardware profile, now available in Azure. This setup includes: Two data nodes in our hot tier with our vector search profile One Kibana node One Machine Learning node One integration server One master tiebreaker By deploying these two “full-sized” data nodes with the Vector Search optimized hardware profile and while taking advantage of Elastic’s automatic dense vector scalar quantization , you can index roughly 60 million vectors, including one replica (with 768 dimensions). Conclusion Vector search is a powerful tool when building modern search applications, be it for semantic document retrieval on its own or integrating with an LLM service provider in a RAG setup . Elasticsearch provides a full-featured vector database natively integrated with a full-featured search platform. Along with improving vector search feature set and usability, Elastic continues to improve scalability. The vector search node type is the latest example, allowing users to scale their search application. Elastic is committed to providing scalable, price effective infrastructure to support enterprise grade search experiences. Customers can depend on us for reliable and easy to maintain infrastructure and cost levers like vector compression, so you benefit from the lowest possible total cost of ownership for building search experiences powered by AI. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. 
US By: Ugo Sangiorgi Jump to Vector Search optimized hardware profile: what you need to know Vector Search, HNSW, and Memory Getting started with Elastic Cloud Vector Search optimized hardware profile Migrating existing Elastic Cloud deployments Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-vector-profile-azure","meta_description":"Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. Search Relevance Vector Database How To PS BT By: Peter Straßer and Benjamin Trent On March 18, 2025 Part of Series The ColPali model series Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. When building search applications, we often need to deal with documents that have complex structures—tables, figures, multiple columns, and more. Traditionally, this meant setting up complicated retrieval pipelines including OCR (optical character recognition), layout detection, semantic chunking, and other processing steps. In 2024, the model ColPali was introduced to address these challenges and simplify the process. Source: https://huggingface.co/blog/manu/colpali From Elasticsearch version 8.18 onwards, we added support for late-interaction models such as ColPali as a tech preview feature. In this blog, we will take a look at how we can use ColPali to search through documents in Elasticsearch. Does this work? While we have many benchmarks that are based on previously cleaned up text data to compare different retrieval strategies, the authors of the ColPali paper argue that real-world data in many organizations is messy and not always available in a nice, cleaned-up format. Example documents from the ColPali paper: https://arxiv.org/pdf/2407.01449 To better represent these scenarios, the ViDoRe benchmark was released alongside the ColPali model. This benchmark includes a diverse set of document images from sectors such as government, healthcare, research, and more. 
A range of different retrieval methods, including complex retrieval pipelines or image embedding models, were compared with this new model. The following table shows that ColPali performs exceptionally well on this dataset and is able to retrieve relevant information from these messy documents reliably. Source: https://arxiv.org/pdf/2407.01449 Table 2 How does it work? As teased in the beginning, the idea of ColPali is to just embed the image instead of extracting the text via complicated pipelines. ColPali builds on the vision capabilities of the PaliGemma model and the late-interaction mechanism introduced by ColBERT. Source: https://arxiv.org/pdf/2407.01449 Figure 1 Let’s first take a look at how we index our documents. Instead of converting the document into a textual format, ColPali processes documents by dividing a screenshot into small rectangles and converts each into a 128-dimensional vector. This vector represents the contextual meaning of this patch within the document. In practice, a 32x32 grid generates 1024 vectors per document. For our query, the ColPali model creates a vector for each token. To score documents during search, we calculate the distance between each query vector and each document vector. We keep only the highest score per query vector and sum those scores for a final document score. Late interaction mechanism for scoring ColBERT Interpretability Vector search with bi-encoders struggle with the fact that the results are sometimes not very interpretable—meaning we don’t know why a document matched. Late interaction models are different: we know how well each document vector matches our query vectors, therefore we can determine where and why a document matches. A heatmap of where the word “hour” matches in this document. Source: https://arxiv.org/pdf/2407.01449 Searching with ColPali in Elasticsearch We will be taking a subset of the ViDoRe test set to take a look at how to index documents with ColPali in Elasticsearch. The full code examples can be found on GitHub . To index the document vectors, we will be defining a mapping with the new rank_vectors field. We now have an index ready to be searched full of ColPali vectors. To score our documents, we can use the new maxSimDotProduct function. Conclusion ColPali is a powerful new model that can be used to search complex documents with high accuracy. Elasticsearch makes it easy to use as it provides a fast and scalable search solution. Since the initial release, other powerful iterations such as ColQwen have been released. We encourage you to try these models for your own search applications and see how they can improve your results. Before implementing what we covered here in production environments, we highly recommend that you check out part 2 of this article. Part 2 explores advanced techniques, such as bit vectors and token pooling, which can optimize resource utilization and enable effective scaling of this solution. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. 
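The mapping and scoring snippets the ColPali article refers to (the rank_vectors field and the maxSimDotProduct function) were stripped during extraction. Here is a hedged sketch of how they fit together as of the 8.18 tech preview, with index and field names made up for the example and tiny 4-dimensional vectors standing in for real ColPali output (roughly 1024 vectors of 128 dimensions per page).

```python
# Sketch: indexing ColPali-style multi-vectors in a rank_vectors field and scoring
# with maxSimDotProduct in a script_score query (Elasticsearch 8.18+ tech preview).
# Index/field names and the tiny vectors are illustrative stand-ins.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="YOUR_API_KEY")

es.indices.create(
    index="colpali-demo",
    mappings={"properties": {"page_vectors": {"type": "rank_vectors", "dims": 4}}},
)

es.index(
    index="colpali-demo",
    id="1",
    document={"page_vectors": [[0.1, 0.2, 0.0, 0.4], [0.3, 0.1, 0.2, 0.0]]},
)
es.indices.refresh(index="colpali-demo")

# One vector per query token; in practice these come from the ColPali model.
query_vectors = [[0.2, 0.1, 0.0, 0.3], [0.0, 0.4, 0.1, 0.1]]

response = es.search(
    index="colpali-demo",
    query={
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": "maxSimDotProduct(params.query_vector, 'page_vectors')",
                "params": {"query_vector": query_vectors},
            },
        }
    },
)
print(response["hits"]["hits"][0]["_score"])
```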
DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to Does this work? How does it work? Interpretability Searching with ColPali in Elasticsearch Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Searching complex documents with ColPali - part 1 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastiacsearch-colpali-document-search","meta_description":"Learn about ColPali and explore how to use it to search through complex documents in Elasticsearch, including tables, figures, multiple columns & more."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. Search Relevance Vector Database How To PS BT By: Peter Straßer and Benjamin Trent On March 20, 2025 Part of Series The ColPali model series Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our previous blog on ColPali , we explored how to create visual search applications with Elasticsearch. We primarily focused on the value that models such as ColPali bring to our applications, but they come with performance drawbacks compared to vector search with bi-encoders such as E5. Building on the examples from part 1 , this blog explores how to use different techniques and Elasticsearch's powerful vector search toolkit in order to make late interaction vectors ready for large-scale production workloads. The full code examples can be found on GitHub. Problem ColPali creates over 1000 vectors per page for the documents in our index. 
This results in two challenges when working with late interaction vectors: Disk space: Storing all these vectors on disk incurs a serious amount of storage usage, which will be expensive at scale. Computation: When ranking our documents with the maxSimDotProduct() comparison, we need to compare all of these vectors for each of our documents with the N vectors of our query. Let’s look at some techniques for addressing these issues. Bit vectors In order to reduce disk space, we can compress the image vectors into bit vectors. We can use a simple Python function to transform our multi-vectors into bit vectors (a sketch of such a function appears at the end of this section): The function's core concept is straightforward: values above 0 become 1, and values below 0 become 0. This results in an array of 0s and 1s, which we then transform into a hexadecimal string representing our bit vector. For our index mapping, we set the element_type parameter to bit : After having written all of our new bit vectors to our index, we can rank our bit vectors using the following code: Trading off a bit of accuracy, this allows us to use hamming distance ( maxSimInvHamming(...) ), which is able to leverage optimizations such as bit-masks, SIMD, etc. Learn more about bit vectors and hamming distance in our blog . Alternatively, we can leave the query vectors unquantized and search with the full-fidelity late interaction vectors: This will compare our vectors using an asymmetric similarity function. Let’s think about a regular hamming distance between two bit vectors. Suppose we have a document vector D: And a query vector Q: Simple binary quantization will transform D into 10101101 and Q into 11111011 . For hamming distance, we need direct bit math—it's extremely fast. In this case, the hamming distance is 01010110 , which is 86 . So, scoring then becomes the inverse of that hamming distance. Remember, more similar vectors have a SMALLER hamming distance, so inverting it allows for more similar vectors to be scored higher. Specifically here, the score would be 0.012 . However, note how we lose the magnitude of each dimension. A 1 is a 1 . So, for Q , the difference between 0.01 and 0.79 disappears. Since we are simply quantizing according to >0 , we can do a small trick where the Q vector isn’t quantized. This doesn’t allow for the extremely fast bitwise math, but it does keep the storage cost low as D is still quantized. In short, this retains the information provided in Q , thus increasing the distance estimation quality and keeping the storage low. Using bit vectors allows us to save significantly on disk space and computational load at query time. But there is more that we can do. Average vectors To scale our search across hundreds of thousands of documents, even the performance benefits that bit vectors give us will not be enough. In order to scale to these types of workloads, we will want to leverage Elasticsearch’s HNSW index structure for vector search. ColPali generates around a thousand vectors per document, which is too many to add to our HNSW graph. Therefore, we need to reduce the number of vectors. To do this, we can create a single representation of the document's meaning by taking the average of all the document vectors produced by ColPali when we embed our image. We simply take the average vector over all late interaction vectors. As of now, this is not possible within Elasticsearch itself and we will need to preprocess the vectors before ingesting them in Elasticsearch. 
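The post's own snippets for these two preprocessing steps are not reproduced in the text above, so here is a minimal numpy sketch of what they could look like, assuming 128-dimensional ColPali patch vectors as described earlier in the series; the function names and packing details are illustrative rather than taken from the post.

```python
import numpy as np

def to_bit_hex(multi_vector):
    """Binary-quantize each patch vector (values above 0 become 1, everything
    else 0) and pack each row into a hex string, matching an element_type: bit
    mapping for the rank_vectors field."""
    bits = np.packbits(np.asarray(multi_vector) > 0, axis=-1)
    return [row.tobytes().hex() for row in bits]

def to_avg_vector(multi_vector):
    """Collapse the ~1024 ColPali patch vectors into a single 128-dim vector by
    averaging, then L2-normalize it so dot product can be used as the
    similarity for a dense_vector field."""
    avg = np.asarray(multi_vector).mean(axis=0)
    return (avg / np.linalg.norm(avg)).tolist()
```

The hex strings would go into the rank_vectors field mapped with element_type set to bit, while the averaged, normalized vector would go into a regular dense_vector field.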
We can do this with Logstash or ingest pipelines, but a simple Python function like the one sketched above also works. We also normalize the vector so that we can use the dot product similarity. After transforming all of our ColPali vectors to average vectors, we can index them into our dense_vector field: We have to consider that this will increase total disk usage since we are saving more information along with our late interaction vectors. Additionally, we will use extra RAM to hold the HNSW graph, allowing us to scale the search over billions of vectors. To reduce the usage of RAM, we can make use of our popular BBQ feature . In turn, we get fast search results over massive data sets that would otherwise not be possible. Now, we simply search with the knn query to find our most relevant documents. The previously best match has unfortunately fallen to rank 3. To fix this problem, we can do a multi-stage retrieval. In our first stage, we are using the knn query to search the best candidates for our query over millions of documents. In the second stage, we are only reranking the top k (here: 10) with the higher fidelity of the ColPali late interaction vectors. Here, we are using the rescore retriever, introduced in 8.18, to rerank our results. After rescoring, we see that our best match is again in first position. Note: In a production application we can use a much higher k than 10, as the max sim function is still comparatively performant. Token pooling Token pooling reduces the sequence length of multi-vector embeddings by pooling redundant information, such as white background patches. This technique decreases the number of embeddings while preserving most of the page's signal. We are clustering semantically similar vectors to achieve fewer vectors overall. Token pooling works by grouping similar token embeddings within a document into clusters using a clustering algorithm. Then, the mean of the vectors in each cluster is calculated to create a single, aggregated representation. This aggregated vector replaces the original tokens in the group, reducing the total number of vectors without significant loss of document signal. The ColPali paper proposes an initial pool factor value of 3 for most datasets, which maintains 97.8% of the original performance while reducing the total number of vectors by 66.7%. Source: https://arxiv.org/pdf/2407.01449 But we need to be careful: The \"Shift\" dataset, which contains very dense, text-heavy documents with little white space, declines rapidly in performance as pool factors increase. To create the pooled vectors, we can use the colpali_engine library: We now have a set of vectors that is about 66.7% smaller. We index it as usual and we are able to search on it with our maxSimDotProduct() function. We still get good search results at the expense of some slight accuracy. Hint: With a higher pool_factor (100-200), you can also have a middle ground between the average vector solution and the one we discussed here. With around 5-10 vectors per document, it becomes viable to index them in a nested field to leverage the HNSW index. Cross-encoder vs. late-interaction vs. bi-encoder With what we have learned so far, where does this place late interaction models such as ColPali or ColBERT when we compare them to other AI retrieval techniques? 
While the max sim function is cheaper compared to cross-encoders, it still requires many more comparisons and computation than vector search with bi-encoders, where we are just comparing two vectors for each query-document pair. Because of this, our recommendation for late-interaction models is to generally only use them for reranking the top k search results. We also capture this in the name of the field type: rank_vectors. But what about the cross encoder? Are late interaction models better because they are cheaper to execute at query time? As is often the case, the answer is: it depends. Cross encoders generally produce higher quality results, but they require a lot of compute because the query-document pairs need a full pass through the transformer model. They also benefit from the fact that they do not require any indexing of vectors and can operate in a stateless manner. This results in: Less disk space used A simpler system Higher quality of search results Higher latency and therefore not being able to rerank as deeply On the other hand, late interaction models can offload some of this computation to index time, making the query cheaper. The price we pay is having to index the vectors, which makes our indexing pipelines more complex and also requires more disk space to save these vectors. Specifically in the case of ColPali, the analysis of information from images is very expensive as they contain a lot of data. In this case, the tradeoff shifts in favor of using a late interaction model such as ColPali because evaluating this information at query time would be too resource intensive/slow. For a late interaction model such as ColBERT, which works on text data like most cross-encoders (e.g., elastic-rerank-v1), the decision might lean more toward using the cross-encoder to benefit from the disk savings and simplicity. We encourage you to weigh those pros and cons for your use case and experiment with the different tools that Elasticsearch provides you to build the best search applications. Conclusion In this blog, we explored various techniques to optimize late interaction models like ColPali for large-scale vector search in Elasticsearch. While late interaction models provide a strong balance between retrieval efficiency and ranking quality, they also introduce challenges related to storage and computation. To address these challenges, we looked at: Bit vectors to significantly reduce disk space while leveraging efficient similarity computations like hamming distance or asymmetric max similarity. Average vectors to compress multiple embeddings into a single dense representation, enabling efficient retrieval with HNSW indexing. Token pooling to intelligently merge redundant embeddings while maintaining semantic integrity, reducing computational overhead at query time. Elasticsearch provides a powerful toolkit to customize and optimize search applications based on your needs. Whether you prioritize retrieval speed, ranking quality, or storage efficiency, these tools and techniques allow you to balance performance and quality as you need for your real-world applications. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. 
PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to Problem Bit vectors Average vectors Token pooling Coss-encoder vs. late-interaction vs. bi-encoder Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Scaling late interaction models in Elasticsearch - part 2 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/scale-late-interaction-model-colpali","meta_description":" Explore techniques to scale late interaction models like ColPali for large-scale vector search in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. Inside Elastic AJ By: Andy James On November 8, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out: Part 1: Building our proof of concept Part 2: Building a knowledge library Part 3: Designing a chat interface for chatbots... for humans Part 4: Tuning RAG search for relevance Launch blog: GenAI for customer support - Explore the Elastic Support Assistant What I find compelling about observability is that it has utility in both good times and bad. 
When everything is going great, your observability is what provides you the metrics to show off the impact your work is having. When your system is a rough day, your observability is what will help you find the root cause and stabilize things as quickly as possible. It's how we noticed a bug causing us to load the same data over and over again from our server. We saw in the APM data that the throughput for one of our endpoints was well over 100 transactions per minute, which was unreasonably large for the size of our user base. We could confirm the fix when we saw the throughput reduce to a much more reasonable 1 TPM. It's also how I know we served up our 100th chat completion 21 hours post launch (you love to see it folks). This post will discuss the observability needs for a successful launch, and then some unique observability considerations for a chatbot use case such as our Support Assistant. Critical GenAI observability components You're going to want three main pieces in place. A status dashboard, alerting, and a milestones dashboard, in that order. We’ll dig more into what that means, and what I put into place for the Support Assistant launch as it pertains to each. There is one requirement that all three of those components requires; data. So before we can dive into how to crunch the data for actionable insights, let’s take a look at how we collect that data for the Support Assistant (and generally for the Support Portal). Observability data collection We have an Elastic Cloud cluster dedicated to monitoring purposes. This is where all of the observability data I am going to discuss gets stored and analyzed. It is separate from our production and staging data Elastic clusters which are where we manage the application data (e.g. knowledge articles, crawled documentation). We run Elastic’s Node APM client within our Node application that serves the API, and have Filebeat running to capture logs. We have a wrapper function for our console.log and console.error calls that appends APM trace information at the end of each message, which allows Elastic APM to correlate logs data to transaction data. Additional details about this feature are available on the Logs page for APM . The key piece of information you'll find there is that apm.currentTraceIds exists to provide exactly what you need. From there it's nothing complicated, just a pinch of string formatting. Copy ours. A small gift; from my team, to yours. We use the Elastic Synthetics Monitoring feature to check on the liveliness of our application and critical upstream services (e.g. Salesforce, our data clusters). At the moment we use the HTTP monitor type, but we're looking at how we might want to use Journey monitors in the future. The beauty of the basic HTTP monitor is that all you need to configure is a URL to ping, how often you want to ping it, and from where. When choosing which locations to check from, we know for the app itself we want to check from locations around the world, and because there are some calls directly from our user's browsers to the data clusters, we also check that from all of the available locations. However, for our Salesforce dependency, we know we only connect to that from our servers, so we only monitor that from locations where the Support Portal app is being hosted. We also ship Stack Monitoring data from the application data Elastic clusters, and have the Azure OpenAI integration shipping logs and metrics from that service via an Elastic Agent running on a GCP Virtual Machine. 
Setting up Elastic APM Getting started with Elastic APM is really easy. Let's go over the APM configuration for our Support Portal's API service as an example. Let's unpack a few things going on there. The first is that we've allowed ourselves to inject a mock APM instance in testing scenarios, and also added a layer of protection to prevent the start function from being called more than once. Next, you'll see that we are using environment variables to power most of our configuration options. APM will automatically read the ELASTIC_APM_ENVIRONMENT environment variable to fill in the environment setting, ELASTIC_APM_SERVER_URL for the serverUrl setting, and ELASTIC_APM_SECRET_TOKEN for the secretToken setting. You can read the full list of configuration options here , which includes the names of the environment variables that can be used to configure many of the options. I want to emphasize the value of setting environment . It allows me to easily distinguish traffic from different environments. Even if you aren't running a staging environment (which you really should), collecting APM when you're developing locally can also come in handy, and you will want to be able to look at production and development data in isolation most of the time. Being able to filter by service.environment is convenient. If you're running in Elastic Cloud, you can follow these steps to get the values for serverUrl and secretToken to use with your configuration. Visit your Kibana instance, and then navigate to the Integrations page. Find the APM integration. Scroll past the APM Server section to find the APM Agents section and you'll see a Configure the agent subsection that includes the connection info. Status dashboard Data is only as useful as your ability to extract meaning from it, and that’s where dashboarding comes in. With Elastic Cloud, it’s default to be running Kibana along with Elasticsearch, so we’ve already got a great visualization layer available within our stack. So what do we want to see? Usage, latency, errors, and capacity are pretty common categories of data, but even within those, your specific needs will dictate what specific visualizations you want to make for your dashboard. Let’s go over the status dashboard I made for the Support Assistant launch to use as an example. You might be surprised to notice the prime real estate in the upper-left being host to text. Kibana has a markdown visualization you can use to add instructions, or in my case a bunch of convenient links to other places where we might want to follow up on something seen in the dashboard. The rest of the top row displays some summary stats like the total number of chat completions, unique users, and errors for the time range of the dashboard. The next set of visualizations are time series charts to examine latency, and usage over time. For our Support Assistant use case, we are specifically looking at latency of our RAG searches and our chat completions. For usage, I’m interested in the number of chat completions, unique users, returning users, and a comparison of assistant users to all Support Portal users. Those last two I've left below the fold of the image because they include details we decided not to share. I like to save a default time range with dashboards. It anchors other users to a default view that should be generally useful to see when they first load the dashboard. I pinned the start timestamp to approximately when the release went live, and the end is pinned to now . 
During the launch window, it's great to see the entire life of the feature. At some point it will probably make more sense to update that stored time to be a recent window like “last 30 days.” Bonus challenge: Can you tell when we upgraded our model from GPT-4 to the more powerful GPT-4o? I have additional areas of the status dashboard focused on users who are using it the most or experiencing the most errors, and then also some time series views of HTTP status and errors over time. Your status dashboard will be different, and it should be. This type of dashboard also has the tendency to evolve over time (mine did noticeably during the time I was drafting this post). Its purpose is to be the answer key to the series of questions that are most important to be able to answer about the feature or system you're observing. You will discover new questions that are important, and that might add some new visualizations to the dashboard. Sometimes a question becomes less relevant or you come to understand it was less meaningful than you expected, and so you could remove or rearrange it below other items. Before we move on from this dashboard, let's take a couple of detours to take a look at an APM trace for our chat completions, and then how I used ES|QL to create that returning users visualization. APM traces If you've never seen an Elastic APM trace, there is probably a ton of really compelling things going on in that image. The header shows request URL, response status, duration, which browser was used. Then when we get into the waterfall chart we can see the breakdown of which services were involved and some custom spans. APM understands that this trace traveled through our frontend server (green spans), and our API service (blue spans). Custom spans are a great way to monitor performance of specific tasks. In this case where we are streaming chat completions, I want to know how long until the first tokens of generation arrive, and also how long the entire completion process takes. The average duration of these spans is charted on the dashboard. Here's a trimmed down snippet of the chat completion endpoint that focuses on starting and ending the custom spans. Using ES|QL to visualize returning users When I first started trying to visualize repeat users, my original goal was to end up with something like a stacked bar chart per day where the total size of the bar should be the number of unique users that day, and the breakdown would be net new users vs. returning users. The challenge here is that to compute this requires overlapping windows, and that's not compatible with how histograms work in Kibana visualizations. A colleague mentioned that ES|QL might have some tools to help. While I didn't end up with the visualization I originally described, I was able to use it to help me process a dataset where I could generate the unique combinations of user email and request date, which then enabled counting how many unique days each user had visited. From there, I could visualize the distribution of quantity of visits. Here's the ES|QL query that powers my chart. Alerting With the status dashboard in place, you have a way to quickly understand the state of the system both at the present and over time. 
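As an aside on that returning-users chart: the exact query is not reproduced in the text, but an ES|QL query of roughly the shape described above, run through the Python client, might look like the following; the index pattern, service name, and field names are assumptions for illustration only.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: the monitoring cluster

# Hypothetical sketch: bucket each user's chat requests by day, count how many
# distinct days each user visited, then count users per number of visit days.
resp = es.esql.query(query="""
FROM traces-apm*
| WHERE service.name == "support-portal" AND url.path == "/api/chat"
| EVAL day = DATE_TRUNC(1 day, @timestamp)
| STATS visit_days = COUNT_DISTINCT(day) BY user.email
| STATS users = COUNT(*) BY visit_days
| SORT visit_days
""")
print(resp["values"])
```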
The metrics being displayed in the visualizations are inherently metrics that you care about, but you can’t nor would you want to be glued to your dashboard all day (well maybe the excitement of the first couple days after launch left me glued to my dashboard, but it’s definitely not a sustainable strategy). So let’s talk about how alerting can untether us from the dashboard, while letting us sleep well, knowing that if something starts going wrong, we’ll get notified instead of finding out the next time you chase the sweet sensation of staring at that beautiful dahsboard. A very convenient thing about Elastic Observability is that the details you need to know to make the alerting rules, you already figured out in making the visualizations for the dashboard. Any filters you were applying, and the specific fields from the specific indices that you visualized are the main configuration details you need to configure alerting rules. You’re essentially taking that metric defined by the visualization and adding a threshold to decide when to trigger the alert. How should I pick a threshold? For some alerts it might be about trying to achieve a certain quality of service that is defined by the team. In a lot of cases, you want to use the visualizations to establish some sort of expected baseline, so that you can then choose a threshold based on how much of a deviation from that observed baseline you’re willing to tolerate. This is a good time to mention that you might be planning to hold off integrating APM until the end of the development process, but I would encourage you to do it sooner. For starters, it’s not a big lift (as I showed you above). The big bonus for doing it early is that during development you are capturing APM information. It might help you debug something during development by capturing details you can investigate during an expected error, and then it’s also capturing sample data. That can be useful for both verifying your visualizations (for metrics involving counts), and then also for establishing baseline values for metric categories like latency. How should I get alerted? That will really depend on the urgency of the alert. For that matter, there are some alerts where you may want to configure multiple alerts at different thresholds. For example, at a warning level, you might just send an email, but then there could also be a critical level that sends a Slack message tagging your team. An example of a non-critical alert that is best as email-only are the ones I configured to go along with the milestones dashboard we’ll talk about next. It’s a good idea to test the formatting of your alert outputs by temporarily configuring it such that it will trigger right away. A best practice for determining which alerts notify in passive ways (e.g. an email) vs. demanding immediate attention (e.g. getting paged) is to ask yourself \"is there a defined set of steps to take in response to this alert to resolve it?\" If there is not a well-defined path to take to investigate or resolve an alert, then paging someone isn't going to add much value, and instead just add noise. It can be hard to stick to this, and if you've just realized that you've got a bunch of unactionable alerts being noisy, maybe see if you can think of a way to surface those in a less demanding way. What you don't want, is to accidentally train your team to ignore alerts because they are so often inactionable. 
Milestones dashboard The milestones dashboard arguably does not need to be separate from the status dashboard, and could be arranged as an area of the status dashboard, but I like having the separate space focused on highlighting achievements. The two metrics I was most interested in highlighting with milestones were unique users, and chat completions. There is a horizontal bullet visualization that I found suitable for displaying a gauge with a set range and an optional goal. I decided that time windows for all time, last 7 days, and last 30 days were standard but interesting to look at and so I have two columns side by side where each row is a different window of time. The bottom row has a bar chart aggregating by day, creating a nice way to look for growth over time. Special considerations for the Support Assistant & GenAI Observability We’ve discussed the basics of observing any new feature or system you’re launching, but every project is going to have some unique observability opportunities, so the rest of this blog post will be discussing some of the ones that came up for our team while working on the Support Assistant. If you’re also building a chatbot experience, some of these might apply directly for your use case, but even if your project is very different, maybe these ideas inspire some additional observability options and strategies. NOTE: Most of the code examples I am about to share come from the chat completion request handler in our API layer where we send a request to the LLM and stream the response back to the client. I am going to show you that same handler a few different times, but editted down to only include the lines relevant to the functionality being described at that time. First generation timeouts You may remember from the UX entry in this series that we chose to use streaming responses from the LLM in order to avoid having to wait for the LLM generation to finish before being able to show anything to the user. The other thing we did to try to give our assistant a more responsive experience was to enforce a 10 second timeout on getting the first chunk of generated text back. Being able to see trends in this type of error is critical for us to be able to know if our service is reliable, or overloaded. We've noticed with the launch that these timeouts are more likely to happen when there are more simultaneous users. Sometimes this even leads to retries overloading the provisioned capacity on our LLM service, leading to further errors displayed to the user. The APM agent runs on our server, and the timeout for first generation was configured in the client code that runs in the user’s browser, so I started experimenting with listening for events on the server to detect when the client had sent the abort signal so that I could send an error to APM with captureError , but what I found was that the server never became aware that the client aborted the request. I listened on the request, I listened on the socket, and then I did some Internet searches, and reached the conclusion that at least for our application stack there was no practical or built-in way for our server to recognize the client had timed out. To work around this, I moved the timeout and AbortController from the client code to be in our API layer that was talking directly to the LLM. Now when we hit the timeout, we’re on the server where we can send the error to APM and then close the connection early from the server side, which propagates just fine down to the client. 
Here's a view of our request handler that shows just the parts related to first generation timeout: Unfortunately, just closing the connection from the server created an unexpected behavior with the client. Without sending back a proper error signal or any generated response text, the client code was not running the parts of the code where we exited the loading state. To smooth this out, I updated the server side timeout to add an extra step before calling end() on the response. The streaming responses work by sending a series of events related to the generation down to the client. There are 4 flavors; Started, Generation, End, and Error. By adding an extra step to send an Error event before closing the connection, the client code was able to update the UI state to reflect an error. So let's see the handler again with that included: The first generation timeout error is a very generic error, and always logs the same message. For the other types of errors, there are many different failures that could result in reaching the error handler. For this, we pass in a parameterized message object , so that APM will group all of the errors captured by the same error handling together, despite the error message varying depending on the actual error that occurred. We have parameters for the error message, error code, and also which LLM we used. Declining requests The goal of the Support Assistant is to be helpful, but there are two broad categories of input that we want to avoid engaging with. The first is questions unrelated to getting technical support for Elastic products. We think it’s pretty fair to insist that since we pay the bills for the LLM service, that we don’t want folks using the Support Assistant to draft emails or write song lyrics. The second broad category we avoid are topics we know it cannot answer well. The prime example of this is billing questions. We know the Support Assistant does not have access to the data needed to help answer billing questions accurately, and certainly for a topic like billing, an inaccurate answer is worse than none at all (and the Sales team, finance team, and lawyers all breathed a sigh of relief 😉). Our approach was to add instructions to the prompt before the user's input as opposed to using a separate call to a 3rd party service. As our hardening needs evolve we may consider adding a service, or at least splitting the task of deciding whether or not to attempt to respond into a separate LLM request dedicated to making that determination. Standardized response I’m not going to share a lot of details about our prompt hardening methods and what rules we put in the prompt because this blog is about observability, and I also feel that the state of prompt engineering is not at a place where you can share your prompt without helping a malicious user get around it. That said, I do want to talk about something I noticed while I was developing our prompting strategy to avoid the two categories mentioned above. I was having some success with getting it to politely decline to answer certain questions, but it wasn’t very consistent with how it replied. And the quality of the response varied. To help with this, I started including a standardized response to use for declining requests as part of the prompt. With a predefined response in hand, the chatbot reliably used the standard response when declining a request. The predefined response is stored as its own variable that is then used when building the payload to send to the LLM. 
Let's take a look at why that comes in handy. Monitoring declined requests Bringing this back to observability, by having a predefined response for declining requests, it created an opportunity for me to examine the response coming from the LLM, and compare it to the variable containing the standardized decline message. When I see a match, I use captureError to keep a record of it. It’s important for us to keep an eye on declined requests because we want to be sure that these rejections are happening for the right reasons. A spike in rejections could indicate that a user or group of users is trying to get around our restrictions to keep the chat on the topic of Elastic product technical support. The strategy shown above collects all the tokens in a string[] and then joins then when the response is complete to make the comparison. I heard a great optimization suggestion from a colleague. Instead of collecting the tokens during streaming, just track an index into the DECLINED_REQUEST_MESSAGE , and then as each token comes in, see if it matches the next expected characters of the message. If so, keep tracking, but if there ever isn't a match, you know it's not a declined request. That way you don't have to consume extra memory buffering the whole response. We aren't seeing performance or memory issues, so I didn't update my strategy, but it was too clever of an idea to not mention here. Mitigating abuse Closely related to the previous section on declining requests, we know that these chatbot systems backed by LLMs can be a target for folks who want free access to an LLM service. Because you have to be logged in and have a technical support subscription (included with Elastic Cloud) to get access to the Support Assistant, this was a bit less of a concern for our particular launch, but we wanted to be prepared just in case, and maybe your use case doesn’t have the same gating upfront. Our two prongs of abuse mitigation are a reduced rate limit for the chat completion endpoint, and a feature flag system with the flexibility to allow us to configure flags that block access to a particular feature for a given user or organization. Rate limit Our application already had a general rate limit across all of our endpoints, however that rate limit is meant to be a very generous rate that should really only get triggered if something was actually going wrong and causing a significant amount of spam traffic. For a rate limit to be meaningful as applied to the Support Assistant chat completion endpoint, it was going to have to be a much lower limit. It was also important to leave the limit generous enough that we wouldn’t be penalizing enthusiastic users either. In addition to usage data from beta test we did with customers, we’ve had an internally-facing version of the Support Assistant available to our Support Engineers to help streamline their workflows in answering cases. This gave us something to anchor our usage expectation to. I looked at the previous week's data, and saw that our heaviest internal users had sent 10-20 chat messages per day on average with the top user sending over 70 in a single day. I also had latency metrics to tell me that the average completion time was 20 seconds. Without opening multiple windows or tabs, a single user asking rapid fire questions one after another, would not be able to send more than about 3 chat messages in a minute. Our app sessions expire after an hour, so I decided that it would be best to align our rate limit window with that hour long session window. 
That means the theoretical max chats for a single user in an hour where they use a single tab is 180 chats in an hour. The team agreed on imposing a limit of 20 chat completions in a one hour window. This is as many chats for our customer users in an hour as our heaviest internal users send in a whole day, while limiting any malicious users to ~11% of that theoretical max based on latency for a full completion. I then configured an alert looking for HTTP 429 responses on the chat completion endpoint, and there is also a table in the status dashboard listing users that triggered the limit, how many times, and when the most recent example was. I am very happy to report that we have not had anyone hit the limit in these first couple of weeks since launch. The next section discusses an option we gave ourselves for how to react if we did see certain individuals that seemed to be trying to abuse the system. Ban flags In rolling out the Support Assistant, we did a limited beta test with some hand-selected customers. To enable the Support Assistant for a subset of users during development, we set up a feature flag system to enable features. As we got closer to the launch we realized that our feature flags needed a couple of upgrades. The first was that we wanted to have the concept of features that were on by default (i.e. already fully launched), and the second was to allow flags to be configured such that they blocked access to a feature. The driving factor behind this one was that we heard some customer organizations might be interested in blocking their employees from engaging with the Support Assistant, but we also recognized that it could also come in handy if we ever reached a conclusion that some particular user was consistently not playing nice, we could cut off the feature while an appropriate Elastic representative tried to reach out and have a conversation. Context creates large payloads This last section is part special consideration for a chatbot, and part observability success story. In studying our status dashboard we started seeing HTTP 413 status codes coming back for a small, but non-negligible amount of traffic. That meant we were sending payloads from the browser that were above the configured size that our server would accept. Then one of our developers stumbled upon a reliable chat input that reproduced it so that we could confirm that the issue was the amount of context generated from our RAG search, combined with the user’s input was exceeding the default limits. We increased the size of the payloads accepted by the chat completion endpoint, and ever since we released the fix, we haven’t seen any more transactions with 413 response status. It’s worth noting that our fix to expand the accepted payload size is really more of a short-term bandage than a long-term solution. The way we plan to solve this problem in a more holistic way is to refactor the way we orchestrate our RAG searches and chat completions such that instead of sending the full content of the RAG results back to the client to include in the completion payload, instead we’d rather only return limited metadata like ID and title for the RAG results to the client, and then include that in the request with the user’s input to the completion endpoint. The completion endpoint would fetch the content of the search results by ID, and combine it with our prompt, and the user’s input to make the request to the LLM service. Here's a snippet where we configure the Express route for the chat completion endpoint. 
It touches on the rate limit, flags, and the boosted payload size: Conclusion Ideally, observability is more than one thing. It's multi-faceted to provide multiple angles and viewpoints for creating a more complete understanding. It can and should evolve over time to fill gaps or bring deeper understanding. What I hope you can take away from this blog is a framework for how to get started with observability for your application or feature, how the Elastic stack provides a full platform for achieving those monitoring goals, and a discussion of how this fits into the Support Assistant use case. Engage with this advice, and Bob's your mother's brother, you've got a successful launch! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Jump to Critical GenAI observability components Observability data collection Setting up Elastic APM Status dashboard APM traces Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"GenAI for customer support — Part 5: Observability - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/genai-customer-support-observability","meta_description":"Explore how we're using GenAI in customer support and for GenAI observability through a real-world use case. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. 
ES|QL Inside Elastic CL By: Costin Leau On April 15, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It's my pleasure to announce LOOKUP JOIN —a new ES|QL command available in tech preview in Elasticsearch 8.18, designed to perform left joins for data enrichment. With ES|QL, one can query and combine documents from one index with documents from a second index based on a criterion defining how the documents should be paired natively in Elasticsearch. This approach enhances data management by dynamically correlating documents at query time across multiple indices, thus removing duplication. For instance, the following query connects employee data from one index with their corresponding department information from another index using a shared field key name: As the name indicates, LOOKUP JOIN performs a complementing, or left (outer) , join at query time between any regular index (employees) - the left side and any lookup index (departments) - the right side. All the rows from the left side are returned along with their corresponding equivalent (if any) from the right side. The lookup side's index mode must be set to lookup. This means that the underlying index can only have one shard. This current solution addresses the cardinality challenges of one side of the join and the issues that distributed systems like Elasticsearch encounter, which are outlined in the next section. Apart from using the lookup index mode, there are no limitations on the source data or the commands used. Additionally, no data preparation is needed. The join can be performed before or after the filtering: Be mixed with aggregations: Or be combined with another join: Executing a Lookup Join Let's illustrate what happens during runtime by looking at a basic query that doesn't include any other commands, such as filter. This will allow us to concentrate on the execution aspect as opposed to the planning phase. The logical plan, a tree structure representing the data flow and necessary transformations, is the result of translating the query above. This logical plan centers on the semantics of the query. To ensure efficient scaling, standard Elasticsearch indices are divided into multiple shards spread across the cluster. In a join scenario, sharding both the left (L) and right (R) sides would result in L*R partitions. To minimize the need for data movement, lookup joins require the right side (which provides the enriching data) to have a single shard, similar to an enrich index, with the replication dictated by the index settings (default is 1). This decreases the amount of nodes needed to execute the join, thereby reducing the problem space. As a result, L*R becomes L*1, which equals L. Thus, the coordinator needs to dispatch the plan only to the left side data nodes, with the hash join performed locally using the lookup/right index to “build” the underlying hash map while the left side is used for “probing” for matching keys in batches. The resulting distributed physical plan, which focused on the distributed execution of the query, looks as follows: The plan consists of two main parts or sub-plans: the physical plan that gets executed on the coordinator (generally speaking, the node receiving/responsible for the query completion) and the plan fragment, which is executed on the data nodes (the nodes holding the data). 
Since the coordinator does not have the data, it sends a plan fragment to the relevant data nodes for local execution. The results are then sent back to the coordinator node, which computes the final results. The communication between the two entities is represented in the plan through the Exchange block. The coordinator doesn't have to do much work for this query because most of the processing happens on the data nodes. The fragment encapsulates the logical subplan, enabling optimization based on the specific characteristics of each shard's data (e.g., missing fields, local minimum and maximum values). This local replanning also helps manage differences in node code that might exist between nodes or between a node and the coordinator, for example, during cluster upgrades. The local physical plan looks something like this: The plan is designed to reduce I/O by using efficient data extraction methods. The two nodes at the bottom of the tree act as roots , supplying the nodes above. Each one outputs references to the underlying Elasticsearch documents ( doc_id ). This is done intentionally to delay the loading of columns (fields) or documents for as long as possible through the designated extraction nodes (in yellow). In this particular plan, loading takes place right before the hash join on each of the joining sides and prior to the final project just before the data exits the node using only the join resulting data. Future work Qualifiers At the moment, the lookup join syntax requires the key to have the same name in both tables (similar to JOIN USING in some SQL dialects). This can be addressed through RENAME or EVAL : It’s an unnecessary inconvenience that we’re working on removing in the near future by introducing (source) qualifiers. The previous query could be rewritten as (syntax wip): Notice that the join key was replaced by an equality comparison, where each side is using a field name qualifier, which can be implicit (departments) or explicit (e). More join types and performance We are currently working on enhancing the lookup join algorithm to better exploit the data topology with a focus on specializations that leverage the underlying search structures and statistics in Lucene for data skipping. In the long term, we plan to support additional join types, such as inner join (or intersection, which combines documents that have the same field on both sides) and full outer join (or union, which combines the documents from both sides even when there is no common key). Feedback The path to native JOIN support in Elasticsearch has been a long one, dating back to version 0.90. Early attempts included nested and ` _parent ` field types, with the latter eventually being rewritten (in 2.0), deprecated (in 5.0), and replaced by the join field (in 6.0). More recent features like Transforms (7.3) and the Enrich ingest pipeline (7.5) also aimed to address join-like use cases. In the wider Elasticsearch ecosystem, Logstash and Apache Spark (via the ES-Hadoop connector) have provided alternative solutions. Elasticsearch SQL , which debuted in 6.3.0, is also worth mentioning due to the grammar similarity: while it supports a wide range of SQL functionality, native JOIN support has remained elusive. All these solutions work and continue to be supported. However, we think ES|QL, due to its query language and execution engine, significantly simplifies the user experience! 
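To make the queries discussed above concrete before wrapping up, here is an illustrative sketch using the Python client; the index, field, and key names (employees, departments, department_id, dept_id) are invented for the example rather than taken from the post's own snippets.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: a local 8.18+ cluster

# The lookup (right-hand) side must be created with index.mode: lookup, which
# constrains it to a single shard.
es.indices.create(index="departments", settings={"index": {"mode": "lookup"}})

# Basic left join from the regular index to the lookup index, followed by an
# aggregation over the enriched rows.
es.esql.query(query="""
FROM employees
| LOOKUP JOIN departments ON department_id
| STATS headcount = COUNT(*) BY department_name
| SORT headcount DESC
""")

# Until qualifiers are available, a differing key name on the left side can be
# aligned with RENAME (or EVAL) before the join.
es.esql.query(query="""
FROM employees
| RENAME dept_id AS department_id
| LOOKUP JOIN departments ON department_id
""")
```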
ESQL Lookup join is in tech preview, freely available in Elasticsearch 8.18 and Elastic Cloud—try it out and let us know how it works for you! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more ES|QL December 31, 2024 Improving the ES|QL editor experience in Kibana With the new ES|QL language becoming GA, a new editor experience has been developed in Kibana to help users write faster and better queries. Features like live validation, improved autocomplete and quick fixes will streamline the ES|QL experience. ML By: Marco Liberati Jump to Executing a Lookup Join Future work Qualifiers More join types and performance Feedback Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Native joins available in Elasticsearch 8.18 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-esql-lookup-join","meta_description":"Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. Inside Elastic SB PK By: Shay Banon and Philipp Krenn On February 12, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch just turned 15-years-old. It all started back in February 2010 with the announcement blog post (featuring the iconic “You Know, for Search” tagline), first public commit , and the first release , which happened to be 0.4.0. 
Let’s take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. GET _cat/stats Since its launch, Elasticsearch has been downloaded an average of 3 times per second , totaling over 1.45 billion downloads. The GitHub stats are equally impressive: More than 83,000 commits from 2,400 unique authors , 38,000 issues, 25,000 forks, and 71,500 stars. And there is no sign of slowing down . All of this is on top of countless Apache Lucene contributions . We’ll get into those for the 25 year anniversary of Lucene, which is also this year. In the meantime, you can check out the 20 year anniversary page to celebrate one of the top Apache projects. A search (hi)story There are too many highlights to list them all, but here are 15 releases and features from the last 15 years that got Elasticsearch to where it is today: Elasticsearch, the company (2012): The open source project became an open source company, setting the stage for its growth. ELK Stack (2013): Elasticsearch joined forces with Logstash and Kibana to form the ELK Stack, which is now synonymous with logging and analytics. Version 1 (2014): The first stable release introduced key features like snapshot/restore, aggregations, circuit breakers, and the _cat API. Shield and Found (2015): Shield brought security to Elasticsearch clusters in the form of a (paid) plugin. And the acquisition of found.no brought Elasticsearch to the cloud, evolving into what is now Elastic Cloud. As an anecdote, nobody could find Found — SEO can be hard for some keywords. Version 2 (2015): Introduced pipelined aggregations, security hardening with the Java Security Manager, and performance and resilience improvements. Version 5 and the Elastic Stack (2016): Skipping two major versions to unify the version numbers of the ELK Stack and turning it into the Elastic Stack after adding Beats. This version also introduced ingest nodes and the scripting language Painless. Version 6 (2017): Brought zero-downtime upgrades, index sorting, and the removal of types to simplify data modeling. Version 7 (2019): Changed the cluster coordination to the more scalable and resilient Zen2, single-shard default settings, built-in JDK, and adaptive replica selection. Free security (2019): With the 6.8 and 7.1 releases, core security became free to help everyone secure their cluster. ILM, data tiers, and searchable snapshots (2020): Made time-series data more manageable and cost-effective with Index Lifecycle Management (ILM), tiered storage, and searchable snapshots. Version 8 (2022): Introduced native dense vector search with HNSW and enabled security by default. ELSER (2023): Launched Elastic Learned Sparse EncodeR model, bringing sparse vector search for better semantic relevance. Open source again (2024): Added AGPL as a licensing option to bring back open source Elasticsearch. Start Local (2024): Made it easier than ever to run Elasticsearch and Kibana: curl -fsSL https://elastic.co/start-local | sh LogsDB (2024): A new specialized index mode that reduces log storage by up to 65%. The future of search is bright Thanks to the rise of AI capabilities, search is more relevant and interesting than ever. So what is next for Elasticsearch? There’s way too much to name, so we’ll stick to three areas and the challenges they address. Serverless No shards, nodes, or versions. 
Elasticsearch Serverless — which is GA on AWS and just entered technical preview on Azure — takes care of the operational issues you might have experienced in the past: 15 years in, and someone is still setting number_of_shards: 100 for no reason. 15 years, and we’re still debating refresh_interval: 1s vs 30s like it’s a life-or-death decision. 15 years of major versions, minor heart attacks, and the thrill of migrating to the latest version. You can try out Elasticsearch Serverless today. ES|QL “Cheers to 15 years of Elasticsearch — where the Query DSL is still the most complex part of your day.” But it doesn’t have to be. The new Elasticsearch Piped Query Language (ES|QL) brings a much simpler syntax and a significant investment into a new compute engine with performance in mind. While we’re building out more features, you can already use ES|QL today. Don’t worry; the Query DSL will understand. AI everywhere 15 years of query tuning, and we’re still just throwing boost: 10 at the problem. 15 years of making your logs searchable while you still have no idea what’s happening in production. Still the best at finding that one log line… if you remember how you indexed it. AI is redefining what’s possible — from turning raw logs into actionable insights with the AI Assistant for observability and security , to more relevant search with semantic understanding and intelligent re-ranking .. This is only the beginning. More AI-powered features are on the horizon — bringing smarter search, enhanced observability, and stronger security. The future of Elasticsearch isn’t just about finding data; it’s about understanding it. Stay tuned — the best is yet to come. Thanks to all of you Thanks to all contributors, users, and customers over the last 15 years to make Elasticsearch what it is today. We couldn’t have done it without you and are grateful for every query you send to Elasticsearch. Here’s to the next 15 years. Enjoy! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Jump to GET _cat/stats A search (hi)story The future of search is bright Serverless ES|QL Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch: 15 years of indexing it all, finding what matters - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-history-15-years","meta_description":"Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. Inside Elastic US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more On January 13, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The frozen data tier can achieve both low cost and good performance by leveraging Elastic's Searchable Snapshots - which offer a compelling solution for managing vast amounts of data while maintaining the performant searchability of data within a budget. In this article, we delve into a benchmark of Elastic's hot and frozen data tiers by running sample queries on 105 terabytes of logs spanning more than 90 days. These queries replicate common tasks within Kibana's Discover, including search with highlighting, total hits, date histogram aggregation, and terms aggregation; that all happen behind the scenes when a user triggers a simple search. The results reveal that Elastic's frozen data tier is quick and delivers latency comparable to its hot tier, with only the first query to the object store being slower - subsequent queries are fast. We replicated the way a typical user would interact with a hot-frozen deployment through Kibana's Discover - its main interface for interacting with indexed documents. When a user issues a search using Discover's search bar three tasks are executed in parallel: a search and highlight operation on 500 docs that doesn't track the total amount of hits (referred as discover_search tasks on the results) a search that tracks the total hits ( discover_search_total in the results) a date histogram aggregation to construct the bar chart (referred as discover_date_histogram ) and also a terms aggregation (referred as discover_terms_agg ) when/if the user clicks the left side bar. Data tiers in Elastic Some types of data decrease in value over time. It's natural to think about application logs where the most recent records are usually the ones that need to be queried more frequently and also need the fastest possible response time. 
But there are several other examples of such data like medical records (detailed patient histories, diagnoses and physician notes); legal documents (contracts, court rulings, case files, etc.) and bank records (transaction records including descriptions of purchases and merchant names)-just to cite three. All contain unstructured or semi-structured text that requires efficient search capabilities to extract relevant information. As these records age, their immediate relevance may diminish, but they still hold significant value for historical analysis, compliance, and reference purposes. Elastic's data tiers — Hot, Warm, Cold, and Frozen– provide the ideal balance of speed and cost, ensuring you maximize the value of these types of data as they age without sacrificing usability. Through both Kibana and Elasticsearch's search API the use of the underlying data tiers is always automatic and transparent–users don't need to issue search queries in a different way to retrieve data from any specific tier (no need to manually restore the data, or \"rehydrate\"). In this blog we keep it simple by using solely the Hot and Frozen tiers, in what is commonly called a hot-frozen scenario. How the frozen tier works In a hot-frozen scenario, data begins its journey in the hot tier, where it is actively ingested and queried. The hot tier is optimized for high-speed read and write operations, making it ideal for handling the most recent and frequently accessed data. As data ages and becomes less frequently accessed, it is transitioned to the frozen tier to optimize storage costs and resource utilization. The transition from the hot tier to the frozen tier involves converting the data into searchable snapshots. Searchable snapshots leverage the snapshot mechanism used for backups, allowing the data to be stored in a cost-effective manner while still being searchable. This eliminates the need for replica shards, significantly reducing the local storage requirements. Once the data is in the frozen tier, it is managed by nodes specifically designated for this purpose. These nodes do not need to have enough disk space to store full copies of all indices. Instead, they utilize an on-disk Least Frequently Used (LFU) cache. This cache stores only portions of the index data that are downloaded from the blob store as needed to serve queries. The on-disk cache functions similarly to an operating system's page cache, enhancing access speed to frequently requested parts of the data. When a query is executed in the frozen tier, the process involves several steps to ensure efficient data retrieval and caching: 1. Read requests mapping : At the Lucene level, read requests are mapped to the local cache. This mapping determines whether the requested data is already present in the cache. 2. Cache mishandling : If the required data is not available in the local cache (a cache miss), Elasticsearch handles this by downloading a larger region of the Lucene file from the blob store. Typically, this region is a 16MB chunk, which is a balance between minimizing the number of fetches and optimizing the amount of data transferred. 3. Adding data to cache : The downloaded chunk is then added to the local cache. This process ensures that subsequent read requests for the same region can be served directly from the local cache, significantly improving query performance by reducing the need to repeatedly fetch data from the blob store. 4. 
Cache configuration options : Shared cache size : This setting accepts either a percentage of the total disk space or an absolute byte value. For dedicated frozen tier nodes, the default is 90% of the total disk space. Max headroom : Defines the maximum headroom to maintain. If not explicitly set, it defaults to 100GB for dedicated frozen tier nodes. 5. Eviction policy : The node-level shared cache uses a LFU policy to manage its contents. This policy ensures that frequently accessed data remains in the cache, while less frequently accessed data is evicted to make room for new data. This dynamic management of the cache helps maintain efficient use of disk space and quick access to the most relevant data. 6. Lucene index management : To further optimize resource usage, the Lucene index is opened only on-demand—when there is an active search. This approach allows a large number of indices to be managed on a single frozen tier node without consuming excessive memory. Methodology We ran the tests on a six node cluster in Elastic Cloud hosted on Google Cloud Platform on N2 family nodes: 3 x gcp.es.datahot.n2.68x10x45 - Storage-optimized Elasticsearch instances for hot data. 3 x gcp.es.datafrozen.n2.68x10x90 - Storage-optimized (dense) Elasticsearch instances serving as a cache tier for frozen data. We measured the following spans, which also equate to Terabytes in size, since we indexed one Terabyte per day. We used Rally to run the tests, below is a sample test relative to an uncached search on one day of frozen data ( discover_search_total-1d-frozen-nocache ), iterations refer to the number of times the entire set of operations is repeated, which in this case is 10. Each operation defines a specific task or set of tasks to be performed, and in this example, it is a composite operation. Within this operation, there are multiple requests that specify the actions to be taken, such as clearing the frozen cache by issuing a POST request. The stream within a request indicates a sequence of related actions, such as submitting a search query and then retrieving and deleting the results. Each test would run for 10 times per benchmark run, and we performed 500 benchmark runs across several days, therefore the sample for each task is 5,000. Having a high amount of measurements is essential when we want to ensure statistical significance and reliability of the results. This large sample size helps to smooth out anomalies and provides a more accurate representation of performance, allowing us to draw meaningful conclusions from the data. Results The detailed results are outlined below. The \"tip of the candle\" represents the max (or p100 ) value observed within all the requests for a specific operation, and they are grouped by tier. The green value represents the p99.9 , or the value below what 99.9% of the requests would fall. Due to how Kibana interacts with Elasticsearch–which is via async searches–a more logical way of representing the time is by using horizontal bar charts as below. Since the requests are asynchronous and parallel, they will complete at different times. You don't have to wait for all of them to start seeing query results, and this is how we read the benchmark results. The results are expressed as, for example, 543ms - 2s where 543ms is when we received the first result and 2s when we received the last. 
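For illustration, the composite Discover-style operation described in the methodology (clear the frozen cache, submit the parallel searches asynchronously, then retrieve and delete the results) could be reproduced outside Rally with a short script along these lines. This is a sketch only: the index name, fields, time range, endpoint, and credentials are assumptions.

```python
# Rough sketch of the parallel "Discover-style" tasks using the async search API.
# All names (index, fields, host, credentials) are illustrative.
import time
import requests

ES = "http://localhost:9200"
AUTH = ("elastic", "changeme")
INDEX = "logs-frozen-*"
TIME_RANGE = {"range": {"@timestamp": {"gte": "now-90d", "lte": "now"}}}

def submit_async(body):
    r = requests.post(
        f"{ES}/{INDEX}/_async_search",
        params={"wait_for_completion_timeout": "1s", "keep_on_completion": "true"},
        json=body,
        auth=AUTH,
    )
    r.raise_for_status()
    return r.json()["id"]

# Drop the frozen tier's shared cache so the next search has to reach the blob store,
# mimicking the POST request mentioned in the methodology.
requests.post(f"{ES}/_searchable_snapshots/cache/clear", auth=AUTH)

searches = {
    # discover_search: 500 docs with highlighting, no total hit tracking
    "discover_search": {
        "size": 500,
        "track_total_hits": False,
        "query": TIME_RANGE,
        "highlight": {"fields": {"message": {}}},
    },
    # discover_search_total: only the total hit count
    "discover_search_total": {"size": 0, "track_total_hits": True, "query": TIME_RANGE},
    # discover_date_histogram: buckets for the bar chart
    "discover_date_histogram": {
        "size": 0,
        "query": TIME_RANGE,
        "aggs": {"over_time": {"date_histogram": {"field": "@timestamp", "calendar_interval": "1d"}}},
    },
    # discover_terms_agg: the sidebar terms aggregation
    "discover_terms_agg": {
        "size": 0,
        "query": TIME_RANGE,
        "aggs": {"top_hosts": {"terms": {"field": "host.name", "size": 10}}},
    },
}

ids = {name: submit_async(body) for name, body in searches.items()}

# Poll each async search until it finishes, record its latency, then clean up.
for name, search_id in ids.items():
    while True:
        r = requests.get(f"{ES}/_async_search/{search_id}", auth=AUTH).json()
        if not r.get("is_running", False):
            print(name, "took", r["response"]["took"], "ms")
            break
        time.sleep(0.5)
    requests.delete(f"{ES}/_async_search/{search_id}", auth=AUTH)
```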
1 Day Span / 1 Terabyte What we observed 99.9% of the times (p99.9): Hot: 543ms - 2s Frozen Not Cached: 1.8s - 14s Frozen Cached: 558ms - 11s What we observed as a maximum latency (likely the very first query): Hot: 630ms - 2s Frozen Not Cached: 1.9s - 28s Frozen Cached: 750ms - 19s 7 Days Span / 7 Terabytes What we observed 99.9% of the times (p99.9): Hot: 555ms - 792s Frozen Not Cached: 2.5s - 14s Frozen Cached: 1s - 12s What we observed as a maximum latency (likely the very first query): Hot: 842ms - 4s Frozen Not Cached: 2.5s - 5.6m (336s) Frozen Cached: 1.1s - 26s 14 Days Span / 14 Terabytes What we observed 99.9% of the times (p99.9): Hot: 551ms - 608ms Frozen Not Cached: 1.8s - 15s Frozen Cached: 551ms - 592ms What we observed as a maximum latency (likely the very first query): Hot: 785ms - 9s Frozen Not Cached: 2.3s - 32s Frozen Cached: 624ms - 7s 30 Days Span / 30 Terabytes We did not use hot data past 14 days on this test, but we can still use the results for frozen as a reference. What we observed 99.9% of the times (p99.9): Frozen Not Cached: 2.3s - 12s Frozen Cached: 1s - 11s What we observed as a maximum latency (likely the very first query): Frozen Not Cached: 2.4s - 68s Frozen Cached: 1.1s - 27s 60 Days Span / 60 Terabytes What we observed 99.9% of the times (p99.9): Frozen Not Cached: 2.3s - 13s Frozen Cached: 1s - 11s What we observed as a maximum latency (likely the very first query): Frozen Not Cached: 2.4s - 18s Frozen Cached: 1.1s - 240s 90 Days Span / 90 Terabytes What we observed 99.9% of the times (p99.9): Frozen Not Cached: 2.4s - 13s Frozen Cached: 1s - 11s What we observed as a maximum latency (likely the very first query): Frozen Not Cached: 3.3s - 5m (304s) Frozen Cached: 1.1s - 1.6m (98s) Cost implications (16x reduction) Let's make a simple pricing exercise using Elastic Cloud. If we were to put the entirety of a 90 days / 90 TB dataset in an all-hot deployment on the most performant hardware profile for large datasets ( Storage Optimized ), that would cost $53.382 / month since we would need about 45 hot nodes to cover about 120TB. Since Elastic Cloud has different hardware profiles, we could also select Storage optimized (dense), which brings the cost to $28.222. However, by benefiting from the Frozen tier, we could make a deployment that holds 1 day in Hot and the rest on Frozen. The cost of such deployment can be as low as $3.290, a staggering 16x reduction on costs . Use Elastic's frozen data tier to cool down the cost of data storage Elastic's frozen data tier redefines what's possible in data storage and retrieval. Benchmark results show that it delivers performance comparable to the hot tier, efficiently handling typical user tasks. While rare instances of slightly higher latency (0.1% of the time) may occur, Elastic's searchable snapshots ensure a robust and cost-effective solution for managing large datasets. Whether you're searching through years of security data for advanced persistent threats or analyzing historical seasonal trends from logs and metrics, searchable snapshots and the frozen tier deliver unmatched value and performance. By adopting the frozen tier, organizations can optimize storage strategies, maintain responsiveness, keep data searchable, and stay within budget. To learn more, see how to set up hot and frozen data tiers for your Elastic Cloud deployment. 
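As a starting point for such a setup, a minimal ILM policy that keeps data on the hot tier briefly and then mounts it on the frozen tier as a searchable snapshot could look like the sketch below. The policy name, rollover thresholds, retention, and repository name are assumptions; `found-snapshots` is the default repository on Elastic Cloud, so self-managed clusters would point at their own repository instead.

```python
# Minimal hot-to-frozen ILM policy sketch; all names and thresholds are illustrative.
import requests

policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_primary_shard_size": "50gb", "max_age": "1d"}
                }
            },
            "frozen": {
                "min_age": "1d",
                "actions": {
                    # "found-snapshots" is the Elastic Cloud default; adjust for self-managed clusters.
                    "searchable_snapshot": {"snapshot_repository": "found-snapshots"}
                },
            },
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    }
}

resp = requests.put(
    "http://localhost:9200/_ilm/policy/logs-hot-frozen",
    json=policy,
    auth=("elastic", "changeme"),
)
resp.raise_for_status()
print(resp.json())
```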
Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Jump to Data tiers in Elastic How the frozen tier works Methodology Results 1 Day Span / 1 Terabyte Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ice, ice, maybe: Measuring searchable snapshots performance - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/searchable-snapshots-benchmark","meta_description":"Learn how Elastic’s searchable snapshots make the frozen tier perform like the hot tier, offering latency consistency and reducing costs."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. Vector Database Lucene ML Research TT By: Tommaso Teofili On January 7, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The challenge of finding nearest neighbors efficiently in high-dimensional spaces, particularly as datasets grow in size, is one of the most important ones in the context of vector search. As discussed in our previous blog post , brute force nearest neighbor search might be the best choice, when dataset size is limited. 
On the other hand, as vector dataset size increases, switching to approximate nearest neighbor search can be useful to retain query speed without sacrificing accuracy. Elasticsearch implements approximate nearest neighbor search via the Hierarchical Navigable Small World algorithm . HNSW offers an efficient way to navigate the vector space, reducing the computational cost while still maintaining high search accuracy. In particular its hierarchical layered structure makes it possible to visit candidate neighbors and decide whether to include them in the final result set with fewer vector distance computations. However, despite its strengths, the HNSW algorithm can be further optimized for large-scale vector searches. One effective way to enhance HNSW's performance is by finding ways to stop visiting the graph under specific circumstances, called early termination . This blog post explores early termination concepts for HNSW and how they can optimize query execution. HNSW redundancy HNSW is an approximate nearest neighbor algorithm that builds a layered graph where nodes represent vectors and edges represent the proximity between vectors in the vector space. Each layer contains incrementally a larger number of graph nodes. When querying, the search traverses this graph, starting at a random entry point and navigating towards the closest neighbors through the layers. The search process is iterative and expands as it examines more nodes and vectors. This balance between speed and accuracy is central to HNSW, but it can still result in redundant computations, especially when large datasets are involved. In HNSW, redundant computations primarily occur when the algorithm continues to evaluate new nodes or candidates that provide little to no improvement in finding actual neighbors to a query. This happens because, in standard HNSW traversal, the algorithm proceeds layer-by-layer, exploring and expanding candidate nodes until all possible paths are exhausted. In particular, this kind of redundancy can arise when the dataset includes highly similar or duplicate vectors, clusters with dense intra-cluster connections, or vectors in very high-dimensional spaces with little intrinsic structure. Such redundancy leads to visiting unnecessary edges, increasing memory usage and potentially slowing search performance without improving accuracy. In high-dimensional spaces where similarity scores decay quickly, some edges often fail to contribute meaningful shortcuts in the graph, resulting in inefficient navigation paths too. So, in case a number of unnecessary computations can be performed while traversing the graph, one could try to improve the HNSW algorithm to mitigate this issue. Early Termination FTW Navigating the solution space is a fundamental concept in optimization and search algorithms, where the goal is to find an optimal solution among a set of possible solutions. The solution space represents all potential solutions to a given problem, and navigating it involves systematically exploring these possibilities. This process can be visualized as moving through a graph where each node represents a different solution, and the objective is to identify the node that best meets the criteria of the problem. Understanding how to effectively navigate the solution space is crucial for solving complex problems that have huge numbers of solutions. 
Early termination is a generic optimization strategy that can be applied to any such algorithm to make smart decisions about when stopping searching for solutions under certain circumstances. If any solution is considered 'good enough' to meet a desired criteria, the search can stop and the solution can be considered either a good candidate or an optimal solution. This means some potentially better solutions might remain unexplored, so it's tricky to find a perfect compromise between efficiency and quality of the final solution(s). Early Termination in HNSW In the context of HNSW, early termination can be used to stop the search process before all potential candidates nodes (vectors) have been fully evaluated. Evaluating a candidate node means calculating the actual similarity between the query vector and the vector corresponding to the node in the graph that is being processed; for this reason, when skipping a bunch of such operations while traversing each layer, the computational cost of the query can be greatly reduced. On the other hand, skipping a candidate that would otherwise result in a true nearest neighbor will surely affect the quality of the search results, potentially missing a few candidate vectors that are close to the query vector. Consequently the trade-off between the efficiency gains and loss in accuracy is to be handled with care. Early termination is useful in case of: Sublinear Efficiency: You want to optimize performance in exchange for slightly lower recall. High-Throughput Systems: Faster response times are more valuable than the highest accuracy. Resource Constraints: Compute or memory limitations make full traversal of the HNSW graph undesirable. In the context of HNSW there are a number of options for implementing an early termination strategy. Fixed candidate pool size: One of the simplest early termination strategies is to limit the size of the candidate pool (e.g., the number of nodes evaluated during the search). In HNSW, the search process is iterative, expanding to more nodes as it progresses. By setting a limit on the number of candidates considered, we can terminate the search early and return results based on only a subset of the total graph. Of course this can be implemented either in a layer-wise fashion or accounting for all the nodes across the whole graph. Distance threshold-based termination: Another potentially effective early termination strategy is to make smart decisions based on distance computations between the query vector and the vectors corresponding to HNSW nodes. One could set a threshold based on the distance between the query vector and the current closest vector. If a vector is found whose distance is below a specified threshold, the search can be stopped early, assuming that further exploration is unlikely to yield significantly better results. This goes hand in hand with constraints on the fact that nodes get visited in a smart order, to avoid missing possibly relevant neighbors. Dynamic early termination based on quality estimation: A more sophisticated approach is dynamically adjusting the termination criteria based on the \"quality\" of the results found during each search query. If the search process is converging quickly on high-quality neighbors (e.g., neighbors with very close distances), the algorithm can terminate early, even without hitting a predefined threshold. 
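To make the first two strategies concrete, here is a minimal, self-contained sketch of a best-first graph traversal with a visit budget and a distance threshold. It is deliberately simplified and is not Lucene's HNSW implementation; the graph layout, distance function, and default values are illustrative only.

```python
# Sketch of fixed-configuration early termination in a greedy nearest-neighbor graph search.
import heapq

def greedy_search(graph, vectors, entry_point, query, dist,
                  k=10, max_visited=1000, good_enough=None):
    """graph: node_id -> list of neighbor ids; vectors: node_id -> vector; dist: distance fn."""
    d0 = dist(query, vectors[entry_point])
    visited = {entry_point}
    candidates = [(d0, entry_point)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry_point)]     # max-heap (negated): current best-k neighbors
    visited_count = 0

    while candidates:
        d, node = heapq.heappop(candidates)
        visited_count += 1

        # Early termination 1: fixed candidate pool / visit budget.
        if visited_count > max_visited:
            break
        # Early termination 2: distance threshold, stop once a node is already "good enough".
        if good_enough is not None and d <= good_enough:
            break
        # Standard stop condition: nothing left that can beat the current k-th best result.
        if len(results) >= k and d > -results[0][0]:
            break

        for neighbor in graph.get(node, []):
            if neighbor in visited:
                continue
            visited.add(neighbor)
            nd = dist(query, vectors[neighbor])
            if len(results) < k or nd < -results[0][0]:
                heapq.heappush(candidates, (nd, neighbor))
                heapq.heappush(results, (-nd, neighbor))
                if len(results) > k:
                    heapq.heappop(results)

    return sorted((-neg_d, node) for neg_d, node in results)
```

Both exits trade a little recall for fewer distance computations, which is exactly the trade-off discussed above.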
The first two strategies fall in the category of \"fixed configuration\" early termination strategies, so that the search terminates based on fixed constraints that do not take into account query-specific challenges. In fact not all queries are equally hard, some queries require more candidate visit than others, for example, when the distribution of the vectors is skewed. Consequently some query vectors might fall into denser regions of the vector space, so that they have more candidate nearest neighbors, while some others might fall into \"less populated regions\", making it harder to find their true nearest neighbors. Because of such situations, early termination strategies that can adapt to the density of the vector space (and consequently to the connectivity of the HNSW graph) seem more attractive for real-life scenarios. Therefore determining the optimal point at which to stop searching for each query is more likely to lead to substantial latency reductions without compromising accuracy. Such kinds of early termination strategies are dubbed adaptive as they adapt to each query instance to decide when to terminate the search process. For example, an adaptive early termination strategy can utilize machine learning models to predict how much search effort is sufficient for a given query to achieve the desired accuracy. One such a model dynamically adjusts how much of the graph to explore based on the individual query's characteristics and intermediate search results. Speaking about intermediate search results, they are often powerful predictors of how much further to search. If the initial results are already close to the query, the nearest neighbors are likely to be found soon, allowing for early termination. Conversely, poor initial results indicate a need for more extensive exploration, (see this paper ). Lucene makes it possible to implement early termination in HNSW by means of the KnnCollector interface that exposes an earlyTerminated() method , but it also offers a couple of fixed configuration early termination strategies for HNSW: TimeLimitingKnnCollector makes it possible to stop the HNSW graph traversing when a certain time threshold is met. AbstractKnnCollector is a base KnnCollector implementation that stops the graph traversal once a fixed number of graph nodes are visited. As an additional example, to implement a distance threshold-based termination, we could rely on the minimum competitive similarity recorded by Lucene during HNSW traversal (used to make sure only competitive nodes are explored) and early exit when it falls below a given threshold. Conclusion Early termination strategies for approximate KNN can lead to notable speedups while retaining good accuracy, if correctly implemented. Fixed strategies are easier to implement but they might require more tuning and also not work well across different queries. Dynamic / adaptive strategies, on the other hand, are harder to implement but have the advantage of being able to better adapt to different search queries. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. 
AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to HNSW redundancy Early Termination FTW Early Termination in HNSW Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Early termination in HNSW for faster approximate KNN search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/hnsw-knn-search-early-termination","meta_description":"Learn how HNSW can be made faster for KNN search, using smart early termination strategies."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. Integrations Vector Database CH HM By: Chris Hegarty and Hemant Malik On March 19, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. We in the Elastic Engineering org have been busy optimizing vector database performance for a while now. Our mission: making Lucene and Elasticsearch the best vector database. Through hardware accelerated CPU SIMD instructions , introducing new vector data compression innovations ( Better Binary Quantization a.k.a BBQ ), and then exceeding expectations by updating the algorithmic approach to BBQ for even more benefits, and also making Filtered HNSW faster . You get the gist — we’re building a faster, better, efficient(er?) vector database for the developers as they solve those RAG-gedy problems! 
As part of our mission to leave no efficiencies behind, we are exploring acceleration opportunities with these curious computer chips, which you may have heard of — NVIDIA GPUs! (Seriously, have you not?). When obsessing over performance, we have several problem spaces to explore — how to index exponentially more data, how to retrieve insights from it, and how to do it when your ML models are involved. You should be able to eke out every last benefit available when you have GPUs. In this post, we dive into our collaboration with the NVIDIA vector search team as we explore GPU-accelerated vector search in Elasticsearch. This work paves the way for use cases where developers could use a mix of GPUs and CPUs for real-world Elasticsearch-powered apps. Exciting times! Elasticsearch: Well, hello, GPUs! We are excited to share that the Elasticsearch engineering team is helping build the open-source cuVS Java API experience for developers, which exposes bindings for vector search algorithms. This work leverages our previous experience with Panama FFI. Elasticsearch and Apache Lucene use the NVIDIA cuVS API to build the graph during indexing. Okay, we are jumping ahead; let’s rewind a bit. NVIDIA cuVS , an open-source C++ library, is at the heart of this collaboration. It aims to bring GPU acceleration to vector search by providing higher throughput, lower latency, and faster index build times. But Elasticsearch and Apache Lucene are written in Java; how will this work? Enter lucene-cuvs and the Elastic-NVIDIA-SearchScale collaboration to bring it into the Lucene ecosystem to explore GPU-accelerated vector search in Elasticsearch. In the recent NVIDIA cuVS 25.02 release, we added a Java API for cuVS. The new API is experimental and will continue to evolve, but it’s currently available for use. The question may arise: aren’t Java to native function calls slow? Not anymore! We’re using the new Panama FFI (Foreign Function Interface) for the bindings, which has minimal overhead for Java to native downcalls. We’ve been using Panama FFI in Elasticsearch and Lucene for a while now. It’s awesome! But... there is always a “but”, isn’t there? FFI has availability challenges across Java versions. We overcame this by compiling the cuVS API to Java 21 and encapsulating the implementation within a multi-release jar targeting Java 22. This allows the use of cuVS Java directly in Lucene and Elasticsearch. Ok, now that we have the cuVS Java API, what else would we need? A tale of two algorithms Elasticsearch supports the HNSW algorithm for scalable approximate KNN search. However, to get the most out of the GPU, we use a different algorithm, CAGRA [ C UDA A NN GRA ph ] , which has been specifically designed for the high levels of parallelism offered by the GPU. Before we get into how we look to add support for CAGRA, let’s look at how Elasticsearch and Lucene access index data through a “codec format”. This consists of the on-disk representation, the interfaces for reading and writing data, and the machinery for dealing with Lucene’s segment-based architecture. We are implementing a new KNN (k-nearest neighbors) vector format that internally uses the cuVS Java API to index and search on the GPU. From here, we “plumb” this codec type through Elasticsearch’s mappings to a field type in the index. As a result, your existing KNN queries continue to work regardless of whether the backing index is using a CAGRA or HNSW graph. Of course, this glosses over many details, which we plan to cover in a future blog. 
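As a quick illustration of that point, a standard kNN search request would look exactly the same to the client no matter which graph structure backs the field. The index name, field name, and query vector below are placeholders.

```python
# A plain kNN search; the request is identical whether the field is backed by an HNSW graph
# or, hypothetically, a GPU-built CAGRA graph. Names and the vector are illustrative only.
import requests

knn_query = {
    "knn": {
        "field": "embedding",
        "query_vector": [0.12, -0.53, 0.07, 0.91],  # in practice, a full-dimension vector
        "k": 10,
        "num_candidates": 100,
    },
    "_source": ["title"],
}

resp = requests.post(
    "http://localhost:9200/my-vector-index/_search",
    json=knn_query,
    auth=("elastic", "changeme"),
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```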
The following is the high-level architecture for a GPU-accelerated Elasticsearch. This new codec format defaults to CAGRA. However, it also supports converting a CAGRA graph to an HNSW graph for search on the CPU. Indexing and Searching: Making some “Core” decisions With the stateless architecture for Elasticsearch Serverless, which separates indexing and search, there is now a clear delineation of responsibilities. We pick the best hardware profile to fulfill each of these independent responsibilities. We anticipate users to consider two main deployment strategies: Index and Search on the GPU: During indexing, build a CAGRA graph and use it during search - ideal when extremely low latency search is required. Index on GPU and Search on CPU: During indexing, build a CAGRA graph and convert it to an HNSW graph. The HNSW graph is stored in the index, which can later be used on the CPU for searching. This flexibility provides different deployment models, offering tradeoffs between cost and performance. For example, an indexing service could use GPU to efficiently build and merge graphs in a timely manner while using a lower-powered CPU for searching. So here is the Plan, Stan We are looking forward to bringing performance gains and flexibility with deployment strategies to users, offering various knobs to balance cost and performance. Here is the NVIDIA GTC 2025 session where this work was presented in detail. We’d like to thank the engineering teams at NVIDIA and SearchScale for their fantastic collaboration. In an upcoming blog, we will explore the implementation details and performance analysis in greater depth. Hold on to your curiosity hats 🎩! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Jump to Elasticsearch: Well, hello, GPUs! A tale of two algorithms Indexing and Searching: Making some “Core” decisions So here is the Plan, Stan Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/gpu-accelerated-vector-search-elasticsearch-nvidia","meta_description":"Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. Inside Elastic AS By: Antonio Schönmann On August 22, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out part one , part two , part three , the launch blog , and part five . Welcome to part 4 of our blog series on integrating generative AI in Elastic's customer support. This installment dives deep into the role of Retrieval-Augmented Generation (RAG) in enhancing our AI-driven Technical Support Assistant. Here, we address the challenges, solutions, and outcomes of refining search effectiveness, providing action items to further improve its capabilities using the toolset provided in the Elastic Stack version 8.11 . Implied by those actions, we have achieved a ~75% increase in top-3 results relevance and gained over 300,000 AI-generated summaries that we can leverage for all kinds of future applications . If you're new to this series, be sure to review the earlier posts that introduce the core technology and architectural setup. If you missed the last blog of the series, you can find it here . RAG tuning: A search problem Perfecting RAG (Retrieval-Augmented Generation) is fundamentally about hitting the bullseye in search accuracy 🎯: Like an archer carefully aiming to hit the center of the target, we want to focus on accuracy for each hit. Not only that, we also want to ensure that we have the best targets to hit – or high-quality data . Without both together , there's the potential risk that large language models (LLMs) might hallucinate and generate misleading responses. Such mistakes can definitely shake users' trust in our system, leading to a deflecting usage and poor return on investment. To avoid those negative implications, we've encountered several challenges that have helped us refine our search accuracy and data quality over the course of our journey. 
These challenges have been instrumental in shaping our approach to tuning RAG for relevance, and we're excited to share our insights with you. That said: let's dive into the details! Our first approach We started with a lean, effective solution that could quickly get us a valuable RAG-powered chatbot in production. This meant focusing on key functional aspects that would bring it to operational readiness with optimal search capabilities. To get us into context, we'll make a quick walkthrough around four key vital components of the Support AI Assistant: data , querying , generation , and feedback . Data As showcased in the 2nd blog article of this series , our journey began with an extensive database that included over 300,000 documents consisting of Technical Support Knowledge Articles and various pages crawled from our website, such as Elastic's Product Documentation and Blogs . This rich dataset served as the foundation for our search queries, ensuring a broad spectrum of information about Elastic products was available for precise retrieval. To this end, we leveraged Elasticsearch to store and search our data. Query Having great data to search by, it's time to talk about our querying component. We adopted a standard Hybrid-Search strategy, which combines the traditional strengths of BM25 , Keyword-based Search , with the capabilities of Semantic Search , powered by ELSER . For the semantic search component, we used text_expansion queries against both title and summary embeddings. On the other hand, for broad keyword relevance we search multiple fields using cross_fields , with a minimum_should_match parameter tuned to better perform with longer queries. Phrase matches, which often signal greater relevance, receive a higher boost. Here’s our initial setup: Generation After search, we build up the system prompt with different sets of instructions, also contemplating the top 3 search results as context to be used. Finally, we feed the conversation alongside the built context into the LLM, generating a response. Here's the pseudocode showing the described behavior: The reason for not including more than 3 search results was the limited quantity of tokens available to work within our dedicated Azure OpenAI's GPT4 deployment (PTU), allied with a relatively large user base. Feedback We used a third-party tool to capture client-side events, connecting to Big Query for storage and making the JSON-encoded events accessible for comprehensive analysis by everyone on the team. Here's a glance into the Big Query syntax that builds up our feedback view. The JSON_VALUE function is a means to extract fields from the event payload: We also took advantage of valuable direct feedback from internal users regarding the chatbot experience, enabling us to quickly identify areas where our search results did not match the user intent. Incorporating both would be instrumental in the discovery process that enabled us to refine our RAG implementation, as we're going to observe throughout the next section. Challenges With usage, interesting patterns started to emerge from feedback. Some user queries, like those involving specific CVEs or Product Versions for instance, were yielding suboptimal results, indicating a disconnect between the user's intent and the GenAI responses. Let's take a closer look at the specific challenges identified, and how we solved them. 
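Before getting into them, here is a rough reconstruction of the kind of hybrid query described in the Query section above: ELSER text_expansion against the title and summary embeddings combined with a cross_fields match and a boosted phrase match. The field names, model id, boosts, and minimum_should_match values are assumptions, not the team's exact production query.

```python
# Sketch of an ELSER + BM25 hybrid query in the shape described above; all names are placeholders.
import requests

user_query = "How do I rotate proxy certificates?"

hybrid_query = {
    "size": 3,
    "query": {
        "bool": {
            "should": [
                {"text_expansion": {
                    "ml.inference.title_expanded.predicted_value": {
                        "model_id": ".elser_model_2", "model_text": user_query}}},
                {"text_expansion": {
                    "ml.inference.summary_expanded.predicted_value": {
                        "model_id": ".elser_model_2", "model_text": user_query}}},
                {"multi_match": {
                    "query": user_query,
                    "type": "cross_fields",
                    "fields": ["title", "summary", "body"],
                    "minimum_should_match": "2<75%"}},
                {"multi_match": {
                    "query": user_query,
                    "type": "phrase",
                    "fields": ["title^2", "summary", "body"],
                    "boost": 2}},
            ],
            "minimum_should_match": 1,
        }
    },
}

resp = requests.post(
    "http://localhost:9200/support-knowledge/_search",
    json=hybrid_query,
    auth=("elastic", "changeme"),
)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(round(hit["_score"], 2), hit["_source"].get("title"))
```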
#1: CVEs (Common Vulnerabilities and Exposures) Our customers frequently encounter alerts regarding lists of open CVEs that could impact their systems, often resulting in support cases. To address questions about those effectively, our dedicated internal teams meticulously maintain CVE-type Knowledge Articles. These articles provide standardized, official descriptions from Elastic, including detailed statements on the implications, and list the artifacts affected by each CVE. Recognizing the potential of our chatbot to streamline access to this crucial information, our internal InfoSec and Support Engineering teams began exploring its capabilities with questions like this: For such questions, one of the key advantages of using RAG – and also the main functional goal of adopting this design – is that we can pull up-to-date information, including it as context to the LLM and thus making it available instantly to produce awesome responses. That naturally will save us time and resources over fine-tuned LLM alternatives. However, the produced responses wouldn't perform as expected. Essential to answer those questions, the search results often lacked relevance, a fact which we can confirm by looking closely at the search results for the example: With just one relevant hit ( CVE-2019-10172 ), we left the LLM without the necessary context to generate proper answers: The observed behavior prompted us with an interesting question: How could we use the fact that users often include close-to-exact CVE codes in their queries to enhance the accuracy of our search results? To solve this, we approached the issue as a search challenge. We hypothesized that by emphasizing the title field matching for such articles, which directly contain the CVE codes, we could significantly improve the precision of our search results. This led to a strategic decision to conditionally boost the weighting of title matches in our search algorithm. By implementing this focused adjustment, we refined our query strategy as follows: As a result, we experienced much better hits for CVE-related use cases, ensuring that CVE-2016-1837 , CVE-2019-11756 and CVE-2014-6439 are top 3: And thus generating a much better response by the LLM: Lovely! By tuning our Hybrid Search approach, we significantly improved our performance with a pretty simple, but mostly effective Bob's Your Uncle solution (like some folks would say)! This improvement underscores that while semantic search is a powerful tool, understanding and leveraging user intent is crucial for optimizing search results and overall chat experience in your business reality. With that in mind, let's dive into the next challenge! #2: Product versions As we delved deeper into the challenges, another significant issue emerged with queries related to specific versions. Users frequently inquire about features, migration guides, or version comparisons, but our initial search responses were not meeting expectations. For instance, let's take the following question: Our initial query approach would return the following top 3: Elasticsearch for Apache Hadoop version 8.14.1 | Elasticsearch for Apache Hadoop [8.14] | Elastic ; APM version 8.14 | Elastic Observability [8.14] | Elastic ; Elasticsearch for Apache Hadoop version 8.14.3 | Elasticsearch for Apache Hadoop [8.14] | Elastic . 
Corresponding to the following _search response: Being irrevocably irrelevant, they ended up resulting in a completely uninformed answer from the chatbot, affecting the overall user experience and trust in the Support AI Assistant: Further investigating the issue we collected valuable insights. By replaying the query and looking into the search results, we noticed three serious problems with our crawled Product Documentation data that were contributing to the overall bad performance: Inaccurate semantic matching : Semantically, we definitely missed the shot. Why would we match against such specific articles, including two specifically about Apache Hadoop, when the question was so much broader than Hadoop? Multiple versions, same articles : Going further down on the hits of the initially asked question, we often noticed multiple versions for the same articles, with close to exactly the same content. That often led to a top 3 cluttered with irrelevant matches! Wrong versions being returned : It's fair to expect that having both 8.14.1 and 8.14.2 versions of the Elasticsearch for Apache Hadoop article, we'd return the latter for our query – but that just wasn't happening consistently. From the impact perspective, we had to stop and solve those – else, a considerable part of user queries would be affected. Let's dive into the approaches taken to solve both! A. Inaccurate semantic matching After some examination into our data, we've discovered that the root of our semantic matching issue lived in the fact that the summary field for Product Documentation-type articles generated upon ingestion by the crawler was just the first few characters of the body . This redundancy misled our semantic model, causing it to generate vector embeddings that did not accurately represent the document's content in relation to user queries. As a data problem, we had to solve this problem in the data domain: by leveraging the use of GenAI and the GPT4 model, we made a team decision to craft a new AI Enrichment Service – introduced in the 2nd installment of this blog series . We decided to create our own tool for a few specific reasons: We had unused PTU resources available. Why not use them? We needed this data gap filled quickly, as this was probably the greatest relevance detractor. We wanted a fully customizable approach to make our own experiments. Modeled to be generic, our usage for it boils down to generating four new fields for our data into a new index, using Enrich Processors to make them available to the respective documents on the target indices upon ingestion. Here's a quick view into the specification for each field to be generated: After generating those fields and setting up the index Enrich Processors , the underlying RAG-search indices were enriched with a new ai_fields object, also making ELSER embeddings available under ai_fields.ml.inference : Now, we can tune the query to use those fields, making for better overall semantic and keyword matching: Single-handedly, that made us much more relevant. More than that – it also opened a lot of new possibilities to use the AI-generated data throughout our applications – matters of which we'll talk about in future blog posts. Now, before retrying the query to check the results: what about the multiple versions problem? B. Multiple versions, same articles When duplicate content infiltrates these top positions, it diminishes the value of the data pool, thereby diluting the effectiveness of GenAI responses and leading to a suboptimal user experience. 
In this context, a significant challenge we encountered was the presence of multiple versions of the same article. This redundancy, while contributing to a rich collection of version-specific data, often cluttered the essential data feed to our LLM, reducing the diversity of it and therefore undermining the response quality. To address the problem, we employed the Elasticsearch API collapse parameter, sifting through the noise and prioritizing only the most relevant version of a single content. To do that, we computed a new slug field into our Product Documentation crawled documents to identify different versions of the same article, using it as the collapse field (or key ). Taking the Sort search results documentation page as an example, we have two versions of this article being crawled: Sort search results | Elasticsearch Guide [8.14] | Elastic Sort search results | Elasticsearch Guide [7.17] | Elastic Those two will generate the following slug : guide-en-elasticsearch-reference-sort-search-results Taking advantage of that, we can now tune the query to use collapse : As a result, we'll now only show the top-scored documentation in the search results, which will definitely contribute to increasing the diversity of knowledge being sent to the LLM. C. Wrong versions being returned Similar to the CVE matching problem , we can boost results based on the specific versions being mentioned, allied with the fact that version is a separate field in our index. To do that, we used the following simple regex-based function to pull off versions directly from the user question: We then add one more query to the should clause, boosting the version field accordingly and getting the right versions to the top (whenever they're mentioned): With A , B and C solved, we're probably ready to see some strong results! Let's replay the question! By replaying the previously tried question: And therefore running the Elasticsearch query once again, we get dramatically better results consisting of the following articles: Elasticsearch version 8.14.3 | Elasticsearch Guide [master] | Elastic Elasticsearch version 8.14.2 | Elasticsearch Guide [master] | Elastic Release notes | Elasticsearch Guide [8.14] | Elastic Consequently, we have a better answer generated by the LLM. More powerful than that – in the context of this conversation, the LLM is now conscious about versions of Elasticsearch that are newer than the model's cut-off date, crafting correct answers around those: Exciting, right? But how can we quantify the improvements in our query at this point? Let's see the numbers together! Measuring success To assess the performance implied by our changes, we've compiled a test suite based on user behavior, each containing a question plus a curated list of results that are considered relevant to answer it. Those will cover a wide wide range of subjects and query styles, reflecting the diverse needs of our users. Here's a complete look into it: But how do we turn those test cases into quantifiable success? To this end, we have employed Elasticsearch's Ranking Evaluation API alongside with the Precision at K (P@K) metric to determine how many relevant results are returned between the first K hits of a query. As we're interested in the top 3 results being fed into the LLM, we're making K = 3 here. To automate the computation of this metric against our curated list of questions and effectively assess our performance gains, we used TypeScript/Node.js to create a simple script wrapping everything up. 
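To make the two adjustments just described more concrete, collapsing on the slug and boosting explicitly mentioned versions, here is a hedged sketch. The regex, index name, field names and boost value are simplified placeholders rather than the production query.

```ts
import { Client } from '@elastic/elasticsearch'

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: '...' } })

// Simple regex-based extraction of version numbers mentioned in the question.
const extractVersions = (question: string): string[] =>
  question.match(/\b\d+\.\d+(\.\d+)?\b/g) ?? []

async function searchProductDocs(question: string) {
  const versions = extractVersions(question)

  return client.search({
    index: 'product-documentation', // placeholder index name
    size: 3,
    query: {
      bool: {
        should: [
          { match: { title: { query: question } } },
          { match: { body: { query: question } } },
          // Boost articles whose version field matches versions mentioned in the question.
          ...(versions.length > 0
            ? [{ terms: { version: versions, boost: 3 } }] // boost value is illustrative
            : []),
        ],
        minimum_should_match: 1,
      },
    },
    // Keep only the top-scored version of each article, keyed by the computed slug field.
    collapse: { field: 'slug' },
  })
}
```

With those pieces in place, let's walk through the evaluation script.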
First, we define a function to make the corresponding Ranking Evaluation API calls: After that, we need to define the search queries before and after the optimizations: Then, we'll output the resulting metrics for each query: Finally, by running the script against our development Elasticsearch instance, we can see the following output demonstrating the P@K or (P@3) values for each query, before and after the changes. That is – how many results on the top 3 are considered relevant to the response: Improvements observed As an archer carefully adjusts for a precise shot, our recent efforts into relevance have brought considerable improvements in precision over time. Each one of the previous enhancements, in sequence, were small steps towards achieving better accuracy in our RAG-search results, and overall user experience. Here's a look at how our efforts have improved performance across various queries: Before and after – P@K Relevant results in the top 3: ❌ = 0 , 🥉 = 1 , 🥈 = 2 , 🥇 = 3 . Query Description P@K Before P@K After Change Support Diagnostics Tool 0.333 🥉 1.000 🥇 +200% Air Gapped Maps Service 0.333 🥉 0.667 🥈 +100% CVE Implications 0.000 ❌ 1.000 🥇 ∞ Enrich Processor Setup 0.667 🥈 0.667 🥈 0% Proxy Certificates Rotation 0.333 🥉 0.333 🥉 0% Proxy Certificates Version-specific Rotation 0.333 🥉 0.333 🥉 0% Searchable Snapshot Deletion 0.667 🥈 1.000 🥇 +50% Index Lifecycle Management Usage 0.667 🥈 0.667 🥈 0% Creating Data Views via API in Kibana 0.333 🥉 0.667 🥈 +100% Kibana Data View Creation 1.000 🥇 1.000 🥇 0% Comparing Elasticsearch Versions 0.000 ❌ 0.667 🥈 ∞ Maximum Bucket Size in Aggregations 0.000 ❌ 0.333 🥉 ∞ Average P@K Improvement: +78.41% 🏆🎉 . Let's summarize a few observations about our results: Significant Improvements : With the measured overall +78.41% of relevance increase, the following queries – Support Diagnostics Tool , CVE implications , Searchable Snapshot Deletion, Comparing Elasticsearch Versions – showed substantial enhancements. These areas not only reached the podium of search relevance but did so with flying colors, significantly outpacing their initial performances! Opportunities for Optimization : Certain queries like the Enrich Processor Setup , Kibana Data View Creation and Proxy Certificates Rotation have shown reliable performances, without regressions. These results underscore the effectiveness of our core search strategies. However, those remind us that precision in search is an ongoing effort. These static results highlight where we'll focus our efforts to sharpen our aim throughout the next iterations. As we continue, we'll also expand our test suite, incorporating more diverse and meticulously selected use cases to ensure our enhancements are both relevant and robust. What's next? 🔎 The path ahead is marked by opportunities for further gains, and with each iteration, we aim to push the RAG implementation performance and overall experience even higher. With that, let's discuss areas that we're currently interested in! Our data can be futher optimized for search : Although we have a large base of sources, we observed that having semantically close search candidates often led to less effective chatbot responses. Some of the crawled pages aren't really valuable, and often generate noise that impacts relevance negatively. To solve that, we can curate and enhance our existing knowledge base by applying a plethora of techniques, making it lean and effective to ensure an optimal search experience. 
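As an aside for readers who want to reproduce the P@3 measurement described above: a condensed sketch of the Ranking Evaluation call, written in TypeScript like the original script, could look like the following. The index name, query and document IDs are placeholders, and the full script (query definitions, output formatting) is not reproduced here.

```ts
import { Client } from '@elastic/elasticsearch'

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: '...' } })

// Compute Precision at K (K = 3) for one test case, given a query and its curated relevant documents.
async function precisionAtThree(index: string, query: any, relevantIds: string[]) {
  const response = await client.rankEval({
    index,
    requests: [
      {
        id: 'test-case',
        request: { query },
        ratings: relevantIds.map((id) => ({ _index: index, _id: id, rating: 1 })),
      },
    ],
    metric: { precision: { k: 3, relevant_rating_threshold: 1 } },
  })
  return response.metric_score
}

// Placeholder usage:
// const pAt3 = await precisionAtThree(
//   'product-documentation',
//   { match: { title: 'support diagnostics tool' } },
//   ['doc-id-1', 'doc-id-2'],
// )
```

Back to the remaining opportunities.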
Chatbots must handle conversations – and so must RAG searches : It's common user behavior to ask follow-up questions to the chatbot. A question asking \"How to configure Elasticsearch on a Linux machine?\" followed by \"What about Windows?\" should query something like \"How to configure Elasticsearch on a Linux machine?\" (not the raw 2nd question). The RAG query approach should find the most relevant content regarding the entire context of the conversation. Conditional context inclusion : By extracting the semantic meaning of the user question, it would be possible to conditionally include pieces of data as context, saving token limits, making the generated content even more relevant, and potentially saving round trips for search and external services. Conclusion In this installment of our series on GenAI for Customer Support, we have thoroughly explored the enhancements to the Retrieval-Augmented Generation (RAG) search within Elastic's customer support systems. By refining the interaction between large language models and our search algorithms, we have successfully elevated the precision and effectiveness of the Support AI Assistant. Looking ahead, we aim to further optimize our search capabilities and expand our understanding of user interactions. This continuous improvement will focus on refining our AI models and search algorithms to better serve user needs and enhance overall customer satisfaction. Stay tuned for more insights and updates as we continue to push the boundaries of what's possible with AI in customer support, and don't forget to join us in our next discussion, where we'll explore how Observability plays a critical role in monitoring, diagnosing, and optimizing the performance and reliability of the Support AI Assistant as we scale! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Jump to RAG tuning: A search problem Our first approach Data Query Generation Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"GenAI for customer support — Part 4: Tuning RAG search for relevance - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elser-rag-search-for-relevance","meta_description":"Discover how to fine tune RAG search for relevance with a GenAI for customer support use case. Learn challenges and strategies for tuning RAG with ELSER and the Elastic Stack."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. ES|QL Javascript How To JM By: Josh Mock On June 3, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction In a recent article, Laura highlighted how to use the Java Elasticsearch client to craft ES|QL queries and parse their results as native Java objects. Similar functionality, with TypeScript support, will be available in the JavaScript client in the upcoming 8.14.0 release. This blog explains how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. Implementation: ES|QL queries to TypeScript types with the Elasticsearch JavaScript client First, let's use the bulk helper to index some data: Now, let's use a very basic ES|QL query to look at these newly indexed documents: Returning each row as an array of values is a simple default that's useful in many cases. Still, if you wanted an array of records instead—a standard structure in JavaScript applications—extra effort is necessary to convert the data. Fortunately, in 8.14.0, the JavaScript client will include a new ES|QL helper to do this for you: If you're using TypeScript, you can declare the type of the results inline: In another situation, you might need to declare a type that doesn't perfectly match the query results. Maybe the keys and columns don't have the same names, or the returned documents have columns that don't exist in your type. In this case, we can use the ES|QL RENAME and KEEP processing commands to modify our results to better fit your type: Conclusion These are relatively simple examples that highlight how to use the new ES|QL helper in the JavaScript client, so check out the docs for full details. Future releases of the JavaScript client will likely include more ES|QL helpers, like pagination over large result sets using generators, and support for Apache Arrow. All of our official clients have plans to include similar helpers and tools to make working with ES|QL queries as simple as possible. 
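To make the flow above concrete, here is a small end-to-end sketch of the helper usage this post describes. The index, fields and query are made up, and the exact return shape is worth double-checking against the client documentation for your release.

```ts
import { Client } from '@elastic/elasticsearch'

const client = new Client({ node: 'https://localhost:9200', auth: { apiKey: '...' } })

// The shape we want each result row converted into.
interface CarrierDelay {
  carrier: string
  avg_delay: number
}

async function topDelayedCarriers(): Promise<CarrierDelay[]> {
  // RENAME and KEEP reshape the columns so they line up with the CarrierDelay type.
  const { records } = await client.helpers
    .esql({
      query: `
        FROM kibana_sample_data_flights
        | STATS avg_delay = AVG(FlightDelayMin) BY Carrier
        | RENAME Carrier AS carrier
        | KEEP carrier, avg_delay
        | SORT avg_delay DESC
        | LIMIT 5
      `,
    })
    .toRecords<CarrierDelay>()

  return records
}
```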
Check the changelog for your preferred client in the coming weeks and months! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction Implementation: ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ES|QL queries to TypeScript types with the Elasticsearch JavaScript client - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-javascript-helper-typescript","meta_description":"Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Fast Kibana Dashboards From 8.13 to 8.17, the wait time for data to appear on a dashboard has improved by up to 40%. These improvements are validated both in our synthetic benchmarking environment and from metrics collected in real user’s cloud environments. Developer Experience TN By: Thomas Neirynck On March 3, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Almost all Elastic stack users at some point will use a Dashboard. Elastic ships many out-of-the-box dashboards for all its integrations, and users will create custom dashboards: to share with others, do a root cause analysis, and/or generate a PNG report. 
As a result, the Dashboard app is one of the most used applications in Kibana (for those wondering, the other most popular one is Discover). In addition, many of the core application components of Dashboards support other Kibana applications. Kibana’s widget framework for embedding charts and tables was originally developed for Dashboards, or the data plugin, which brokers search requests to Elasticsearch from the Kibana browser app and is heavily used by all Kibana plugins. In other words, the Dashboard application – both from a product and technical perspective – is central to Kibana, and deserves to be the best experience for users. Yet, dashboards in Kibana could feel “sluggish”. We would experience it as developers, and we would hear it from users. Comparisons with other tools (like Grafana, or to a lesser extent OpenSearch Dashboards) also showed that Dashboards in Kibana sometimes tend to feel slower. For this reason, the Kibana Team recently undertook an effort to bring down the render time of a Dashboard. Identifying the challenge Looking at real user telemetry (we’ll discuss this more below) we see a clear 80/20 division in the render time of Dashboards. a) On one hand, some dashboards take tens of seconds to load The time here is dominated by (very) long-running Elasticsearch queries. The 95th percentile (i.e. the minority) of dashboard rendering times sits significantly above the mean. This does not have to be unexpected; for example, the search the panels issue can span a large time-range, queries hit cold or frozen storage tiers that are not optimized for analytical workloads. These are a minority of dashboards, “the long tail”. Note also a clear seasonality in the data, with longer render times during the weekday versus the weekend. This likely indicates that the “worst case” is influenced by the overall load on the cluster including ingest (not just the runtime analytical queries), which tends to be elevated during working hours. (b) On the other hand, the majority of dashboards load in the low-second range. The 75% and under (the green, blue, and red lines) are significantly less, but they still take 1-3 seconds. When searches to Elasticsearch complete quickly, then where is the time going in Kibana? Why should a Kibana dashboard ever feel sluggish, especially when a deployment is well provisioned and searches complete quickly? In this initial phase of the project—and what we’ll summarize in this blog post—we decided in spring 2024 to focus on improving (b): the render time of the 75 percentile of dashboards, and ensure these become more snappy and pleasurable to work with. We did not forget about (a)! At the end of the post, we will highlight the initiatives that will bring down the render time for this top 20th percentile too. Telemetry From the start, we realized we had poor measurements of the render time of Dashboards. Existing instrumentation did not capture the various stages of the page load of a dashboard. It was also important to align metrics with what we could make actionable in day to day development, and what we could collect from users in the real world. a) What to measure From a high level, we introduced three core metrics, each capturing a specific span of the dashboard load. These can be stacked neatly bottom-to-top when opening a dashboard by its unique URL in a fresh tab. 
Metric What When Frequency kibana_loaded Initial asset loading session (“the spinner”) Opening Kibana for the first time Once per user session dashboard_overhead Bootstrapping of dashboard app and panels Opening a specific dashboard Once per dashboard time_to_data Time for data to appear on screen Changing a filter, a time-range, a control… Once per query-state change This instrumentation is largely implemented ad-hoc. Conventional Core Web Vitals ( https://web.dev/articles/vitals ) do not exactly measure what we were looking for, although we can draw some parallels. Specifically, Time-To-Interactive (TTI) corresponds roughly to the end of “kibana_loaded”. After that, the app is usable, although not all UX may have fully initialized yet. For example, the dashboard controls may not have been initialized yet, even though the dashboard-app technically can start responding to inputs. Another tricky metric was to match the “completion” of a dashboard to when all data is on screen, comparable in spirit to Largest Contentful Paint (LCP). This is the result of AJAX data-requests and front-end rendering which are custom to each panel-type. So to gather it correctly, each panel on a dashboard needs to report its “completion” state. For some charts, this is fairly straightforward (e.g. a simple metric-chart), but for others this is complicated. For example, a map is only complete when all tiles are rendered on screen. Detecting this is not trivial. After all panel-types report “completion” correctly, it is then up to the dashboard-application itself to observe all these “completion” events and report the “time-to-data” metric accordingly after the last completion. Additionally, further segmentations of these benchmarks were introduced, as well as additional metadata like the number of panels on a dashboard or whether the user navigated from within Kibana (some assets are loaded already) or from outside Kibana into the dashboard (no assets have been loaded). The app is also collecting metrics on the server on the duration of data requests. b) Importance of each segment Each of these segments has different semantics. Walking top-down the metric list: When it comes to “snappyness”, it is largely dominated by time-2-data. Each time a user manipulates the filter-state (e.g. time range), the user will need to wait for the new data to occur on screen. It is here where “lag” matters most. Think of a video game. Players may tolerate the loading icon before the level starts, but they then expect responsive gameplay when starting to control the game. This is the same for a dashboard. Users interact with charts, controls, filters… these interactions are the “gameplay” of a dashboard, and they determine how users experience responsiveness. Dashboard_overhead is relevant as well. It is the time it takes to load a dashboard configuration (which is a document retrieved from a system index in Elasticsearch). It also includes some additional code loading. This is because in the Kibana plugin-system, some code is loaded ad-hoc. To give an example: suppose a Dashboard has a swimlane-panel. The dashboard-app would initialize a “swimlane”-embeddable. If during that Kibana-session it is the first time the swimlane is being loaded, it will be up to the swimlane-embedable to ensure it has loaded all the swimlane-specific code before rendering. A deep-dive in the Kibana plugin system would take us too far, but summarized: some of the code loading footprint is captured in this “dashboard_overhead”, not just “kibana_loaded”. 
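As a purely illustrative aside (this is not Kibana's actual instrumentation code), the completion-tracking idea described above boils down to something like the following: each panel reports once its data is fully rendered, and the dashboard emits time_to_data when the last report arrives.

```ts
// Illustrative only: a minimal completion tracker for a set of dashboard panels.
class TimeToDataTracker {
  private readonly pending: Set<string>
  private readonly start = performance.now()

  constructor(panelIds: string[], private readonly report: (ms: number) => void) {
    this.pending = new Set(panelIds)
  }

  // Called by each panel type once its data is fully on screen
  // (for a map, for example, only after all tiles have rendered).
  panelCompleted(panelId: string): void {
    this.pending.delete(panelId)
    if (this.pending.size === 0) {
      // The last panel to complete determines time_to_data for this query-state change.
      this.report(performance.now() - this.start)
    }
  }
}

// Hypothetical usage for a dashboard with three panels:
const tracker = new TimeToDataTracker(['metric-1', 'table-1', 'map-1'], (ms) =>
  console.log(`time_to_data: ${Math.round(ms)}ms`),
)
tracker.panelCompleted('metric-1')
tracker.panelCompleted('table-1')
tracker.panelCompleted('map-1') // last completion, time_to_data is reported
```

As for the first of the three metrics: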
kibana_loaded only occurs a single time for the entirety of a Kibana user session, and is not dashboard specific. Users can navigate to Dashboards from many other Kibana-pages. For this reason, we wanted to isolate kibana_loaded from the specific experience of Dashboards. While being the largest “chunk” of time, it is also the least relevant when it comes to snappyness of the overall Kibana experience. Benchmarking Instrumentation in place, we now actually need to collect these in the appropriate environments: in benchmarking deployments, and real user deployments. a) In CI Performance metrics are being collected for a handful of representative dashboards. These have similar configurations as our integrations, each dashboard containing a mix of chart types. An example of a dashboard benchmark These benchmarks run every three hours on Kibana’s main release branch. The runner spins up a single Elasticsearch and Kibana cluster on dedicated hardware. Metrics are collected from Playwright scripts in a Chromium headless browser. b) In the wild The same metrics are reported for Elastic Cloud and Serverless users as well, and for self-hosted users who opt in into telemetry. While the benchmarks in CI provide an actionable signal at a point in time, collecting these metrics in the wild provide a backward looking signal that helps validate whether the movement of our benchmarks reflects in the real world. Later in the post, you will read how both evolved for the better part of last year. A note on process (meetings! alerts!) There is not a single engineer or team who “owns” the dashboard experience, as the display panels have been developed from all across Engineering. To align effort, some logistic arrangements have proven useful. Reporting Twice weekly reporting provides an opportunity to review the evolution of telemetry and discuss any ongoing related work. It is easiest to consider these as mini-retrospectives. Responding to regressions Benchmarks have proven to be quite reliable, the runs being highly reproducible. When there is negative or unexpected movement in the benchmark, it usually reliably indicates a regression that requires action. There is no hard and fast rule on how to respond. Significant regressions are generally rolled back immediately. Other times, we may let the offending commit ride, until a bug fix is merged. This is decided on a case by case basis, and really depends on the nature of the change. A typical smile-pattern in our benchmarks. A regression was detected and resolved. Ad-hoc runs - validating hypothesis As familiarity with the tooling has grown, we are also doing more trigger ad-hoc performance runs on individual PRs before they merge. This allows for more rapid validation of code changes before they get merged into the main development trunk. Improvements With all this yak-shaving is out of the way, let’s finally get to the interesting part. There has been no silver bullet. Kibana spends time everywhere, all of which adds a marginal overhead, which adds up in the aggregate. The improvements to Dashboard rendering has come from improved hygiene in many layers in the application. Below, we break these down in a couple of major themes. Reducing code and asset loading Efficient code loading is currently one of the largest challenges in Kibana. Kibana’s plugin architecture is very flexible in that it allows for fast turnaround in adding new pages and apps. 
This flexibility does come with a cost, with two being critically related to JavaScript code loading in the browser. One is that code gets loaded that is never required, the other is that asset-loading tends to be fragmented, with multiple small JavaScript assets to support a single app rather than fewer larger files. The first is particularly a problem for Dashboards. Many plugins register widgets with the dashboard app, for example,. a maps panel, a swimlane panel, etc…. However, most dashboards will never display a map or swimlane. Example: plugins can add pop-up links to dashboard panels. These are context dependent. Before: pseudo-code of how a plugin registers a pop-up item for dashboard panels. This pattern causes unnecessary code to be loaded. To avoid this, a new pattern was introduced that allows clients to delay the code loading, until it is needed. After: code only included when it is required. Isolate the definition in a different module ./case_action.ts ./plugin.ts The other issue is that initialization of plugins would block the responsiveness, causing a waterfall effect of asset loading getting serialized rather than parallelized. One example of this is that dashboard controls used to block rendering of the page, causing all panels to have to wait until the controls had loaded all its assets. This is of course not necessary, as rendering should be able to start before controls are fully initialized. Many of these code-loading issues were addressed, contributing to overall responsiveness. Since Kibana has many plugins (over 200 and counting), it will require sustained attention to address the misuse of both these inefficient code loading patterns. Embeddables and rendering A big effort from 2024 in Kibana was the embeddable refactor. This effort had a few goals, of which performance was largely an incidental concern. The effort had to enable a few critical features (like collapsible panels), remove a number of unused code paths (largely angular related), improve testability, and simplify some of the DevX when working with the APIs. You can read more about this here . One way embeddables have allowed to tidy up dashboard performance is by consolidating all rendering in a single React tree. Before, each panel was rendered into its own render-tree with ReactDOM.render() . This architecture was an artifact of a time when Kibana had both Angular and React-based (and jQuery, ahem) widgets. This mix of rendering technologies has not existed in Kibana for over 4 years, and has been fully standardized on React as the only UX rendering library. However, the Dashboard carried that legacy with an additional abstraction layer. Reducing the number of state changes and re-renderings that panels respond to has been a marginal improvement to the Dashboard app, overall increasing its responsiveness. The reduction in code too has helped reduce the app’s footprint. Avoid crufty memory allocation The code for charts and tables will re-organize the data received from Elasticsearch in a data structure that is easier to manipulate for display. They perform “a flattening” step that takes a nested data structure from the ES-response, and turns it into a one-dimensional array, where each item of the array corresponds to a single feature (e.g. a row in a table, a bar in a barchart…). E.g. consider a nested ES-doc with many sub-fields, or the hierarchical organization of buckets from an ES-arg search. 
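To illustrate what such a flattening step looks like (a simplified sketch, not the actual chart code), consider hierarchical terms-aggregation buckets being turned into one row per leaf bucket. The sketch uses plain loops and a single shared output array, which keeps short-lived allocations down.

```ts
interface AggBucket {
  key: string
  doc_count: number
  sub?: { buckets: AggBucket[] } // a nested sub-aggregation, e.g. terms within terms
}

interface FlatRow {
  keys: string[] // one entry per nesting level, e.g. ['host-1', 'error']
  count: number
}

// Flatten hierarchical buckets into one row per leaf bucket.
function flattenBuckets(buckets: AggBucket[], parentKeys: string[] = [], out: FlatRow[] = []): FlatRow[] {
  for (let i = 0; i < buckets.length; i++) {
    const bucket = buckets[i]
    const keys = parentKeys.concat(bucket.key)
    if (bucket.sub && bucket.sub.buckets.length > 0) {
      flattenBuckets(bucket.sub.buckets, keys, out)
    } else {
      out.push({ keys, count: bucket.doc_count })
    }
  }
  return out
}

// Example: two hosts, one of them broken down further by log level.
const rows = flattenBuckets([
  { key: 'host-1', doc_count: 12, sub: { buckets: [{ key: 'error', doc_count: 9 }, { key: 'warn', doc_count: 3 }] } },
  { key: 'host-2', doc_count: 4 },
])
// rows: [ { keys: ['host-1', 'error'], count: 9 }, { keys: ['host-1', 'warn'], count: 3 }, { keys: ['host-2'], count: 4 } ]
```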
The implementations for flattenings these often allocated short-lived objects, like object literals, or lambdas () => {}. Frequent use of array-comprehension methods like .map or .reduce are patterns where such object allocation sneaks in easily. Since these flattening-operations all occur in tight recursive loops (thousands of documents, hundreds of buckets) and given that dashboards may contain multiple tables and multiple charts, these allocations can add up quickly. Liberal heap allocation like this also cuts into the user-experience twice: once at construction, but also as a strain on the garbage collector (garbage collection is less predictable, but tends to contribute to hiccups in the frame rate). Our benchmarks showed meaningful improvements (between 5-10%) by removing some of the most glaring allocation in a few of these tight loops. Data transfer improvements The data request roundtrip from a Dashboard running Kibana-browser to and from Elasticsearch was batched. Multiple requests would be collected and the Kibana server would fan these out to Elasticsearch as individual _async_search requests, and combine these ES-JSON responses in a new Kibana-specific format. The main reason for this batching was that it side-steps the browser connection limit for HTTP1. This sits around 6 concurrent http requests, something which is easily exceeded on Dashboards with multiple panels. This approach has two main disadvantages. It adds a small delay to collect the batches. It also puts a strain on the kibana-server to wait and re-encode the ES-responses. Kibana-server would need to unzip them first, decode the JSON, concatenate the responses, and then gzip again. While this re-encoding step is generally small, in the worst case (for example, for large responses), it could be significant. It would also add significant memory pressure, occasionaly causing Out Of Memory issues. Given that the proxies that sit in front of Elastic Cloud and Serverless already support HTTP 2.0, and that Kibana will start to support HTTP 2.0 for stateful in 9.0, it was decided to remove this batching. In addition, kibana-server no longer re-encodes the data and streams the original gzipped result from Elasticsearch. This greatly simplifies the data-transfer architecture and in combination with running over HTTP 2.0, has shown some nice performance improvements. Apart from the performance benefits (faster, less sensitive to OOM), debuggability has much improved due to the simplified architecture and the fact that data-responses can now easily be inspected in the browser’s debugger. Outcomes The aggregate outcome of these changes have been significant, and is reflected both in the benchmarks and the user telemetry. Benchmark evolution The chart below shows the metrics for a mixture of multiple benchmarks, and how they have evolved the last 6 months. We can see an overall drop from around 3500ms to 2000ms (*the most recent uptick is related to a current effort to change the theme in Kibana. During this migration-phase we ship multiple themes. This will be removed over time. There’s also a few gaps when the CI-runs had keeled over). Real world users The real world, as explained in the intro, is harder to measure. We just do not know exactly which dashboards users are running, and how their configurations evolve over time. However, looking at it from two different perspectives, we can verify the same evolution as in the synthetic benchmarking environment.. 
First, over time we see a drop of time to render in the 75 percentile. It allows us to say that – on average – the experience of users on a dashboard in Jan 2025 is significantly better than in June 24. Dashboard render-time for 25, 50, and 75 percentile We can also compare mean time_to_data by version of all users in the last week. Users in 8.17 wait noticeably less time for data to appear on screen than users of 8.12. The drop in real world is also slightly delayed from what we are observing in our benchmarks, which roughly is in line with the cadence the stack is released. Looking ahead So the curve is trending downwards, largely by adding many small shaves. There are some significant areas where this approach of trimming the fat will eventually lead to diminishing returns. Below are some areas we believe will benefit from more structural changes on how we approach the problem. Code loading continued We have not discussed that kibana_loaded metric very much in this blog post. If we would have to characterize it: the Kibana plugin-architecture is optimized for allowing applications to load code ad-hoc, with the code-packaging process producing many JavaScript code bundles. However, in practice we do see unnecessary code being loaded, as well as “waterfall” code-loading where code loading may block rendering of the UX. All in all, things could improve here (see above, “Reducing code and asset loading”). The team is currently engaging in a wider ranging effort, “Sustainable Kibana”, which includes revisiting how we package and deliver code to the browser. We anticipate more benefits to materialize here later. When they do, be sure to check the blog post! Dealing with slow searches Long searches take a long time. It is what it is. An Elasticsearch search can be slow for many reasons, and this doesn’t even need to be a bug. Consider complex aggregations on a large cluster with terabytes of data, over a long time range hitting hundreds of nodes on different storage tiers. Depending on how the cluster is provisioned, this could end up being a slow search. Kibana needs to be resilient in face of this query-dependent limitation. In such a scenario, we cannot improve the “snappiness” of a Dashboard with low-level improvements in Kibana (like those we have discussed above). To address inherent slowness, there needs to be a suite of new features that allow users to opt into a “fast” mode. Consider for example sampling the data, where users can trade speed for accuracy. Or consider improving the perceived performance by incrementally filling in the charts as the data come in. Or allowing users to hide parts of the Dashboards with collapsible panels (they’re coming!). These changes will straddle more the line between product feature and low-level tech improvement. Chart and query dependent sluggishness The current effort has mainly sought improvement into lower level components with broad impact. However, there are instances where dashboards are slow due to very specific chart configurations. E.g. computing the “other” bucket, unique count aggregation over data with high cardinality, … Identifying these charts and queries will allow for more targeted optimizations. Are the defaults correct (e.g. do all charts require another bucket)? Are there more efficient ways to query for the same data? Adding Discover to the program All the paintpoints of Dashboards are also the same painpoints in Discover (pluggability of charts, data-heavy, requirements to be responsive…). 
So we have rolled out this program to guide development in the Discover app. Already, we are seeing some nice gains in Discover, and we’re looking to build on this momentum. This too would deserve its own blog post so stay tuned! Conclusion Dashboards are getting faster in Kibana. Recent improvements are due to the compound effect of many lower level optimizations. To progress even further, we anticipate a more two-pronged approach: First, continue this theme of improved hygiene. Second, expand to a broader program that will allow us to address the “long tail” of causes contributing to slowness. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote Jump to Identifying the challenge a) On one hand, some dashboards take tens of seconds to load (b) On the other hand, the majority of dashboards load in the low-second range. Telemetry a) What to measure Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Fast Kibana Dashboards - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/kibana-dashboard-rendering-time","meta_description":"Kibana Dashboards are getting faster. Explore recent improvements that bring down the render-time of a dashboard."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. 
Integrations Python How To JR By: Jeffrey Rengifo On April 24, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. AutoGen is a Microsoft framework for building applications that can act with human intervention or autonomously. It provides a complete ecosystem with different abstraction levels, depending on how much you need to customize. If you want to read more about agents and how they work, I recommend you read this article. Image Source: https://github.com/microsoft/autogen AgentChat allows you to easily instantiate preset agents on top of the AutoGen core so you can configure the model prompts, tools, etc. On top of AgentChat, you can use extensions that allow you to extend its functionalities. The extensions are both from the official library and community-based. The highest level of abstraction is Magnetic-One , a generalist multi-agent system designed for complex tasks. It comes preconfigured in the paper explaining this approach. AutoGen is known for fostering communication among agents, proposing groundbreaking patterns like: Group chat Multi agent debate Mixture of agents Concurrent agents Handoffs In this article, we will create an agent that uses Elasticsearch as a semantic search tool to collaborate with other agents and look for the perfect match between candidate profiles stored in Elasticsearch and job offers online. We will create a group of agents that share Elasticsearch and online information to try to match the candidates with job offers. We'll use the ´ Group Chat ´ pattern where an admin moderates the conversation and runs tasks while each agent specializes in a task. The complete example is available in this Notebook . Steps Install dependencies and import packages Prepare data Configure Agents Configure tools Run task Install dependencies and import packages Prepare data Setup keys For the agent AI endpoint, we need to provide an OpenAI API key. We will also need a Serper API key to give the agent search capabilities. Serper gives 2,500 search calls for free on sign-up . We use Serper to give our agent internet access capabilities, more specifically finding Google results. The agent can send a search query via API and Serper will return the top Google results. Elasticsearch client Inference endpoint and mappings To enable semantic search capabilities, we need to create an inference endpoint using ELSER . ELSER allows us to run semantic or hybrid queries, so we can give broad tasks to our agents and semantically related documents from Elasticsearch will show up with no need of typing keywords that are present in the documents. Mappings For the mappings, we are going to copy all the relevant text fields into the semantic_text field so we can run semantic or hybrid queries against the data. Ingesting documents to Elasticsearch We are going to load data about the job applicants and ask our agents to find the ideal job for each of them based on their experience and expected salary. Configure agents AI endpoint configuration Let’s configure the AI endpoint based on the environment variables we defined in the first step. Create agents We´ll begin by creating the admin that will moderate the conversation and run the tasks the agents propose. 
Then, we´ll create the agents that will carry out each task: Admin: leads the conversation and executes the other agents’ actions. Researcher : navigates online searching for job offers. Retriever : looks up candidates in Elastic. Matcher : tries to match the offers and the candidates. Critic : evaluates the quality of a match before providing the final answer. Configure tools For this project, we need to create two tools: one to search in Elasticsearch and another to search online. Tools are a Python function that we will register and assign to agents next. Tools methods Assigning tools to agents For the tools to work properly, we need to define a caller that will determine the parameters for the function and an executor that will run said function. We will define the admin as the executor and the respective agent as the caller. Run task We will now define a group chat with all agents, where the administrator assigns turns for each agent to define the task it wants to call and end it once the defined conditions are met based on previous instructions. Reasoning (Formatted for readability) The output will look like this: Next speaker: Matcher Next speaker: Retriever Next speaker: Admin Admin (to chat_manager): Researcher (to chat_manager): Next speaker: Critic Admin has been informed of the successful candidate-job offer matches. Results (Formatted for readability) Note that at the end of each Elasticsearch stored candidate, you can see a match field with the job listing that best fits them! Conclusion AutoGen allows you to create groups of agents that work together to solve a problem with different degrees of complexity. One of the available patterns is 'group chat,' where an admin leads a conversation among agents to reach a successful solution. You can add more features to the project by creating more agents. For example, storing the matches provided back into Elasticsearch, and then automatically applying to the job offers using the WebSurfer agent . The WebSurfer agent can navigate websites using visual models and a headless browser. To index documents in Elasticsearch, you can use a tool similar to elasticsearch_hybrid_search, but with added ingestion logic. Then, create a special agent “ingestor” to achieve indexing. Once you have that, you can implement the WebSurfer agent by following the official documentation . Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Install dependencies and import packages Prepare data Setup keys Elasticsearch client Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using AutoGen with Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/using-autogen-with-elasticsearch","meta_description":"Learn to create an Elasticsearch tool for your agents with AutoGen. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development Microsoft Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. With the release of Semantic Kernel Elasticsearch Vector Store Connector, developers using Semantic Kernel for building AI agents can now plugin Elasticsearch as a scalable enterprise-grade vector store while continuing to use Semantic Kernel abstractions. Generative AI .NET Integrations Vector Database How To FB SM By: Florian Bernd and Srikanth Manvi On December 6, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In collaboration with the Microsoft Semantic Kernel team, we are announcing the availability of Semantic Kernel Elasticsearch Vector Store Connector , for Microsoft Semantic Kernel (.NET) users. Semantic Kernel simplifies building enterprise-grade AI agents, including the capability to enhance large language models (LLMs) with more relevant, data-driven responses from a Vector Store. Semantic Kernel provides a seamless abstraction layer for interacting with Vector Stores like Elasticsearch, offering essential features such as creating, listing, and deleting collections of records and uploading, retrieving, deleting individual records. The out-of-the-box Semantic Kernel Elasticsearch Vector Store Connector supports the Semantic Kernel vector store abstractions which make it very easy for developers to plugin Elasticsearch as a vector store while building AI agents. Elasticsearch has a strong foundation in the open-source community and recently adopted the AGPL license . Combined with the open-source Microsoft Semantic Kernel, these tools offer a powerful, enterprise-ready solution. 
You can get started locally by spinning up Elasticsearch in a few minutes by running this command curl -fsSL https://elastic.co/start-local | sh (refer start-local for details) and move to cloud-hosted or self-hosted versions while productionizing your AI agents. In this blog we look at how to use Semantic Kernel Elasticsearch Vector Store Connector when using Semantic Kernel. A Python version of the connector will be made available in the future. High-level scenario In the following section we go through an example. At a high-level we are building a RAG (Retrieval Augmented Generation) application which takes a user's question as input and returns an answer. We will use Azure OpenAI ( local LLM can be used as well) as the LLM, Elasticsearch as the vector store and Semantic Kernel (.net) as the framework to tie all components together. If you are not familiar with RAG architectures, you can have a quick introduction with this article: https://www.elastic.co/search-labs/blog/retrieval-augmented-generation-rag . The answer is generated by the LLM which is fed with context, relevant to the question, retrieved from Elasticsearch vectorstore. The response also includes the source that was used as the context by the LLM. RAG Example In this specific example, we build an application that allows users to ask questions about hotels stored in an internal hotel database. The user could e.g. search for a specific hotel, based on different criteria, or ask for a list of hotels. For the example database, we generated a list of hotels containing 100 entries. The sample size is intentionally small to allow you to try out the connector demo as easily as possible. In a real-world application, the Elasticsearch connector would show its advantages over other options, such as the `InMemory` vector store implementation, especially when working with extremely large amounts of data. The complete demo application can be found in the Elasticsearch vector store connector repository . Let’s start with adding the required NuGet packages and using directives to our project: We can now create our data model and provide it with Semantic Kernel specific attributes to define the storage model schema and some hints for the text search: The Storage Model Schema attributes (`VectorStore*`) are most relevant for the actual use of the Elasticsearch Vector Store Connector, namely: VectorStoreRecordKey to mark a property on a record class as the key under which the record is stored in a vector store. VectorStoreRecordData to mark a property on a record class as 'data'. VectorStoreRecordVector to mark a property on a record class as a vector. All of these attributes accept various optional parameters that can be used to further customize the storage model. In the case of VectorStoreRecordKey , for example, it is possible to specify a different distance function or a different index type. The text search attributes ( TextSearch* ) will be important in the last step of this example. We will come back to them later. In the next step, we initialize the Semantic Kernel engine and obtain references to the core services. In a real world application, dependency injection should be used instead of directly accessing the service collection. 
The same thing applies to the hardcoded configuration and secrets, which should be read using a configuration provider instead: The vectorStoreCollection service can now be used to create the collection and to ingest a few demo records : This shows how Semantic Kernel reduces the use of a vector store with all its complexity to a few simple method calls. Under the hood, a new index is created in Elasticsearch and all the necessary property mappings are created. Our data set is then mapped completely transparently into the storage model and finally stored in the index. Below is how the mappings look in Elasticsearch. The embeddings.GenerateEmbeddingsAsync() calls transparently called the configured Azure AI Embeddings Generation service. Even more magic can be observed in the last step of this demo. With just a single call to InvokePromptAsync , all of the following operations are performed when the user asks a question about the data: 1. An embedding for the user's question is generated 2. The vector store is searched for relevant entries 3. The results of the query are inserted into a prompt template 4. The actual query in the form of the final prompt is sent to the AI chat completion service Remember the TextSearch* attributes, we previously defined on our data model? These attributes enable us to use corresponding placeholders in our prompt template which are automatically populated with the information from our entries in the vector store. The final response to our question \"Please show me all hotels that have a rooftop bar.\" is as follows: The answer correctly refers to the following entry in our hotels.csv This example shows very well how the use of Microsoft Semantic Kernel achieves a significant reduction in complexity through its well thought abstractions, as well as enabling a very high level of flexibility. By changing a single line of code, for example, the vector store or the AI services used can be replaced without having to refactor any other part of the code. At the same time, the framework provides an enormous set of high-level functionality, such as the `InvokePrompt` function, or the template or search plugin system. The complete demo application can be found in the Elasticsearch vector store connector repository. What else is possible with ES Elasticsearch new semantic_text mapping: Simplifying semantic search Semantic reranking in Elasticsearch with retrievers Advanced RAG techniques part 1: Data processing Advanced RAG techniques part 2: Querying and testing Building RAG with Llama 3 open-source and Elastic A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch What's next? We showed how the Elasticsearch vector store can be easily plugged into Semantic Kernel while building GenAI applications in .NET. Stay tuned for a Python integration next. As Semantic Kernel builds abstractions for advanced search features like hybrid search , the Elasticsearch connect will enable .NET developers to easily implement them while using Semantic Kernel. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to High-level scenario RAG Example What else is possible with ES What's next? Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-connector-microsoft-semantic-kernel","meta_description":" Microsoft Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. With the release of Semantic Kernel Elasticsearch Vector Store Connector, developers using Semantic Kernel for building AI agents can now plugin Elasticsearch as a scalable enterprise-grade vector store while continuing to use Semantic Kernel abstractions."} +{"text":"Tutorials Examples Integrations Blogs Start free trial ES|QL Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau ES|QL December 31, 2024 Improving the ES|QL editor experience in Kibana With the new ES|QL language becoming GA, a new editor experience has been developed in Kibana to help users write faster and better queries. Features like live validation, improved autocomplete and quick fixes will streamline the ES|QL experience. 
ML By: Marco Liberati ES|QL Ruby +1 October 24, 2024 How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. FB By: Fernando Briano ES|QL Python +1 September 5, 2024 From ES|QL to native Pandas dataframes in Python Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples. QP By: Quentin Pradet ES|QL Python August 20, 2024 An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. BA By: Baha Azarmi ES|QL How To June 5, 2024 Elasticsearch piped query language, ES|QL, now generally available Elasticsearch Query Language (ES|QL) is now GA. Explore ES|QL's capabilities, learn about ES|QL in Kibana and discover future advancements. CL GK By: Costin Leau and George Kobar ES|QL Javascript +1 June 3, 2024 ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. JM By: Josh Mock ES|QL Java +1 May 2, 2024 ES|QL queries to Java objects Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples. LT By: Laura Trotta 1 2 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ES|QL - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/esql","meta_description":"ES|QL articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Filtering in ES|QL using full text search 8.17 included match and qstr functions in ES|QL, that can be used to perform full text filtering. 8.18 removed limitations on their usage. This article describes what they do, how they can be used, the difference with the existing text filtering methods, current limitations and future improvements. Search Analytics CD By: Carlos Delgado On January 10, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. ES|QL now includes full text functions that can be used to filter your data using text queries. We will review the available text filtering methods and understand why these functions provide a better alternative. We will also look at the future improvements for full text functions in ES|QL. Filtering text with ES|QL Text data in logs is critical for understanding, monitoring, and troubleshooting systems and applications. 
The unstructured nature of text allows for flexibility in capturing all sorts of information. Being unstructured, we need ways of isolating specific patterns, keywords, or phrases. Be it searching for an error message, narrowing down results using tags, or looking for a specific host name, are things that we do all the time to refine our results and eventually obtain the information we're looking for. ES|QL provides different methods to help you work with text. Elasticsearch 8.17 adds the full text functions match and qstr in tech preview to help tackle more complex search use cases. Limitations of text filtering ES|QL already provided text filtering capabilities, including: Text equality, to compare full strings directly using the equality operator . String start and end, using the STARTS_WITH and ENDS_WITH functions. Pattern and regex matching with the LIKE and RLIKE operators. Text filtering is useful - but it can fall short on text oriented use cases: Multivalued fields Using ES|QL functions with multivalued fields can be tricky - functions return null when applied to a multivalued field. If you need to apply a function to a multivalued field, you first need to transform the value to a single value using MV_CONCAT so you can match on a single value: Analyzed text Analyzers are incredibly useful for full text search as they allow transforming text. They allow us to extract and modify the indexed text, and modify the queries so we can maximize the possibility of finding what we're looking for. Text is not analyzed when using text filtering. This means for example that you need to match the text case when searching, or create regexes / patterns that address possible case differences. This can become more problematic when looking for multilingual text (so you can't use ASCII folding ), trying to match on parts of paths ( path hierarchy ), or removing stopwords . Performance Pattern matching and regexes take time. Lucene can do a lot of the heavy lifting by creating finite automata to match using the indexed terms dictionary, but nonetheless it's a computationally intensive process. As you can see in our 8.17 release blog , using regular expressions can be up to 50-1000x slower than using full text functions for text filtering, depending on your data set. Enter full text functions Elasticsearch 8.17 and Serverless introduced two new functions as tech preview for text matching: MATCH and query string (abbreviated QSTR ). These functions address some of the limitations that existed for text filtering: They can be used directly on multivalued fields . They will return results when any of the values in a multivalued field matches the query. They use analyzers for text fields. The query will be analyzed using any existing analyzers for the target fields, which will allow matching regardless of case. This also unlocks ASCII folding, removing stopwords, and even using synonyms . They are performant . Instead of relying on pattern matching or regular expressions, they can directly use Lucene index structures to locate specific terms in your data. MATCH function MATCH allows matching a value on a specific field: Match function uses a match query under the hood. This means that it will create a boolean query when multiple terms are used, with OR as the default operator for combining them. Match function currently has some limitations: It does not provide a way to specify parameters . It will use the defaults for the match query. It can only be used in WHERE clauses. 
It can't be used after a STATS or LIMIT command The following limitations exist in 8.17 version: Only text or keyword fields can be used with MATCH . MATCH can be combined with other conditions as part of an AND expression, but not as part of an OR expression. WHERE match(message, \"connection lost\") AND length(message) > 10 can be used, but not WHERE match(message, \"connection lost\") OR length(message) > 10 . These limitations have been removed in 8.18 version and in Elastic Cloud Serverless , which is continuously up to date with our new work. Match operator The match operator (:) is equivalent to the match function above, but it offers a more succinct syntax: It is more convenient to use the match operator, but you can use whichever makes more sense to you. Match operator has the same limitations as the match function. Query string function Query string function ( QSTR ) uses the query string syntax to perform complex queries on one or several fields: Query string syntax allows to specify powerful full text options and operations, including fuzzy search , proximity searches and the use of boolean operators . Refer to the docs for more details. Query string is a very powerful tool, but currently has some limitations, very similar to the MATCH function: It does not provide a way to specify parameters like the match type or specifying the default fields to search for. It can only be used in WHERE clauses. It can't be used after STATS or LIMIT commands It can't be used after commands that modify columns, like SHOW, ROW, DISSECT, DROP, ENRICH, EVAL, GROK, KEEP, MV_EXPAND, or RENAME Similar to the MATCH function, we have a limitation for the OR conditions in version 8.17, which has been removed in 8.18 and Elastic Cloud Serverless . What's next What's coming for full text search? Quite a few things have been introduced in 8.18: Adding tuning options for the behaviour of MATCH and QSTR functions An additional KQL function that can be used to port your existing Kibana queries to ES|QL We also added scoring , so you can start using ES|QL for relevance matching and not just for filtering. This is quite exciting as this will define how the future of text search will be like in Elasticsearch! Check out ES|QL - Introducing scoring and semantic search for an overview of changes in 8.18 for text search. Give it a try MATCH and QSTR are available as tech preview on Elasticsearch 8.17, and of course they are always up to date in Serverless. What are you looking for in terms of text filtering? Let us know your feedback! Happy full text filtering! Report an issue Related content Search Analytics How To June 10, 2024 Storage wins for time-series data in Elasticsearch Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency. MG KK By: Martijn Van Groningen and Kostas Krikellas Jump to Filtering text with ES|QL Limitations of text filtering Multivalued fields Analyzed text Performance Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
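As a concrete illustration of the MATCH function, the match operator (:) and QSTR described above, here is a minimal sketch using the Python Elasticsearch client's ES|QL endpoint; the `logs` index and `message` field are assumptions for illustration, not part of the article:

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # adjust to your deployment

# MATCH function: analyzed full-text filtering inside a WHERE clause
client.esql.query(query='FROM logs | WHERE match(message, "connection lost") | LIMIT 10')

# The match operator (:) is a more succinct equivalent
client.esql.query(query='FROM logs | WHERE message : "connection lost" | LIMIT 10')

# QSTR: query string syntax, including boolean operators
resp = client.esql.query(query='FROM logs | WHERE qstr("message: (connection AND lost)") | LIMIT 10')
print(resp["columns"], resp["values"])
```

Each call returns tabular `columns` and `values`, which is usually all that is needed for quick filtering experiments.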
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Filtering in ES|QL using full text search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/filtering-in-esql-full-text-search-match-qstr","meta_description":"Learn how to use the MATCH and QSTR functions for efficient full-text filtering in ES|QL. Explore what they do and how they differ from existing text filtering methods."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. Generative AI How To NS By: Neha Saini On April 25, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The LangGraph Retrieval Agent Template is a starter project developed by LangChain to facilitate the creation of retrieval-based question-answering systems using LangGraph in LangGraph Studio. This template is pre-configured to integrate seamlessly with Elasticsearch, enabling developers to rapidly build agents that can index and retrieve documents efficiently. This blog focuses on running and customizing the LangChain Retrieval Agent Template using LangGraph Studio and LangGraph CLI. The template provides a framework for building retrieval-augmented generation (RAG) applications, leveraging various retrieval backends like Elasticsearch. We will walk you through setting up, configuring the environment, and executing the template efficiently with Elastic while customizing the agent flow. Prerequisites Before proceeding, ensure you have the following installed: Elasticsearch Cloud deployment or on-prem Elasticsearch deployment (or create a 14-day Free Trial on Elastic Cloud) - Version 8.0.0 or higher Python 3.9+ Access to an LLM provider such as Cohere (used in this guide), OpenAI , or Anthropic/Claude Creating the LangGraph app 1. Install the LangGraph CLI 2. Create LangGraph app from retrieval-agent-template You will be presented with an interactive menu that will allow you to choose from a list of available templates. Select 4 for Retrieval Agent and 1 for Python, as shown below: Troubleshooting : If you encounter the error “urllib.error.URLError: “ Please run the Install Certificate Command of Python to resolve the issue, as shown below. 3. Install dependencies In the root of your new LangGraph app, create a virtual environment and install the dependencies in edit mode so your local changes are used by the server: Setting up the environment 1. Create a .env file The .env file holds API keys and configurations so the app can connect to your chosen LLM and retrieval provider. Generate a new .env file by duplicating the example configuration: 2. Configure the . 
env file The .env file comes with a set of default configurations. You can update it by adding the necessary API keys and values based on your setup. Any keys that aren't relevant to your use case can be left unchanged or removed. Example .env file (using Elastic Cloud and Cohere) Below is a sample .env configuration for using Elastic Cloud as the retrieval provider and Cohere as the LLM, as demonstrated in this blog: Note: While this guide uses Cohere for both response generation and embeddings, you’re free to use other LLM providers such as OpenAI , Claude , or even a local LLM model depending on your use case. Make sure that each key you intend to use is present and correctly set in the .env file. 3. Update configuration file - configuration.py After setting up your .env file with the appropriate API keys, the next step is to update your application’s default model configuration. Updating the configuration ensures the system uses the services and models you’ve specified in your .env file. Navigate to the configuration file: The configuration.py file contains the default model settings used by the retrieval agent for three main tasks: Embedding model – converts documents into vector representations Query model – processes the user’s query into a vector Response model – generates the final response By default, the code uses models from OpenAI (e.g., openai/text-embedding-3-small ) and Anthropic (e.g., anthropic/claude-3-5-sonnet-20240620 and anthropic/claude-3-haiku-20240307 ). In this blog, we're switching to using Cohere models. If you're already using OpenAI or Anthropic, no changes are needed. Example changes (using Cohere): Open configuration.py and modify the model defaults as shown below: Running the Retrieval Agent with LangGraph CLI 1. Launch LangGraph server This will start up the LangGraph API server locally. If this runs successfully, you should see something like: Open Studio UI URL. There are two graphs available: Retrieval Graph : Retrieves data from Elasticsearch and responds to Query using an LLM. Indexer Graph : Indexes documents into Elasticsearch and generates embeddings using an LLM. 2. Configuring the Indexer Graph Open the Indexer Graph. Click Manage Assistants. Click on 'Add New Assistant ', enter the user details as specified, and then close the window. 3. Indexing sample documents Index the following sample documents, which represent a hypothetical quarterly report for the organization NoveTech: Once the documents are indexed, you will see a delete message in the thread, as shown below. 4. Running the Retrieval Graph Switch to the Retrieval Graph. Enter the following search query: The system will return relevant documents and provide an exact answer based on the indexed data. Customize the Retrieval Agent To enhance the user experience, we introduce a customization step in the Retrieval Graph to predict the next three questions a user might ask. This prediction is based on: Context from the retrieved documents Previous user interactions Last user query The following code changes are required to implement Query Prediction feature: 1. Update graph.py Add predict_query function: Modify respond function to return response Object , instead of message: Update graph structure to add new node and edge for predict_query: 2. Update prompts.py Craft prompt for Query Prediction in prompts.py : 3. Update configuration.py Add predict_next_question_prompt : 4. Update state.py Add the following attributes: 5. 
Re-run the Retrieval Graph Enter the following search query again: The system will process the input and predict three related questions that users might ask, as shown below. Conclusion Integrating the Retrieval Agent template within LangGraph Studio and CLI provides several key benefits: Accelerated development : The template and visualization tools streamline the creation and debugging of retrieval workflows, reducing development time. Seamless deployment : Built-in support for APIs and auto-scaling ensures smooth deployment across environments. Easy updates: Modifying workflows, adding new functionalities, and integrating additional nodes is simple, making it easier to scale and enhance the retrieval process. Persistent memory : The system retains agent states and knowledge, improving consistency and reliability. Flexible workflow modeling : Developers can customize retrieval logic and communication rules for specific use cases. Real-time interaction and debugging : The ability to interact with running agents allows for efficient testing and issue resolution. By leveraging these features, organizations can build powerful, efficient, and scalable retrieval systems that enhance data accessibility and user experience. The full source code for this project is available on GitHub . Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Creating the LangGraph app 1. Install the LangGraph CLI 2. Create LangGraph app from retrieval-agent-template 3. Install dependencies Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Build a powerful RAG workflow using LangGraph and Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/build-rag-workflow-langgraph-elasticsearch","meta_description":"In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Python Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Python Javascript +2 October 30, 2024 Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. MG By: Miguel Grinberg Generative AI Python October 23, 2024 GitHub Assistant: Interact with your GitHub repository using RAG and Elasticsearch This blog introduces a GitHub Assistant using RAG with Elasticsearch to enable semantic code queries, providing insights into GitHub repositories, which can be extended to PRs feedback, issues handling, and production readiness reviews. FS By: Fram Souza Vector Database .NET +2 October 9, 2024 Building a search app with Blazor and Elasticsearch Learn how to build a search application using Blazor and Elasticsearch, and how to use the Elasticsearch .NET client for hybrid search. GL By: Gustavo Llermaly Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Integrations Python +1 September 27, 2024 Elasticsearch open inference API for Google AI Studio Elasticsearch open inference API adds support for Google AI Studio JV By: Jeff Vestal ML Research Python September 19, 2024 Evaluating search relevance part 2 - Phi-3 as relevance judge Using the Phi-3 language model as a search relevance judge, with tips & techniques to improve the agreement with human-generated annotation. TP TV By: Thanos Papaoikonomou and Thomas Veasey 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Python - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/python-programming","meta_description":"Python articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Building a Multimodal RAG system with Elasticsearch: The story of Gotham City Learn how to build a Multimodal Retrieval-Augmented Generation (RAG) system that integrates text, audio, video, and image data to provide richer, contextualized information retrieval. Generative AI How To AS By: Alex Salgado On March 11, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, you'll learn how to build a Multimodal RAG (Retrieval-Augmented Generation) pipeline using Elasticsearch. We'll explore how to leverage ImageBind to generate embeddings for various data types, including text, images, audio, and depth maps. You'll also discover how to efficiently store and retrieve these embeddings in Elasticsearch using dense_vector and k-NN search. Finally, we'll integrate a large language model (LLM) to analyze retrieved evidence and generate a comprehensive final report. How does the pipeline work? Collecting clues → Images, audio, texts, and depth maps from the crime scene in Gotham. Generating embeddings → Each file is converted into a vector using the ImageBind multimodal model. Indexing in Elasticsearch → The vectors are stored for efficient retrieval. Searching by similarity → Given a new clue, the most similar vectors are retrieved. The LLM analyzes the evidence → A GPT-4 model synthesizes the response and identifies the suspect! Technologies used ImageBind → Generates unified embeddings for various modalities. Elasticsearch → Enables fast and efficient vector search. LLM (GPT-4, OpenAI) → Analyzes the evidence and generates a final report. Who is this blog for? Elastic users interested in multimodal vector search. Developers looking to understand Multimodal RAG in practice. Anyone searching for scalable solutions for analyzing data from multiple sources. Prerequisites: Setting up the environment To solve the crime in Gotham City, you need to set up your technology environment. Follow this step-by-step guide: 1. Technical requirements Component Specification Sistem OS Linux, macOS, or Windows Python 3.10 or later RAM Minimum 8GB (16GB recommended) GPU Optional but recommended for ImageBind 2. Setting up the project All investigation materials are available on GitHub, and we'll be using Jupyter Notebook (Google Colab) for this interactive crime-solving experience. Follow these steps to get started: Setting up with Jupyter Notebook (Google Colab) 1. Access the notebook Open our ready-to-use Google Colab notebook: Multimodal RAG with Elasticsearch . This notebook contains all the code and explanations you need to follow along. 2. 
Clone the repository 3. Install dependencies 4. Configure credentials Note: The ImageBind model (~2GB) will be downloaded automatically on the first run. Now that everything is set up, let's dive into the details and solve the crime! Introduction: The crime in Gotham City On a rainy night in Gotham City, a shocking crime shakes the city. Commissioner Gordon needs your help to unravel the mystery. Clues are scattered across different formats: blurred images, mysterious audio, encrypted texts, and even depth maps. Are you ready to use the most advanced AI technology to solve the case? In this blog, you will be guided step by step through building a Multimodal RAG (Retrieval-Augmented Generation) system that unifies different types of data ( images, audio, texts, and depth maps ) into a single search space. We will use ImageBind to generate multimodal embeddings, Elasticsearch to store and retrieve these embeddings, and a Large Language Model (LLM) to analyze the evidence and generate a final report. Fundamentals: Multimodal RAG architecture What is a Multimodal RAG? The rise of Retrieval-Augmented Generation (RAG) Multimodal is revolutionizing the way we interact with AI models. Traditionally, RAG systems work exclusively with text, retrieving relevant information from databases before generating responses. However, the world is not limited to text— images, videos, and audio also carry valuable knowledge . This is why multimodal architectures are gaining prominence, allowing AI systems to combine information from different formats for richer and more precise responses . Three main approaches for Multimodal RAG To implement a Multimodal RAG, three strategies are commonly used. Each approach has its own advantages and limitations, depending on the use case: 1. Shared vector space Data from different modalities are mapped into a common vector space using multimodal models like ImageBind. This allows text queries to retrieve images, videos, and audio without explicit format conversion. Advantages: Enables cross-modal retrieval without requiring explicit format conversion. Provides a fluid integration between different modalities, allowing direct retrieval across text, image, audio, and video. Scalable for diverse data types, making it useful for large-scale retrieval applications . Disadvantages: Training requires large multimodal datasets , which may not always be available. The shared embedding space may introduce semantic drift , where relationships between modalities are not perfectly preserved. Bias in multimodal models can impact retrieval accuracy, depending on the dataset distribution. 2. Single grounded modality All modalities are converted to a single format , usually text , before retrieval. For example, images are described through automatically generated captions , and audio is transcribed into text. Advantages: Simplifies retrieval , as everything is converted into a uniform text representation . Works well with existing text-based search engines , eliminating the need for specialized multimodal infrastructure. Can improve interpretability since retrieved results are in a human-readable format. Disadvantages: Loss of information : Certain details (e.g., spatial relationships in images, tone in audio) may not be fully captured in text descriptions. Dependent on captioning/transcription quality : Errors in automatic annotations can reduce retrieval effectiveness. Not optimal for purely visual or auditory queries since the conversion process might remove essential context. 3. 
Separate retrieval Maintains distinct models for each modality. The system performs separate searches for each data type and later merges the results . Advantages: Allows custom optimization per modality , improving retrieval accuracy for each type of data. Less reliance on complex multimodal models , making it easier to integrate existing retrieval systems. Provides fine-grained control over ranking and re-ranking as results from different modalities can be combined dynamically. Disadvantages: Requires fusion of results , making the retrieval and ranking process more complex. May generate inconsistent responses if different modalities return conflicting information. Higher computational cost since independent searches are performed for each modality, increasing processing time. Our choice: Shared vector space with ImageBind Among these approaches, we chose shared vector space , a strategy that aligns perfectly with the need for efficient multimodal searches . Our implementation is based on ImageBind , a model capable of representing multiple modalities ( text, image, audio, and video ) in a common vector space . This allows us to: Perform cross-modal searches between different media formats without needing to convert everything to text. Use highly expressive embeddings to capture relationships between different modalities. Ensure scalability and efficiency , storing optimized embeddings for fast retrieval in Elasticsearch. By adopting this approach, we built a robust multimodal search pipeline , where a text query can directly retrieve images or audio without additional pre-processing. This method expands practical applications from intelligent search in large repositories to advanced multimodal recommendation systems . The following figure illustrates the data flow within the Multimodal RAG pipeline, highlighting the indexing, retrieval, and response generation process based on multimodal data: How does the embedding space work? Traditionally, text embeddings come from language models (e.g., BERT, GPT). Now, with native multimodal models like Meta AI’s ImageBind , we have a backbone that generates vectors for multiple modalities: Text : Sentences and paragraphs are transformed into vectors of the same dimension. Images (vision) : Pixels are mapped into the same dimensional space used for text. Audio : Sound signals are converted into embeddings comparable to images and text. Depth Maps : Depth data is processed and also results in vectors. Thus, any clue ( text, image, audio, depth ) can be compared to any other using vector similarity metrics like cosine similarity . If a laughing audio sample and an image of a suspect's face are “close” in this space, we can infer some correlation (e.g., the same identity). Stage 1 - Collecting crime scene clues Before analyzing the evidence, we need to collect it. The crime in Gotham left traces that may be hidden in images, audio, texts, and even depth data. Let's organize these clues to feed into our system. What do we have? Commissioner Gordon sent us the following files containing evidence collected from the crime scene in four different modalities: Track description and modality a) Images (2 photos) crime_scene1.jpg, crime_scene2.jpg → Photos taken from the crime scene. Shows suspicious traces on the ground. suspect_spotted.jpg → Security camera image showing a silhouette running away from the scene. b) Audio (1 recording) joker_laugh.wav → A microphone near the crime scene captured a sinister laugh. 
c) Text (1 message) Riddle.txt, note2.txt → Some mysterious notes were found at the location, possibly left by the criminal. d) Depth (1 depth map) depth_suspect.png → A security camera with a depth sensor captured a suspect in a nearby alley. jdancing-depth.png → A security camera with a depth sensor captured a suspect going down the subway station. These pieces of evidence are in different formats and cannot be analyzed directly in the same way. We need to transform them into embeddings—numerical vectors that will allow cross-modal comparison. File organization Before starting processing, we need to ensure that all clues are properly organized in the data/ directory so the pipeline runs smoothly. Expected directory structure: Code to verify clue organization Before proceeding, let's ensure that all required files are in the correct location. Running the file Expected output (if all files are correct): Expected output (if any file is missing): This script helps prevent errors before we start generating embeddings and indexing them into Elasticsearch. Stage 2 - Organizing the evidence Generating embeddings with ImageBind To unify the clues, we need to transform them into embeddings—vector representations that capture the meaning of each modality. We will use ImageBind , a model by Meta AI that generates embeddings for different data types ( images, audio, text, and depth maps ) within a shared vector space. How does ImageBind work? To compare different types of evidence ( images, audio, text, and depth maps ), we need to transform them into numerical vectors using ImageBind . This model allows any type of input to be converted into the same embedding format, enabling cross-modal searches between modalities. Below is an optimized code ( src/embedding_generator.py ) to generate embeddings for any type of input using the appropriate processors for each modality: A tensor is a fundamental data structure in machine learning and deep learning, especially when working with models like ImageBind. In our context: Here, the tensor represents the input data (image, audio, or text) converted into a mathematical format that the model can process. Specifically: For images : The tensor represents the image as a multidimensional matrix of numerical values (pixels organized by height, width, and color channels). For audio : The tensor represents sound waves as a sequence of amplitudes over time. For text : The tensor represents words or tokens as numerical vectors. Testing embedding generation: Let's test our embedding generation with the following code. Save it in 02-stage/test_embedding_generation.py and execute it with this command: Expected output: Now, the image has been transformed into a 1024-dimensional vector . Stage 3 - Storage and search in Elasticsearch Now that we have generated the embeddings for the evidence, we need to store them in a vector database to enable efficient searches. For this, we will use Elasticsearch , which supports dense vectors ( dense_vector ) and allows similarity searches. This step consists of two main processes: Indexing the embeddings → Stores the generated vectors in Elasticsearch. Similarity search → Retrieves the most similar records to a new piece of evidence. Indexing the evidence in Elasticsearch Each piece of evidence processed by ImageBind (image, audio, text, or depth) is converted into a 1024-dimensional vector . We need to store these vectors in Elasticsearch to enable future searches. 
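Before looking at the repository code, here is a rough sketch of the kind of mapping such an index needs. The index and field names mirror the `multimodal_content` structure described later in this article, but treat the snippet as an illustration rather than the project's exact implementation; `src/elastic_manager.py` remains the authoritative version:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust to your deployment

# 1024 dims matches the ImageBind embedding size; cosine similarity is used for kNN retrieval.
es.indices.create(
    index="multimodal_content",
    mappings={
        "properties": {
            "embedding": {"type": "dense_vector", "dims": 1024, "index": True, "similarity": "cosine"},
            "modality": {"type": "keyword"},      # vision, audio, text, depth
            "description": {"type": "text"},
            "content_path": {"type": "keyword"},  # path to the original evidence file
            "metadata": {"type": "object"},
        }
    },
)
```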
The following code ( src/elastic_manager.py ) creates an index in Elasticsearch and configures the mapping to store the embeddings. Running the indexing Now, let's index a piece of evidence to test the process. Expected output in Elasticsearch (summary of the indexed document): To index all multimodal evidence, please execute the following Python command: Now, the evidence is stored in Elasticsearch and is ready to be retrieved when needed. Verifying the indexing process After running the indexing script, let's verify if all our evidence was correctly stored in Elasticsearch. You can use Kibana's Dev Tools to run some verification queries: 1. First, check if the index was created: 2. Then, verify the document count per modality: 3. Finally, examine the indexed document structure: Expected results: An index named `multimodal_content` should exist. Around 7 documents distributed across different modalities (vision, audio, text, depth). Each document should contain: embedding, modality, description, metadata, and content_path fields. This verification step ensures that our evidence database is properly set up before we proceed with the similarity searches. Searching for similar evidence in Elasticsearch Now that the evidence has been indexed, we can perform searches to find the most similar records to a new clue. This search uses vector similarity to return the closest records in the embedding space . The following code performs this search. Testing the search - Using audio as a query for multimodal results Now, let's test the search for evidence using a suspicious audio file . We need to generate an embedding for the file in the same way and search for similar embeddings: Expected output in the terminal: Now, we can analyze the retrieved evidence and determine its relevance to the case. Beyond audio - Exploring multimodal searches Reversing the roles: Any modality can be a \"question\" In our Multimodal RAG system, every modality is a potential search query . Let's go beyond the audio example and explore how other data types can initiate investigations . 1. Searching by text (deciphering the criminal’s note) Scenario: You found an encrypted text message and want to find related evidence. Expected results: 2. Image search (tracking the suspicious crime scene) Scenario: A new crime scene ( crime_scene2.jpg ) needs to be compared with other evidence. Output: 3. Depth map search (3D pursuit) Scenario: A depth map ( jdancing-depth.png ) reveals image escape patterns . Output Why does this matter? Each modality reveals unique connections : Text → Linguistic patterns of the suspect. Images → Recognition of locations and objects. Depth → 3D scene reconstruction. Now, we have a structured evidence database in Elasticsearch , enabling us to store and retrieve multimodal evidence efficiently . Summary of what we've done: Stored multimodal embeddings in Elasticsearch. Performed similarity searches , finding evidence related to new clues. Tested the search using a suspicious audio file , ensuring the system works correctly. Next step: We will use an LLM (Large Language Model) to analyze the retrieved evidence and generate a final report . Stage 4 - Connecting the dots with the LLM Now that the evidence has been indexed in Elasticsearch and can be retrieved by similarity, we need a LLM (Large Language Model) to analyze it and generate a final report to send to Commissioner Gordon. 
The LLM will be responsible for identifying patterns, connecting clues, and suggesting a possible suspect based on the retrieved evidence. For this task, we will use GPT-4 Turbo , formulating a detailed prompt so that the model can interpret the results efficiently. LLM integration To integrate the LLM into our system, we created the LLMAnalyzer class ( src/llm_analyzer.py ), which receives the retrieved evidence from Elasticsearch and generates a forensic report using this evidence as the prompt context. Temperature setting in LLM analysis: For our forensic analysis system, we use a moderate temperature of 0.5. This balanced setting was chosen because: It represents a middle ground between deterministic (too rigid) and highly random outputs; At 0.5, the model maintains enough structure to provide logical and justifiable forensic conclusions; This setting allows the model to identify patterns and make connections while staying within reasonable forensic analysis parameters; It balances the need for consistent, reliable outputs with the ability to generate insightful analysis. This moderate temperature setting helps ensure our forensic analysis is both reliable and insightful, avoiding both overly rigid and overly speculative conclusions. Running the evidence analysis Now that we have the LLM integration , we need a script that connects all system components. This script will: Search for similar evidence in Elasticsearch. Analyze the retrieved evidence using the LLM to generate a final report. Code: Evidence analysis script Expected LLM output Conclusion: Case solved With all the clues gathered and analyzed , the Multimodal RAG system has identified a suspect: The Joker . By combining images, audio, text, and depth maps into a shared vector space using ImageBind , the system was able to detect connections that would have been impossible to identify manually. Elasticsearch ensured fast and efficient searches , while the LLM synthesized the evidence into a clear and conclusive report . However, the true power of this system goes beyond Gotham City . The Multimodal RAG architecture opens doors to numerous real-world applications : Urban surveillance: Identifying suspects based on images, audio, and sensor data . Forensic analysis: Correlating evidence from multiple sources to solve complex crimes . Multimedia recommendation: Creating recommendation systems that understand multimodal contexts (e.g., suggesting music based on images or text). Social media trends: Detecting trending topics across different data formats. Now that you’ve learned how to build a Multimodal RAG system , why not test it with your own clues ? Share your discoveries with us and help the community advance in the field of multimodal AI ! Special thanks I would like to thank Adrian Cole for his valuable contribution and review during the process of defining the deployment architecture of this code. References Build a multimodal image retrieval system using KNN search and CLIP embeddings k-Nearest Neighbor (kNN) Search PyTorch Official Documentation on Tensors ImageBind: a new way to ‘link’ AI across the senses Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to How does the pipeline work? Technologies used Who is this blog for? Prerequisites: Setting up the environment 1. Technical requirements Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Building a Multimodal RAG system with Elasticsearch: The story of Gotham City - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/building-multimodal-rag-system","meta_description":"Learn how to build a Multimodal RAG system that integrates text, audio, video, and image data to provide richer, contextualized information retrieval."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. Vector Database Search Relevance Generative AI SM By: Sunile Manjee On March 12, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Imagine searching for “recently renovated accommodations 250m from Belongil Beach with at least 4 stars and a pool and gym” and having your search engine return exactly what you needed. Intelligent search understands intent and reasons with your queries. Reasoning and intention discovery require sophistication beyond a heuristics-only approach. This is where large language model functions and Elasticsearch search templates come together to deliver a truly intelligent search experience. Try it yourself If you're not a believer, no worries. 
This entire end-to-end example is available within a Python notebook here . The notebook includes data, index mapping, inferencing endpoints, search templates, and LLM functions. You'll need an Elastic Cloud instance, Azure OpenAI instance, and Google Maps API key. The Problem: Capturing complex constraints Search approaches such as keyword or even vector-based methods are challenged when users pack multiple nuances into a single request. Consider a hotel finder scenario. A person wants: A hotel near Belongil Beach, within 250 meters. A minimum rating of 4 stars. A recently renovated property. Specific amenities like a pool and gym. Trying to encode all of these requirements using naive keyword matches or basic similarity scores might return incomplete or irrelevant results, reducing user trust and experience. Schema-Based Searches with LLMs Elasticsearch is built on an index schema architecture. When you index data, you define fields like “rating,” “geopoint,” or “amenities,” making it much easier to filter and rank results accurately. However, this structure demands that queries be equally structured. That’s where LLMs (like GPT-based or Generation models) become the linchpin. An LLM can interpret a user’s natural language query, extract key attributes (“distance = 250m,” “rating >= 4,” “amenities = pool, gym,” “near Belongil Beach,” etc.), and even call a geocoder service when a geo component has been detected. It then outputs a JSON payload ready to be slotted into an Elasticsearch search template —a parameterized query that cleanly separates the query logic from the dynamic values. With this approach, we capitalize on both the semantic understanding of LLMs and the schema-based filtering and faceting of Elasticsearch. Example in action Suppose your user’s query is: “recently renovated accommodations 250m from Belongil Beach with at least 4 stars and with a pool and gym.” LLM processing: The LLM breaks down the text, recognizes that it needs a distance-based filter (250m), a minimum rating (4 stars), relevant amenities (pool, gym), and even a contextual hint about being “recently renovated.” It also calls a geocoding service for “Belongil Beach,” returning precise latitude and longitude coordinates. Search template: You create an Elasticsearch search template that expects parameters like rating , distance , latitude , longitude , and possibly a text query for any free-form conditions. Once the LLM provides these parameters, your application simply fills in the placeholders and calls Elasticsearch. Here, not only is filtering leveraged, but also hybrid queries using vectors, ELSER, and lexical search. Results: The response precisely matches accommodations within 250 meters of Belongil Beach, has at least 4 stars, is flagged as recently renovated, and includes a pool and gym. An example result might be: Hotel name : Belongil Beach Apartment Rating : 4 stars City : Byron Bay, New South Wales Country : Australia Rather than depending solely on a vector space or hybrid search, you can input precise filters, which will make the recall more comprehensive and the precision more accurate. Why this approach works Precision and recall : By structuring the query according to the index schema, you remove ambiguity, ensuring you don’t miss valid results (high recall) and keep out irrelevant ones (high precision). This is often observed when relying solely on a vector space, which doesn’t naturally offer distillation features. Scalability : Elasticsearch is designed for massive data volumes. 
Once the parameters are extracted, the query itself remains blazing fast, even on huge indexes. Flexibility: If new attributes appear (e.g., “EV charging station”), the LLM functions should capture the attribute as a hotel amenity and inject it into the Elasticsearch search template. Resilience to complexity : No matter how complex a user’s query, the LLM’s semantic parsing ensures every relevant detail is captured: distance constraints, star ratings, location-based conditions, and more. Why search templates Elasticsearch templates enable the creation of parameterized queries, separating the query logic from dynamic values. This is particularly valuable when building dynamic queries based on user input or other variable data. For example, consider the hotel index with fields Description Attractions Rating Facilities Location Latitude Longitude Users might include any combination of these attributes in their search query. As the number of fields increases, manually constructing a query for every potential input combination becomes impractical. Search templates provide a solution by dynamically generating the appropriate query based on the user's input. If the user specifies a rating and attractions, the corresponding query is generated. Similarly, if the user provides a location and rating, the search template generates a query that reflects those inputs. Search templates are defined using a JSON format that includes placeholders for dynamic values. When you execute a search template, Elasticsearch replaces the placeholders with the actual values and then executes the query. Search templates can be used to perform a variety of tasks, such as: Filtering results based on user input Boosting the relevance of certain results Adding custom scoring functions Aggregating results Below is an example of the search template that dynamically creates a query based on input parameters. LLM functions Large Language Model functions utilize strong reasoning capabilities to determine the optimal subsequent action, such as parsing data, calling an API, or requesting additional information. When combined with search templates, LLMs can determine if a user query contains an attribute supported by a search template. If a supported attribute is identified, the LLM will execute the corresponding user-defined method call. Within the notebook , there are a few LLM functions. Each function is defined with the tools list. Let’s briefly review each one. The role of the `extract_hotel_search_parameters` LLM function is to extract the parameters from the user query that the search template supports. The `geocode_location` LLM function would be invoked if a location attribute such as \"500 meters from Belongil Beach\" is identified. The LLM function `query_elasticsearch` will be called using the `geocode_location` (if it was found within the user query) and the parameters from the LLM function `extract_hotel_search_parameters`. The completions API registers each LLM function as a tool. This list of tools was detailed earlier in the article. Azure OpenAI The notebook uses an Azure OpenAI completions model and to run it, you will need the Azure OpenAI Key (either Key 1 or Key 2), Endpoint, Deployment Name, and Version number. All this information can be found under Azure OpenAI → Keys and Endpoint. Deploy a completion model. That is the deployment name used within the notebook . Under Chat playground, click on View code to find the api version. 
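As a hedged sketch of how these pieces are typically wired together with the `openai` Python package (the notebook's actual code may differ): the key, endpoint, API version and deployment name below are placeholders you copy from your own Azure resource, and the parameter schema attached to `extract_hotel_search_parameters` is an assumption for illustration only:

```python
from openai import AzureOpenAI

client = AzureOpenAI(
    api_key="<AZURE_OPENAI_KEY>",                          # Key 1 or Key 2
    azure_endpoint="https://<resource>.openai.azure.com",  # from Keys and Endpoint
    api_version="<api-version-from-view-code>",
)

# Each LLM function is registered as a tool; the schema here is illustrative.
tools = [{
    "type": "function",
    "function": {
        "name": "extract_hotel_search_parameters",
        "description": "Extract rating, distance, amenities and location from the user query",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="<deployment-name>",  # the completions deployment created above
    messages=[{"role": "user", "content": "hotels 250m from Belongil Beach with a pool and gym"}],
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```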
Google Maps API The notebook uses the Google Maps API to geocode locations identified within user queries. This functionality requires a Google account and an API key, which can be generated here . Putting LLM functions and search templates into Action The LLM uses reasoning to determine the necessary functions and their order of execution based on the given query. Once a query is executed such as \"recently renovated accommodations 250m from Belongil Beach with at least 4 stars and with a pool and gym\", the LLM reasoning layer is exposed: Extract parameters The initial LLM function call is designed to pull parameters from the query. Geocoding The LLM then determined that the query contained a 'from' location and that a geocoder function should be called next. Intelligent query The reasoning layer of the LLM uses the parameters from previous function calls to execute an Elasticsearch query with search templates. Precise results Using LLM functions along with a search template to execute an intelligent query, a perfect match has been found. Conclusion Combining the power of Large Language Models functions with Elasticsearch search templates ushers in capabilities for query intent and reasoning. Rather than treating a query as an unstructured blob of text, we methodically break it down, match it against a known schema, and let Elasticsearch handle the heavy lifting of search, filtering and scoring. The result is a highly accurate, user-friendly search experience that feels almost magical—users simply speak (or type) their minds, and the system understands precisely what they mean. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to Try it yourself The Problem: Capturing complex constraints Example in action Why this approach works Why search templates Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
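As a rough illustration of the geocoding step, the sketch below calls the public Google Maps Geocoding REST endpoint directly with `requests`. The notebook may use a different client library; the function name and return shape here are assumptions.

```python
# Hypothetical geocode_location implementation against the Google Maps
# Geocoding REST API. Requires GOOGLE_MAPS_API_KEY in the environment.
import os
import requests

def geocode_location(location: str) -> dict:
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/geocode/json",
        params={"address": location, "key": os.environ["GOOGLE_MAPS_API_KEY"]},
        timeout=10,
    )
    resp.raise_for_status()
    results = resp.json()["results"]
    if not results:
        raise ValueError(f"No geocoding result for {location!r}")
    coords = results[0]["geometry"]["location"]
    return {"latitude": coords["lat"], "longitude": coords["lng"]}

print(geocode_location("Belongil Beach, Byron Bay"))
```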
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Unifying Elastic vector database and LLM functions for intelligent query - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/llm-functions-elasticsearch-intelligent-query","meta_description":" Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Joining two indices in Elasticsearch Explaining how to use the terms query and the enrich processor for joining two indices in Elasticsearch. How To KB By: Kofi Bartlett On May 7, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Elasticsearch, joining two indices is not as straightforward as it is in traditional SQL databases. However, it is possible to achieve similar results using certain techniques and features provided by Elasticsearch. This article will delve into the process of joining two indices in Elasticsearch, focusing on the use of the terms query and the enrich processor. Using the terms query for joining two indices The terms query is one of the most effective ways to join two indices in Elasticsearch. This query is used to retrieve documents that contain one or more exact terms in a specific field. Here’s how you can use it to join two indices: First, you need to retrieve the required data from the first index. This can be done using a simple GET request. Once you have the data from the first index, you can use it to query the second index. This is done using the terms query, where you specify the field and the values you want to match. Here is an example: In this example, field_in_second_index is the field in the second index that you want to match with the values from the first index. value1_from_first_index and value2_from_first_index are the values from the first index that you want to match in the second index. The terms query also provides support to perform the two above steps in a single shot using a technique called terms lookup. Elasticsearch will take care of transparently retrieving the values to match from another index. For example, you have a teams index containing a list of players: Now, it is possible to query a people index for all the people playing in team1, as shown below: In the example above, Elasticsearch will transparently retrieve the player names from the team1 document present in the teams index (i.e. “john”, “bill”, “michael”) and find all people documents with a name field that contains any of those values. The equivalent SQL query would be: Using the enrich processor for joining two indices The enrich processor is another powerful tool that can be used to join two indices in Elasticsearch. 
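Before looking at the enrich processor in detail, here is a sketch of the terms lookup approach described above, using the `teams`/`people` example. The sample data, and the assumption that `name` is indexed as a keyword-style field, are illustrative.

```python
# Terms lookup "join": find all people whose "name" appears in the "players"
# field of the "team1" document in the "teams" index.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a team document listing its players (illustrative data).
es.index(index="teams", id="team1", document={"players": ["john", "bill", "michael"]})
es.indices.refresh(index="teams")

# Elasticsearch transparently fetches the "players" values from teams/team1
# and matches them against the "name" field of the people index.
# Roughly the SQL: SELECT * FROM people WHERE name IN (SELECT players FROM teams WHERE id = 'team1')
resp = es.search(
    index="people",
    query={
        "terms": {
            "name": {
                "index": "teams",
                "id": "team1",
                "path": "players",
            }
        }
    },
)
print(resp["hits"]["hits"])
```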
This processor enriches the data of incoming documents by adding data from a pre-defined enrich index. Here’s how you can use the enrich processor to join two indices: 1. First, you need to create an enrich policy. This policy defines which index to use for enrichment and which field to match on. Here is an example: 2. Once the policy is created, you need to execute it: 3. After executing the policy, you can use the enrich processor in an ingest pipeline to enrich the data of incoming documents: In this example, field_in_second_index is the field in the second index that you want to enrich with data from the first index. enriched_field is the new field that will contain the enriched data. One drawback of this approach is that if the data changes in first_index , the enrich policy needs to be re-executed as the enriched index is not updated or synchronized automatically from the source index it has been built from. However, if first_index is relatively stable, then this approach works great. Conclusion In conclusion, while Elasticsearch does not support traditional join operations, it provides features like the terms query and the enrich processor that can be used to achieve similar results. It’s important to note that these methods have their limitations and should be used judiciously based on the specific requirements and the nature of the data. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Using the terms query for joining two indices Using the enrich processor for joining two indices Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
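The policy, execute, and pipeline request bodies referenced in the enrich steps above aren't included in this extract; a minimal sketch of the three calls is shown below. `first_index`, `field_in_second_index`, and `enriched_field` follow the article, while the policy name, the match field, and the copied fields are assumptions.

```python
# Sketch of the three enrich steps: create a policy, execute it, then use the
# enrich processor in an ingest pipeline.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# 1. Create an enrich policy: which index to enrich from, which field to match
#    on, and which fields to copy into incoming documents.
es.enrich.put_policy(
    name="first-index-policy",              # policy name is illustrative
    match={
        "indices": "first_index",
        "match_field": "join_key",          # field in first_index to match on (assumed name)
        "enrich_fields": ["extra_data"],    # fields to copy over (assumed name)
    },
)

# 2. Execute the policy to build the internal enrich index.
es.enrich.execute_policy(name="first-index-policy")

# 3. Reference the policy from an ingest pipeline; documents indexed through
#    this pipeline get the matching first_index data added under enriched_field.
es.ingest.put_pipeline(
    id="enrich-from-first-index",
    processors=[
        {
            "enrich": {
                "policy_name": "first-index-policy",
                "field": "field_in_second_index",  # incoming field matched against join_key
                "target_field": "enriched_field",  # where the enrich data is stored
            }
        }
    ],
)
```

Remember that if the data in `first_index` changes, the policy must be re-executed before the pipeline will pick up the new values.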
All Rights Reserved.","title":"Joining two indices in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-join-two-indexes","meta_description":"Explaining how to use the terms query and the enrich processor for joining two indices in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. Python Javascript Developer Experience How To MG By: Miguel Grinberg On October 30, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Have you used the Kibana Dev Console? This is a fantastic prototyping tool that allows you to build and test your Elasticsearch requests interactively. But what do you do after you have a working request in the Console? In this article we'll take a look at the new code generation feature in the Kibana Dev Console, and how it can significantly reduce your development effort by generating ready to use code for you. This feature is available in our Serverless platform and in Elastic Cloud and self-hosted releases 8.16 and up. The Kibana Dev Console This section provides a quick introduction to the Kibana Dev Console , in case you have never used it before. Skip to the next section if you are already familiar with it. While you are in any part of the Search section in Kibana, you will notice a \"Console\" link at the bottom of your browser's page: When you click this link, the Console expands to cover the page. Click it again to collapse it. In the left-side panel of the Dev Console, you can enter Elasticsearch requests, with the help of an interactive editor that provides auto-completion and checks your syntax. Some example requests are already pre-populated so that you have something to start experimenting with. When the cursor is on a request, a \"play\" button appears to its right. You can click this button to send the request to your Elasticsearch server. After you execute a request, the response from the server appears in the panel on the right. Code Export feature in Kibana Dev Console The Dev Console makes it easy to prototype your requests or queries until you get exactly what you want. But what happens next? If you need to convert the request to code so that you can incorporate it into your application, then you can save time using the new code export feature. Next to the Play button you will find the three dot or \"kebab\" button, which opens a menu of options. The first option provides access to the code export feature. If you've never used this feature before, it will appear with a \"Copy as curl\" label. If you select this option, your clipboard will be loaded with a curl command that is equivalent to the selected request. Now, things get more interesting when you click the \"Change\" link, which allows you to switch to a different target language. In this initial release, the code export adds support for Python and JavaScript. More languages are expected to be added in future releases. You can now select your desired language and click \"Copy code\" to put the exported code in your clipboard. 
You can also change the default language that is offered in the menu. The exported code is a complete script in the selected language, using the official Elasticsearch client for that language. Here is an example of how the PUT /my-index request shown above looks when exported to the Python language: To use the exported code follow these steps: Paste the code from the clipboard to a new file with the correct extension ( .py for Python, or .js for JavaScript). In your terminal, add an environment variable called ELASTIC_API_KEY with a valid API Key for your Elasticsearch cluster. You can create an API key right in Kibana if you don't have one yet. Execute the script with the python or node commands depending on your language, making sure the official Elasticsearch client is installed. Now you are ready to adapt the exported code as needed to integrate it into your application! Conclusion In this article you have learned about the new Code Export feature in the Kibana Dev Console. We hope this feature will streamline your development process with Elasticsearch! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to The Kibana Dev Console Code Export feature in Kibana Dev Console Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
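The exported Python script for the PUT /my-index request mentioned above isn't reproduced in this extract. Roughly, the generated code looks like the sketch below; the actual export may differ in detail, and the endpoint URL is a placeholder for your own deployment.

```python
# Approximation of the script produced by "Copy code" for a PUT /my-index request.
import os

from elasticsearch import Elasticsearch

client = Elasticsearch(
    "https://my-deployment.es.example.com:443",  # placeholder endpoint
    api_key=os.environ["ELASTIC_API_KEY"],       # set this in your terminal first
)

resp = client.indices.create(index="my-index")
print(resp)
```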
All Rights Reserved.","title":"Export your Kibana Dev Console requests to Python and JavaScript Code - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/kibana-dev-console-code-export","meta_description":"Learn how to export your Kibana Dev Console requests to Python and JavaScript using the Code Export feature."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Understanding Elasticsearch scoring and the Explain API Diving into the scoring mechanism of Elasticsearch and exploring the Explain API. How To KB By: Kofi Bartlett On May 5, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch is a powerful search engine that provides fast and relevant search results by calculating a score for each document in the index. This score is a crucial factor in determining the order of the search results. In this article, we will delve into the scoring mechanism of Elasticsearch and explore the Explain API, which helps in understanding the scoring process. Scoring mechanisms in Elasticsearch Elasticsearch uses a scoring model called the Practical Scoring Function (BM25) by default. This model is based on the probabilistic information retrieval theory and takes into account factors such as term frequency, inverse document frequency, and field-length normalization. Let’s briefly discuss these factors: Term Frequency (TF): This represents the number of times a term appears in a document. A higher term frequency indicates a stronger relationship between the term and the document. Inverse Document Frequency (IDF): This factor measures the importance of a term in the entire document collection. A term that appears in many documents is considered less important, while a term that appears in fewer documents is considered more important. Field-length Normalization : This factor accounts for the length of the field in which the term appears. Shorter fields are given more weight, as the term is considered more significant in a shorter field. Using the Explain API The Explain API in Elasticsearch is a valuable tool for understanding the scoring process. It provides a detailed explanation of how the score for a specific document was calculated. To use the Explain API, you need to send a GET request to the following endpoint: In the request body, you need to provide the query for which you want to understand the scoring. Here’s an example: The response from the Explain API will include a detailed breakdown of the scoring process, including the individual factors (TF, IDF, and field-length normalization) and their contributions to the final score. Here’s a sample response: In this example, the response shows that the score of 1.2 is a product of the IDF value (2.2) and the tfNorm value (0.5). The detailed explanation helps in understanding the factors contributing to the score and can be useful for fine-tuning the search relevance. Conclusion Elasticsearch scoring is a critical aspect of providing relevant search results. By understanding the scoring mechanisms and using the Explain API, you can gain insights into the factors affecting the search results and optimize your search queries for better relevance and performance. 
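The Explain API request and response bodies referenced in the article above aren't included in this extract. As a hedged sketch: the endpoint is `GET /<index>/_explain/<id>`, and with the Python client a call might look like the following, where the index, document id, field, and query text are all illustrative.

```python
# Ask Elasticsearch why document "1" in "my_index" received its score for a
# simple match query.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.explain(
    index="my_index",
    id="1",
    query={"match": {"title": "elasticsearch scoring"}},
)

print(resp["matched"])      # whether the document matches the query at all
print(resp["explanation"])  # nested breakdown of boost, IDF and tf contributions
```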
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Scoring mechanisms in Elasticsearch Using the Explain API Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Understanding Elasticsearch scoring and the Explain API - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-scoring-and-explain-api","meta_description":"Diving into the scoring mechanism of Elasticsearch and exploring the Explain API."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. Generative AI How To TM By: Tomás Murúa On March 31, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This article provides a practical comparison of Retrieval Augmented Generation (RAG) and fine-tuning by examining their performance in a chat box scenario for a fictional e-commerce store. The article is organized as follows: RAG vs Fine Tuning Chatbot test case: Pear Store Approach 1: Fine tuning Approach 2: RAG RAG vs. Fine-tuning RAG RAG (Retrieval Augmented Generation) combines large language models (LLMs) with information retrieval systems so that the generated answers feed on updated and specific data coming from a knowledge base. 
Advantages: RAG allows us to use external data without modifying the base model and provides precise, safe, and traceable answers. Implementation: In Elasticsearch, data can be indexed using optimized indexes for semantic search and document-level security. Challenges: RAG relies on external knowledge, making accuracy dependent on retrieved information. Retrieval can be costly in terms of context window size. RAG also faces integration and privacy challenges, especially with sensitive data across different sources. Fine-tuning Fine-tuning involves training a pre-trained model on a specific dataset. This process adjusts the model's internal weights, enabling it to learn patterns and generate customized answers. Fine-tuning can also be used for model distillation , a technique where a smaller model is trained on the outputs of a larger model to improve performance on a specific task. This approach allows leveraging the capabilities of a larger model at a reduced cost. Advantages: It offers a high level of optimization, adapting answers to specific tasks, making it ideal for static contexts or domains where knowledge does not change frequently. Implementation: It requires training the model with structured data using an input-output format. OpenAI fine-tuning makes this flow easier using a UI where you can upload the dataset (JSONL) and then train and test it in a controlled environment. Challenges: The retraining process consumes time and computer resources. Precision depends on the quality and size of the dataset, so small or unbalanced ones can result in generic or out-of-context answers; it requires expertise and effort to get it right. There is no grounding or per-user data segmentation. From OpenAI docs : “ We recommend first attempting to get good results with prompt engineering, prompt chaining (breaking complex tasks into multiple prompts), and function calling…” Fine-tuning and RAG comparison Aspect Fine-Tuning RAG Supported data Static Dynamic Setup cost High (training and resources) Low (index configuration) Scalability Low, requires model retraining High, real-time updates Update time Hours/Days Minutes Precision with recent changes Low when not trained with new data High thanks to semantic search Chatbot test case: Pear Store We will use a test case based on a fictional online store called 'Pear Store'. Pear Store needs an assistant to answer specific questions about its policies, promotions, and products. These answers must be truthful and consistent with the store information and useful for both employees and customers. Fine-tuning Dataset We'll use a training dataset with specific questions and their answers regarding products, policies and promotions. For example: Question : What happens if a product is defective? Answer : If a product is defective, we'll send you a free gift of one kilogram of pears along with the replacement. RAG Dataset For the RAG implementation, we will use the same dataset, converted into a PDF and indexed into Elasticsearch using Playground. Here's the PDF file content: Approach 1: Fine-tuning First, we prepare the dataset in JSONL format, as shown below: Make sure each line in the JSONL file is a valid JSON object and there are no trailing commas. Next, using the OpenAI UI, we can go to Dashboard > Fine-tuning and hit Create Then you can upload the JSONL file we just created. Now click Create to start training. 
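The JSONL training file itself isn't reproduced in this extract. A minimal sketch of building it in the chat fine-tuning format is shown below; the system prompt wording is an assumption, while the question/answer pair comes from the Pear Store example.

```python
# Write the fine-tuning dataset as chat-format JSONL: one JSON object per line,
# no trailing commas.
import json

examples = [
    {
        "messages": [
            {"role": "system", "content": "You are the Pear Store assistant. Answer using store policy."},
            {"role": "user", "content": "What happens if a product is defective?"},
            {
                "role": "assistant",
                "content": "If a product is defective, we'll send you a free gift of one kilogram "
                           "of pears along with the replacement.",
            },
        ]
    },
    # ... more question/answer pairs covering products, policies and promotions
]

with open("pear_store_finetune.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```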
After the job is finished, you can hit Playground, and you will have a convenient interface to compare the results with and without the fine-tuned model against a particular question. On the right side, we can see that the model provided the custom answer about defective products: a free kilogram of pears along with the replacement. However, the model's response was inconsistent. A subsequent attempt with the same question yielded an unexpected answer. Although fine-tuning allows us to customize the model's answers, we can see that the model still deviated and provided answers that were just generic and not aligned with our dataset. This is probably because fine-tuning needs more adjustments or a bigger dataset. Now, if we want to change the source data, we will have to repeat the fine-tuning process. Approach 2: RAG To test the dataset using RAG, we will use Playground to create the RAG application and upload the dataset to Kibana. To upload a PDF using the UI and configure the semantic text field, follow the steps from this video: To learn more about uploading PDFs and interacting with them using Playground, you can read this article . Now we're ready to interact with our data using Playground! Using the UI, we can change the AI instructions and check the source of the document used to provide an answer. When we ask the same question in Playground: \" What happens if a product is defective?\" we receive the correct answer: \" If a product is defective, we send you a free gift of one kilogram of pears along with the replacement.\". Additionally, we get a citation to verify the answer´s source and can review the instructions the model followed: If we want to change the data, we just have to update the index with the information about the Q/A . Final thoughts The choice between fine-tuning and RAG depends on the requirements of each system. A common pattern is using some domain specific fine tuned model, like FinGPT for finance, LEGAL-BERT for legal, or medAlpaca for medical to acquire common terminology. Then, frame the answers context, and build a RAG system on top of it with company specific documents. Fine-tuning is useful when you want to manage the model's behavior , and doing so through prompt engineering is not possible, or it requires so many tokens that it’s better to add that information to the training. Or perhaps the task is so narrow and structured that model distillation is the best option. RAG, on the other hand, excels at integrating knowledge through dynamic data and ensuring accurate, up-to-date responses in real-time. This makes it especially useful for scenarios like the Pear Store, where policies and promotions change frequently. RAG also provides data that is grounded in the answer and offers the ability to segment information delivered to users via document-level security. Combining fine-tuning and RAG can also be an effective strategy to leverage the strengths of both approaches and tailor solutions to specific project needs. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to RAG vs. Fine-tuning RAG Fine-tuning Fine-tuning and RAG comparison Chatbot test case: Pear Store Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"RAG vs. Fine Tuning, a practical approach - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/rag-vs-fine-tuning","meta_description":"Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. How To KB By: Kofi Bartlett On May 9, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Elasticsearch, it is a common requirement to delete a field from a document. This can be useful when you want to remove unnecessary or outdated information from your index. In this article, we will discuss different methods to delete a field from a document in Elasticsearch, along with examples and step-by-step instructions. Method 1: Using the Update API The Update API allows you to update a document by providing a script that modifies the document’s source. You can use this API to delete a field from a document by setting the field to null. Here’s a step-by-step guide on how to do this: 1. Identify the index, document type (if using Elasticsearch 6.x or earlier), and document ID of the document you want to update. 2. Use the Update API with a script that sets the field to null, or even better, removes it from the source document. The following example demonstrates how to delete the “field_to_delete” field from a document with ID “1” in the “my_index” index: 3. Execute the request. If successful, Elasticsearch will return a response indicating that the document has been updated. 
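A minimal sketch of that Update API call with the Python client is shown below; removing the key from `_source` is generally preferable to setting it to null.

```python
# Method 1: remove "field_to_delete" from document 1 in my_index with a
# painless script.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.update(
    index="my_index",
    id="1",
    script={
        "lang": "painless",
        "source": "ctx._source.remove('field_to_delete')",
    },
)
```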
Note: This method only removes the field from the specified document. The field will still exist in the mapping and other documents in the index. Method 2: Reindexing with a Modified Source If you want to delete a field from all documents in an index, you can use the Reindex API to create a new index with the modified source. Here’s how to do this: 1. Create a new index with the same settings and mappings as the original index. You can use the Get Index API to retrieve the settings and mappings of the original index. 2. Use the Reindex API to copy documents from the original index to the new index, while removing the field from the source. The following example demonstrates how to delete the “field_to_delete” field from all documents in the “my_index” index: 3. Verify that the new index contains the correct documents with the field removed. 4. If everything looks good, you can delete the original index and, if necessary, add an alias to the new index having the name of the original index name. Method 3: Updating the Mapping and Reindexing If you want to delete a field from the mapping and all documents in an index, you can update the mapping and then reindex the documents. Here’s how to do this: 1. Create a new index with the same settings as the original index. 2. Retrieve the mappings of the original index using the Get Mapping API. 3. Modify the mappings by removing the field you want to delete. 4. Apply the modified mappings to the new index using the Put Mapping API. 5. Use the Reindex API to copy documents from the original index to the new index, as described in Method 2. 6. Verify that the new index contains the correct documents with the field removed and that the field is not present in the mapping. 7. If everything looks good, you can delete the original index and, if necessary, add an alias to the new index having the name of the original index name. Conclusion In this article, we discussed three methods to delete a field from a document in Elasticsearch: using the Update API, reindexing with a modified source, and updating the mapping and reindexing. Each method has its own use cases and trade-offs, so choose the one that best fits your requirements. Always remember to test your changes and verify the results before applying them to production environments. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett How To May 14, 2025 Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. 
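For Method 2 above, a sketch of the Reindex API call with a script that strips the field from every document is shown below. The destination index name `my_index_v2` is an assumption.

```python
# Method 2: reindex my_index into a new index while removing "field_to_delete"
# from every document's source.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.reindex(
    source={"index": "my_index"},
    dest={"index": "my_index_v2"},
    script={
        "lang": "painless",
        "source": "ctx._source.remove('field_to_delete')",
    },
    wait_for_completion=True,
)

# After verifying the new index, the old one can be dropped and aliased:
# es.indices.delete(index="my_index")
# es.indices.put_alias(index="my_index_v2", name="my_index")
```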
KB By: Kofi Bartlett Jump to Method 1: Using the Update API Method 2: Reindexing with a Modified Source Method 3: Updating the Mapping and Reindexing Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Deleting a field from a document in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-delete-field-from-document%E2%80%8B","meta_description":"Exploring methods for deleting a field from a document in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. Developer Experience MC By: Mike Cote On April 18, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Kibana alerting has been the monitoring solution of choice for many large organizations over the last few years. As adoption has continued to grow, so has the number of alerting rules users have created to monitor their systems. With more organizations relying on Kibana for alerting at scale, we have seen an opportunity to improve efficiency and ensure sufficient performance for future workload needs. Between Kibana 8.16 and 8.18, we tackled these issues head-on, introducing key improvements that shattered previous scalability barriers. Before these enhancements, Kibana Alerting could only support up to 3,200 rules per minute with at least 16 Kibana nodes before hitting significant performance bottlenecks. By Kibana 8.18, we’ve increased the scalability ceiling of rules per minute by 50x, supporting up to 160,000 lightweight alerting rules per minute. This was achieved by making Kibana efficiently scale beyond 16 Kibana nodes and increasing per-node throughput from 200 to up to 3,500 rules per minute. These enhancements make all alerting rules run faster, with fewer delays and more efficiently. In this blog, we’ll explore the scaling challenges we overcame, the key innovations that made it possible, and how you can leverage them to run Kibana Alerting at scale efficiently. How Kibana Alerting scales with Task Manager Kibana Alerting allows users to define rules that trigger alerts based on real-time data. Behind the scenes, the Kibana Task Manager schedules and runs these rules. 
The Task Manager is Kibana’s built-in job scheduler, designed to handle asynchronous background tasks separately from user interactions. Its key responsibilities include: Running one-time and recurring tasks such as alerting rules, connector actions, and reports. Dynamically distributing workloads as Kibana background nodes join or leave the cluster. Keeping the Kibana UI responsive by offloading tasks to dedicated background processes. Each alerting rule translates into a recurring background task. Each background task is an Elasticsearch document, meaning it is stored, fetched and updated as an Elasticsearch document. As the number of alerting rules increases, so do the background tasks Kibana must manage. However, each Kibana node has a limit on how many tasks it can handle simultaneously. Once capacity is reached, additional tasks must wait, leading to delays and slower task run times. The problem: Why scaling was limited Before these improvements, Task Manager faced several scalability constraints, preventing it from scaling beyond 3,200 tasks per minute and 16 Kibana nodes. At this scale, we observed diminishing returns as contention and resource inefficiencies limited further scale. These numbers were based on internal performance testing using a basic Elasticsearch query alerting rule performing a no-op query. The diminishing returns observed included: Task claiming contention Task Manager uses a distributed polling approach to claim tasks within an Elasticsearch index. Kibana nodes periodically query for tasks and attempt to claim them using Elasticsearch’s optimistic concurrency control , which prevents conflicting document updates. If another node updates the task first, the original node drops it, reducing overall efficiency. With too many Kibana nodes competing for tasks, document update conflicts increase drastically, limiting efficiency beyond 16 nodes and reducing system throughput. Inefficient per-node throughput Each Kibana node has a limit on the number of tasks that can run concurrently (default: 10 tasks at a time) to prevent memory and CPU overload. This safeguard often results in underutilized CPU and memory, requiring more nodes than necessary. Additionally, the polling interval (default: 3000ms) defines how often Task Manager claims new tasks. A shorter interval reduces task delays but increases contention as nodes compete more for updates. Resource inefficiencies When running a high volume of alerting rules, Kibana nodes perform repetitive Elasticsearch queries, repeatedly loading the same objects and lists for each alerting rule run, consuming more CPU, memory, and Elasticsearch resources than necessary. Scaling up requires costly infrastructure expansions to support the increasing request loads. Why it’s important Breaking these barriers is crucial for Kibana’s continued evolution. Improved scalability unlocks: Cost optimization : Reducing infrastructure costs for large-scale operations. Faster recovery : Enhancing Kibana’s ability to recover from node or cluster failures. Future expansion : Enabling scalability for additional workloads, such as scheduled reports and event-driven automation. Key innovations in Kibana Task Manager To achieve a 50x scalability boost, we introduced several innovations: Kibana discovery service: smarter scaling Previously, Kibana nodes were unaware of each other’s presence, leading to inefficient task distribution. 
The new Kibana discovery service dynamically monitors active nodes and assigns task partitions accordingly, ensuring even load distribution and reducing contention. Task partitioning: eliminating contention To prevent nodes from competing for the same tasks, we introduced task partitioning. Tasks are now distributed across 256 partitions, ensuring only a subset of Kibana background nodes attempt to claim the same tasks at any given time. By default, each partition is assigned to a maximum of two Kibana nodes, while a single Kibana node can be responsible for multiple partitions. Task costing: smarter resource allocation Not all background tasks consume the same resources. We implemented a task costing system that assigns task weights based on CPU and memory usage. This allows Task Manager to dynamically adjust the number of tasks to claim, optimize resource allocation, and ensure efficient performance. New task claiming algorithm The old algorithm relied on update-by-query with forced index refresh to identify claimed tasks. This approach was inefficient and introduced unnecessary load on Elasticsearch. The new algorithm avoids this by searching for tasks without requiring an immediate refresh. Instead, it performs the following operations on the task manager index; a _search to find candidate tasks, followed by an _mget which returns documents that may have been updated more recently but are not yet reflected in the refreshed index state. By comparing document versions from _search and _mget results, it discards mismatches before proceeding with bulk updates. This approach increases efficiency in Elasticsearch and offers finer control to support task costing. By factoring in the poll interval, task concurrency and the index refresh rate, we can calculate the upper limit of expected conflicts and adjust the _search page size accordingly. This helps ensure enough tasks are retrieved so the _mget doesn’t discard all the search results due to document version mismatches. More frequent polling for tasks By ensuring a fixed number of nodes compete for the same tasks with task partitioning and a new lightweight task claiming algorithm, Task Manager can now poll for tasks more frequently without additional stress on Elasticsearch. This reduces delays between a task completing and the next one starting, increasing overall system throughput. Performance optimizations in Kibana Alerting Before our optimizations using Elastic APM , we analyzed alerting rule performance and found that the alerting framework required at least 20 Elasticsearch queries to run any alerting rule. After the optimizations, we reduced this to just 3 queries - an 85% reduction, significantly improving run times and reducing CPU overhead. Additionally, Elasticsearch previously relied on the resource-intensive pbkdf2 hashing algorithm for API key authentication, introducing excessive overhead at scale. We optimized authentication by switching to the more efficient SHA-256 algorithm, allowing us to eliminate the use of an internal Elasticsearch cache that was severely limited by the number of API keys used concurrently. Impact: How users are benefiting Early adoption has demonstrated: 50% faster rule run times , reducing overall system load. Increased task capacity , enabling more tasks to run on existing infrastructure. Fewer under-provisioned clusters , minimizing the need for scaling infrastructure to meet demand. 
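The real task claiming code lives in Kibana's TypeScript Task Manager, but the `_search` plus `_mget` version-comparison idea described above can be illustrated with a heavily simplified sketch. The index name, query shape, and page size below are illustrative only.

```python
# Simplified illustration of the claiming cycle: search for candidate task
# documents without forcing a refresh, _mget the same ids to see their latest
# versions, and drop any candidate whose version has moved on since the search.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
TASK_INDEX = ".example-task-index"  # illustrative name, not Kibana's real index

# 1. Search for tasks that are due to run (query shape is illustrative).
search = es.search(
    index=TASK_INDEX,
    size=10,
    seq_no_primary_term=True,
    query={"range": {"runAt": {"lte": "now"}}},
)
candidates = {hit["_id"]: hit for hit in search["hits"]["hits"]}

# 2. _mget the same documents; this reflects updates not yet visible to search.
latest = es.mget(index=TASK_INDEX, ids=list(candidates))

# 3. Keep only candidates whose seq_no/primary_term still match, i.e. no other
#    node has claimed or updated them in the meantime.
claimable = [
    doc for doc in latest["docs"]
    if doc.get("found")
    and doc["_seq_no"] == candidates[doc["_id"]]["_seq_no"]
    and doc["_primary_term"] == candidates[doc["_id"]]["_primary_term"]
]
print(f"claiming {len(claimable)} of {len(candidates)} candidates")
```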
Drop in average task delay because of increased per-node throughput and making the cluster properly provisioned Drop in rule run duration because of alerting framework optimizations Drop in Elasticsearch requests because of alerting framework optimizations Getting started: How to scale efficiently Upgrading to Kibana 8.18 unlocks most of these benefits automatically. For additional optimization, consider adjusting the xpack.task_manager.capacity setting to maximize per-node throughput while ensuring p999 resource usage remains below 80% for memory, CPU, and event loop utilization and below 500ms for event loop delay. By default, Kibana has a guardrail of 32,000 alerting rules per minute. If you plan to exceed this limit, you can modify the xpack.alerting.rules.maxScheduledPerMinute setting accordingly. The new xpack.task_manager.capacity setting makes Kibana handle workload distributions more effectively, making the following settings unnecessary in most cases and should be removed from your kibana.yml settings: xpack.task_manager.max_workers xpack.task_manager.poll_interval If you’re running Kibana on-prem and want to isolate background tasks into dedicated nodes, you can use the node.roles setting to separate UI-serving nodes from those handling background tasks. If you’re using Kibana on Elastic Cloud Hosted (ECH), scaling to 8GB or higher will automatically enable this isolation. What’s next? We’re not stopping at 50x. Our roadmap aims for 100x+ scalability, further eliminating Elasticsearch bottlenecks. Beyond scaling, we’re also focusing on improving system monitoring at scale. Upcoming integrations will provide system administrators with deeper insights into background task performance, making it easier to decide when and how to scale. Additionally, with task costing, we plan to increase task concurrency for Elastic Cloud Hosted (ECH) customers when configured with more CPU and memory (e.g., Kibana clusters with 2GB, 4GB, or 8GB+ of memory). Stay tuned for even more advancements as we continue to push the limits of Kibana scalability! The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. 
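Pulling the settings from the "Getting started" section above into one place, a kibana.yml fragment for a dedicated background-task node might look like the sketch below. The values are examples to adapt against your own p999 metrics, not recommendations.

```yaml
# Illustrative kibana.yml fragment for an on-prem background-task node.
node.roles: ["background_tasks"]

# Per-node task capacity (replaces the older max_workers / poll_interval tuning).
xpack.task_manager.capacity: 20

# Raise the guardrail only if you really plan to exceed 32,000 rules per minute.
xpack.alerting.rules.maxScheduledPerMinute: 64000

# Remove these legacy settings if they are still present:
# xpack.task_manager.max_workers: ...
# xpack.task_manager.poll_interval: ...
```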
TP By: Tyler Perkins Jump to How Kibana Alerting scales with Task Manager The problem: Why scaling was limited Why it’s important Key innovations in Kibana Task Manager Performance optimizations in Kibana Alerting Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Kibana Alerting: Breaking past scalability limits & unlocking 50x scale - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/kibana-alerting-task-manager-scalability","meta_description":"Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Search relevance tuning: Balancing keyword and semantic search This blog offers practical strategies for tuning search relevance that can be complementary to semantic search. Vector Database How To KD By: Kathleen DeRusso On May 14, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Introduction Tuning for relevance is an essential part of user search experience. Semantic search in particular faces several challenges, many of which are solved through hybrid search and application of relevance tuning practices that have been honed by decades of research in lexical search. We'll go into some of these strategies and how you can effectively use them to tune relevance in a hybrid search world. This blog shares some of the highlights from the Haystack 2024 talk, Retro Relevance: Lessons Learned Balancing Keyword and Semantic Search . Lexical search toolbox for relevance tuning Text search algorithms like BM25 have been around for decades, and in fact BM25 is often used synonymously with text search. This blog post goes into how BM25 works in detail. Analyzers, tokenizers, filters, field weights and boosts are all tools in our lexical search toolbox that give us the power to transform text in very specific ways to support both general and very specialized search use cases. But we also have a lot of other tools at our disposal: Reranking is another powerful tool in this toolbox, whether it pertains to Learn to Rank, semantic reranking, etc. Synonyms are heavily used in keyword search to differentiate slang, domain specific jargon, and so on. General models may not handle very niche synonyms well. These tools are used to impact relevance, but also importantly to accommodate business rules. 
Business rules are custom rules and their use cases vary widely, but commonly include diversifying result sets or showing sponsored content based on contextual query results or other personalization factors. Challenges with semantic search Semantic search is impressively effective at representing the intent of what you're looking for, returning matching results even if they don't contain the exact keywords you specified. However - if you’re developing a search application and incorporating semantic search into your existing tech stack, semantic search is not without some pitfalls. These pitfalls largely fall under three categories: Cost Features that semantic search inherently doesn't have yet Queries that semantic search by itself doesn't do well with Cost can be money (training or licensing models, compute), or it can be time. Time can be latency (inference latency at ingest or search), or it can be the cost of development time. We don't want to spend valuable engineering time on things that are easily solved with existing tools, and instead, use that time to focus on solving the hard problems that require engineering focus. There are also many features that people have come to expect in their search solutions; for example, highlighting, spelling correction, and typo tolerance. These are all things that semantic search struggles with out of the box today, but many UI/UX folks consider these table stakes in terms of user functionality. As far as queries that semantic search may not do well with, these are typically niche queries. Examples include: Exact matches such as model numbers Domain specific jargon We also have to consider requirements including business rules (for example boosting based on popularity, conversions, or campaigns), which semantic search by itself may not handle natively. Query understanding is another issue. This could be as simple as handling numeric conversions and units of measurement, or it could be very complex such as handling negatives. You may have had a frustrating experience searching for a negative, such as “I want a restaurant that doesn't serve meat”. LLMs may be OK at returning vegetarian restaurants here, but most semantic search is going to return you restaurants that serve meat! These are hard problems to solve and they're the ones we want to spend our valuable engineering time on. Benefits of hybrid search Hybrid search is the best of both worlds: it combines the precision and functionality of BM25 text search with the semantic understanding of vector search. This leads to both better recall and better overall relevance. To help put this in perspective, let's look at some examples: Real estate: Modern farmhouse with lots of land and an inground pool in the 12866 zip code. Whether the house has a pool and its ZIP code can be filters, and semantic search can be used over the style description. eCommerce: Comfortable Skechers with memory foam insoles in purple. The color and brand can be filters, and the rest can be covered with semantic search. Job hunting: Remote software engineer jobs using Elasticsearch and cloud native technologies. The job title and preference for remote work can be filters, and the job skills can be handled with semantic search. In all the above examples, the query has something specific to filter on along with more vague text that benefits from semantic understanding. What does a hybrid search look like in Elasticsearch? 
The phrase \"hybrid search\" is a little buzzwordy right now, and people might think of it differently in various scenarios. In some systems, where you have a separate vector database, this might involve multiple calls to different data stores and combining them with a service. But, one of the superpowers of Elasticsearch is that all of this can be combined in one single index and one search call. In Elasticsearch, a hybrid search may be as simple as a Boolean query . Here's an example of a Boolean query structure in Elasticsearch that combines text search, KNN searches, text expansion queries, and other supported query types. This can of course be combined with rescores, and everything else that makes Elasticsearch so powerful. Boolean queries are a very easy way to combine these text and vector searches into one, single query. One note about this example is that KNN was introduced as a query in addition to the top level search in 8.12, making this query structure even more powerful. Another option is to use retrievers , which starting with Elasticsearch 8.14.0 are an easier way of describing these complex retrieval pipelines. Here is an example that combines a standard query as a retriever, with a kNN query as a retriever, all rolled up to use Reciprocal Rank Fusion (RRF) to rank the results. Combining result sets Now that you have a hybrid search query, how do you combine all this into a single result set? This is a hard problem, especially when the scores are virtually guaranteed to be vastly different depending on how the results were retrieved. The classic way, using the Boolean query example, is with linear combination where you can apply boosts to each individual clause in the larger query. This is tried and true, nice old technology that we all know and love, but it can be finicky. It requires tuning to get right and then you may never get it perfect. If you're using retrievers you can also use RRF . This is easier - you can rely on an algorithm and don't have to do any tuning. There are some trade-offs - you have less fine grained control over your result sets. RRF doesn't take BM25 boosting into account, so if you're boosting on business rules, you might not get the results you want out of the box. Ultimately the method you should choose depends on your data and your use case. Tuning for relevance Once you've created your query, tuning for relevance is a hard problem to solve, but you have several tools at your disposal: Business metrics. These are the most important metrics in a lot of ways: Are users clicking on results, and in eCommerce use cases for example better yet are they completing purchases? Is your conversion rate increasing? Are users spending a decent amount of time reading the content on your site? These are all measures of user experience but they’re gathered through analytics and they’re direct proof of whether your search is providing results that are actually useful. For use cases like RAG, where the results are custom, subjective, and subject to change, this might be the only way to really measure the impact of your search changes. User surveys. Why not ask users if they thought the results were good and bad? You do have to take some things into account such as whether users will provide truthful responses, but it’s a good way of taking a pulse of what users think of your search engine. Quantitative ways of measuring relevance such as MAP and NDCG. These metrics require judgment lists which can then also be used for Learn to Rank. 
The single biggest trap that people can fall into, though, is tuning for one or a few “pet” queries: the handful of queries that you - or maybe your boss - enters. You can change everything about your algorithm to get that best top result for that one query, but then it can have cascading effects downstream, because now you’ve unknowingly messed up the bulk of your other queries. The good news is that there are some tools available to help! Applying tools for search relevance tuning Query rules Remember that pet query? Well I have good news for you - you can still have great results for that pet query without modifying your relevance algorithm, using the concept of pinned or promoted documents. At Elastic, we call these query rules. Query rules allow you to send in some type of context, such as a user-entered query string, and if it matches some criteria we can configure specific documents that we want to rank first, second, third, etc. One great use case for query rules is the application of business rules. Another use case is “fixing” relevance. Overall relevance shouldn't be nitpicky, and we should rely on methods like ranking, reranking, and/or RRF to get it right. But there are always exceptions. Maybe overall relevance is good enough, but you have a couple of queries that people complain about? OK, just set up a rule. But you can go further if you want: it can potentially be a worthwhile investment to take a quick pass through your head queries to make sure that they're returning the right information and these users are getting a good search experience. It's not cheating to correct some of the common user-entered queries, and then focus on improving your torso and tail queries through the power of semantic search where it really shines. So how does this work? Elastic query rules are defined by creating a query ruleset, or a list of one or more rules. Each rule has criteria that must match the rule in order for a query to be applied, and then actions that we take on the rule if it matches. A rule can have multiple criteria, based on the metadata you send in from the client. In this example, a user's query string and their location was sent in a rule - both of those criteria would have to be met in order for the rule to match. To trigger these rules at search time, you send in a corresponding rule query that specifies the metadata that you want to match on. We'll apply all matching rules in the ruleset, in order, and pin the documents that you want to come back. We're currently working on plans to make this feature generally available and extend the functionality: for example to support tokenizers and analyzers on rule criteria, making it easier for non-technical people to manage query rules, and to potentially provide additional actions on top of just pinning documents. You can read more about query rules and how to use them in this guide and corresponding blog post . Synonyms Next, let's talk about synonyms. Maybe you have some domain specific jargon that is unique to only your business and isn't in any of the current models - and you don't necessarily want to take on the expense to fine tune and train your own model. For example: while ELSER will recognize both pug and beagle as related to dog , it will not recognize puggle (a crossbeed of pug and beagle) as a dog . Synonyms can help here! Synonyms are a great way of translating this domain specific terminology, slang, and alternate ways of saying a word that may just be too specialized for a model to return the matches we want. 
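Before moving on with synonyms, here is a rough sketch of the query rules flow just described. The exact endpoints and query syntax have shifted across 8.x releases, so treat this as an illustration rather than a definitive example; the ruleset name, metadata keys, and document IDs are placeholders:

```typescript
// Hedged sketch of a pinned-document query rule; names, metadata keys, and IDs are placeholders.
const ruleset = {
  rules: [
    {
      rule_id: "promote-pet-query-docs",
      type: "pinned",
      criteria: [{ type: "exact", metadata: "user_query", values: ["puggle"] }],
      actions: { ids: ["doc-42", "doc-7"] }, // documents to pin first and second
    },
  ],
};

await fetch("https://localhost:9200/_query_rules/my-ruleset", {
  method: "PUT",
  headers: { "Content-Type": "application/json", Authorization: "ApiKey <api-key>" },
  body: JSON.stringify(ruleset),
});

// At search time, send the matching metadata alongside the organic query.
const ruleQuery = {
  retriever: {
    rule: {
      retriever: { standard: { query: { match: { description: "puggle" } } } },
      ruleset_ids: ["my-ruleset"],
      match_criteria: { user_query: "puggle" },
    },
  },
};
```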
In Elasticsearch, we used to manage this in a way that required a lot of manual overhead - you had to upload synonyms files and reload analyzers. In Elasticsearch 8.10, we introduced a synonyms API that makes this easier. Similar to query rules, you create synonyms sets with one or more defined synonyms, and then when you use the API to add, update, or remove synonyms it reloads analyzers for you - pretty easy! You can then update your mappings to define a custom analyzer that uses this synonyms set. The nice thing about synonyms being supported with analyzers is that when we do support analyzers in query rules in the future, we'll be able to support synonyms as well out of the box. You can read more about the synonyms API and how to use it in this guide and corresponding blog post . Wrapping up Semantic search doesn't replace BM25 search, it's an enhancement to existing search technologies. Hybrid search solves many problems innate to semantic search and is the best of both worlds in terms of both recall and functionality. Semantic search really shines with long tail and torso queries. Tools like query rules and synonyms can help provide the best search experience possible while freeing up valuable developer time to focus on solving important problems. As the landscape evolves, we're getting better and better at solving some of the hard problems that come with semantic search, and making it easier to use both semantic and hybrid search through simplification and tooling. Our goal as search practitioners is to return the best results possible. Our other goal is to do this as easily as possible, and minimize costs - those costs include money and time, and time can mean latency or engineering overhead. We don't want to waste that valuable engineering time - we want to spend it solving hard problems! You can try the features I've talked about out in Cloud today! Be sure to head over to our discuss forums and let us know what you think. Watch the Haystack talk Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Introduction Lexical search toolbox for relevance tuning Challenges with semantic search Benefits of hybrid search What does a hybrid search look like in Elasticsearch? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
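A minimal sketch of that synonyms workflow — create a set through the synonyms API, then reference it from a search-time analyzer — might look like this, assuming a made-up set name, index, and synonym rule:

```typescript
// Hedged sketch: create a synonyms set via the synonyms API, then reference it from a
// search-time analyzer. Set name, index, and the synonym rule itself are placeholders.
await fetch("https://localhost:9200/_synonyms/pet-synonyms", {
  method: "PUT",
  headers: { "Content-Type": "application/json", Authorization: "ApiKey <api-key>" },
  body: JSON.stringify({
    synonyms_set: [{ id: "puggle-rule", synonyms: "puggle => pug, beagle, dog" }],
  }),
});

// Index settings/mappings that use the set; updateable synonym filters belong in a search_analyzer.
const indexDefinition = {
  settings: {
    analysis: {
      filter: {
        pet_synonyms: { type: "synonym_graph", synonyms_set: "pet-synonyms", updateable: true },
      },
      analyzer: {
        pet_search_analyzer: { tokenizer: "standard", filter: ["lowercase", "pet_synonyms"] },
      },
    },
  },
  mappings: {
    properties: {
      description: { type: "text", analyzer: "standard", search_analyzer: "pet_search_analyzer" },
    },
  },
};
```

Because the filter is marked updateable, later calls to the synonyms API reload the search analyzers automatically, which is the behavior described above.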
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Search relevance tuning: Balancing keyword and semantic search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/search-relevance-tuning-in-semantic-search","meta_description":"This blog offers practical strategies for tuning search relevance that can be complementary to semantic search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ML Research ST By: Shubha Anjur Tupil On December 10, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Take your search experiences up to level 11 with our new state-of-the-art cross-encoder Elastic Rerank model (in Tech Preview). Reranking models provide a semantic boost to any search experience, without requiring you to change the schema of your data, giving you room to explore other relevance tools for semantic relevance on your own time and within your budget. Semantic boost your keyword search : Regardless of where or how your data is stored, indexed or searched today, semantic reranking is an easy additional step that allows you to boost your existing search results with semantic understanding. You have the flexibility to apply this as needed– without requiring changes to your existing data or indexing pipelines and you can do this with an Elastic foundational model as your easy first choice. Flexibility of choice for any budget : All search experiences can be improved with the addition of semantic meaning which is typically applied by utilizing a dense or sparse vector model such as ELSER. However, achieving your relevance goals doesn’t require a one-size-fits-all solution, it’s about mixing and matching tools to balance performance and cost. Hybrid search is one such option, improving relevance by combining semantic search with keyword search using reciprocal rank fusion (RRF) in Elasticsearch. The Elastic Rerank model is now an additional lever to enhance search relevance in place of semantic search, giving you the flexibility to optimize for both relevance and budget. First made available on serverless, but now available in tech preview in 8.17 for Elasticsearch, the benefits of our model exceed those of other models in the market today. Performant and efficient : The Elastic Rerank model outperforms other significantly larger reranking models. 
Built on the DeBERTa v3 architecture, it has been fine-tuned by distillation on a diverse dataset. Our detailed testing shows a 40% uplift on a broad range of retrieval tasks and up to 90% on question answering data sets. As a comparison, the Elastic Rerank model is significantly better or comparable in terms of relevance even with much larger models. In our testing a few models, such as bge-re-ranker-v2-gemma , came closest in relevance but are an order of magnitude larger in terms of parameter count. That being said, we provide integrations in our Open Inference API to enable access to other third-party rerankers, so you can easily test and see for yourself. Using the Elastic Rerank model Not only are the performance and cost characteristics of the Elastic Rerank model great, we have also made it really easy to use to improve the relevance for lexical search. We want to provide easy to use primitives that help you build effective search, quickly, and without having to make lots of decisions; from which models to use, to how to use them in your search pipeline. We make it easy to get started and to scale. You can now use Elastic Rerank using the Inference API with the text_similiarity_reranker retriever. Once downloaded and deployed each search request can handle a full hybrid search query and rerank the resulting set in one simple _search query. It’s really easy to integrate the Elastic Rerank model in your code, to combine different retrievers to combine hybrid search with reranking. Here is an example that uses ELSER for semantic search, RRF for hybrid search and the reranker to rank the results. If you have a fun dataset like mine that combines the love of AI with Cobrai Kai you will get something meaningful. Summary: The Elastic Rerank Model English only cross-encoder model Semantic Boost your Keyword Search with little to no changes how data is indexed and searched already More control and flexibility over the cost of semantic boosting decoupled from indexing and search Reuse the data you already have in Elasticsearch Delivers significant improvements in relevance and performance (40% better on average for a large range of retrieval tasks and up to 90% better on question answering tasks as compared to significantly larger models, tested with over 21 datasets with an average of +13 points nDCG@10 improvement) Easy-to-use, out-of-the-box; built into the Elastic Inference API, easy to load and use in search pipelines Available in technical preview on across our product suite, easiest way to get started is on Elasticsearch Serverless If you want to read all the details of how we built this, head over to our blog on Search Labs . Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. 
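As a hedged illustration of the workflow above, creating a rerank inference endpoint and wiring it into the text_similarity_reranker retriever can look roughly like this; the inference ID, model_id value, index, and fields are assumptions, so check the Elastic Rerank documentation for the exact model name and settings:

```typescript
// Hedged sketch: deploy a rerank inference endpoint, then rerank first-stage results with
// the text_similarity_reranker retriever. Inference ID, model_id, index, and fields are
// assumptions, not values from the post.
await fetch("https://localhost:9200/_inference/rerank/my-elastic-rerank", {
  method: "PUT",
  headers: { "Content-Type": "application/json", Authorization: "ApiKey <api-key>" },
  body: JSON.stringify({
    service: "elasticsearch",
    service_settings: { model_id: ".rerank-v1", num_allocations: 1, num_threads: 1 },
  }),
});

const rerankedSearch = {
  retriever: {
    text_similarity_reranker: {
      retriever: { standard: { query: { match: { plot: "underdog learns karate" } } } },
      field: "plot", // field whose text is sent to the reranker
      inference_id: "my-elastic-rerank",
      inference_text: "underdog learns karate",
      rank_window_size: 50, // rerank the top 50 first-stage hits
    },
  },
};
```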
TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 5, 2024 Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros Jump to Using the Elastic Rerank model Summary: The Elastic Rerank Model Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-rerank-model-introduction","meta_description":"Learn about the Elastic Rerank model and explore how to use it through practical examples, including how to to integrate the rerank model in your code."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. Lucene Vector Database BT By: Benjamin Trent On February 27, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. For years, Apache Lucene and Elasticsearch have supported filtered search with kNN queries, allowing users to retrieve the nearest neighbors that meet a specified metadata filter. However, performance has always suffered when dealing with semi-restrictive filters. In Apache Lucene, we are introducing a variation of ACORN-1 —a new approach for filtered kNN search that achieves up to 5x faster searches with little drop in recall. This blog goes over the challenges of filtered HNSW search, explaining why performance slows as filtering increases, and how we improved HNSW vector search in Apache Lucene with the ACORN-1 algorithm. Why searching fewer docs is actually slower Counterintuitively, filtering documents—thereby reducing the number of candidates—can actually make kNN searches slower. For traditional lexical search, fewer documents means fewer scoring operations, meaning faster search. However, in an HNSW graph, the primary cost is the number of vector comparisons needed to identify the k nearest neighbors. At certain filter set sizes, the number of vector comparisons can increase significantly, slowing down search performance. Here is an example of unfiltered graph search. Note there are about 6 vector operations. 
Because the HNSW graph in Apache Lucene has no knowledge of filtering criteria when built, it constructs purely based on vector similarity. When applying a filter to retrieve the k nearest neighbors, the search process traverses more of the graph. This happens because the natural nearest neighbors within a local graph neighborhood may be filtered out , requiring deeper exploration and increasing the number of vector comparisons. Here is an example of the current filtered graph search. The “dashed circles” are vectors that do not match the filter. We even make vector comparisons against the filtered out vectors, resulting in more vector ops, about 9 total. You may ask, why perform vector comparisons against nodes that don’t match the filter at all? Well, HNSW graphs are already sparsely connected. If we were to consider only matching nodes during exploration, the search process could easily get stuck , unable to traverse the graph efficiently. Note the filtered “gulf” between the entry point and the first valid filtered set. In a typical graph, it's possible for such a gap to exist, causing exploration to end prematurely and resulting in poor recall. We gotta make this faster Since the graph doesn’t account for filtering criteria, we have to explore the graph more. Additionally, to avoid getting stuck, we must perform vector comparisons against filtered-out nodes. How can we reduce the number of vector operations without getting stuck? This is the exact problem tackled by Liana Patel et. al. in their ACORN paper. While the paper discusses multiple graph techniques, the specific algorithm we care about with Apache Lucene is their ACORN-1 algorithm. The main idea is that you only explore nodes that satisfy your filter. To compensate for the increased sparsity, ACORN-1 extends the exploration beyond the immediate neighborhood. Now instead of exploring just the immediate neighbors, each neighbor’s neighbor is also explored. This means that for a graph with 32 connections, instead of only looking at the nearest 32 neighbors, exploration will attempt to find matching neighbors in 32*32=1024 extended neighborhood. Here you can see the ACORN algorithm in action. Only doing vector comparisons and exploration for valid matching vectors, quickly expanding from the immediate neighborhood. Resulting in much fewer vector ops, about 6 in total. Within Lucene, we have slightly adapted the ACORN-1 algorithm in the following ways. The extended neighborhoods are only explored if more than 10% of the vectors are filtered out in the immediate neighborhood. Additionally, the extended neighborhood isn’t explored if we have already scored at least neighborCount * 1.0/(1.0 - neighborFilterRatio) . This allows the searcher to take advantage of more densely connected neighborhoods where the neighborhood connectedness is highly correlated with the filter. We also have noticed both in inversely correlated filters (e.g. filters that only match vectors that are far away from the query vector) or exceptionally restrictive filters, only exploring the neighborhood of each neighbor isn’t enough. The searcher will also attempt branching further than the neighbors’ neighbors when no valid vectors passing the filter are found. However, to prevent getting lost in the graph, this additional exploration is bounded. Numbers don’t lie Across multiple real-world datasets, this new filtering approach has delivered significant speed improvements . 
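As an illustration only — this is not the Lucene implementation — the ACORN-1-style decision described above (score only vectors that match the filter, and reach into the two-hop neighborhood when the immediate neighborhood is mostly filtered out) can be sketched like this:

```typescript
// Illustrative sketch only — not the Apache Lucene code. Score only neighbors that pass
// the filter; when too much of the immediate neighborhood is filtered out, extend into
// the neighbors' neighbors, bounded roughly as described above.
type Graph = Map<number, number[]>; // node id -> ids of its graph neighbors

function candidatesToScore(
  graph: Graph,
  node: number,
  passesFilter: (id: number) => boolean,
): number[] {
  const neighbors = graph.get(node) ?? [];
  const matching = neighbors.filter(passesFilter);
  const filteredRatio = 1 - matching.length / Math.max(neighbors.length, 1);

  // If 10% or less of the immediate neighborhood is filtered out, it is dense enough on its own.
  if (filteredRatio <= 0.1) return matching;

  // Otherwise explore the extended neighborhood, stopping once we have gathered roughly
  // neighborCount * 1 / (1 - filteredRatio) candidates, mirroring the bound quoted above.
  const budget = Math.ceil(neighbors.length / Math.max(1 - filteredRatio, 1e-6));
  const extended = new Set<number>(matching);
  for (const n of neighbors) {
    if (extended.size >= budget) break;
    for (const nn of graph.get(n) ?? []) {
      if (extended.size >= budget) break;
      if (nn !== node && passesFilter(nn)) extended.add(nn);
    }
  }
  return [...extended];
}
```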
Here is randomly filtering at 0.05% 1M Cohere vectors : Up and to the left is “winning”, which shows that the candidate is significantly better. Though, to achieve the same recall, search parameters (e.g. num_candidates ) need to be adjusted. To further investigate this reduction in improvement as more vectors pass the filter, we did another test over an 8M Cohere Wiki document data set . Generally, no matter the number of vectors filtered, you want higher recall, with fewer visited vectors. A simple way to quantify this is by examining the recall-to-visited ratio . Here we see how the new filtered search methodology achieves much better recall vs. visited ratio. It's clear that near 60%, the improvements level off or disappear. Consequently, in Lucene, this new algorithm will only be utilized when 40% or more of the vectors are filtered out. Even our nightly Lucene benchmarks saw an impressive improvement with this change. Apache Lucene runs over 8M 768 document vectors with a random filter that allows 5% of the vectors to pass. These kinds of graphs make me happy. Gotta go fast Filtering kNN search over metadata is key for real world use-cases. In Lucene 10.2, we have made it as much as 5x faster, using fewer resources, and keeping high recall. I am so excited about getting this in the hands of our users in a future Elasticsearch v9 release. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Why searching fewer docs is actually slower We gotta make this faster Numbers don’t lie Gotta go fast Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Filtered HNSW search, fast mode - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/filtered-hnsw-knn-search","meta_description":"Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans This series gives you an inside look at how we’re using generative AI in customer support. Join us as we share our journey in designing a GenAI chatbot interface. Inside Elastic IM By: Ian Moersen On August 9, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out part one , part two , part four , the launch blog , and part five . The idea of chatting via a web app has been around for a very long time. So you might think that means a GenAI chatbot would be a standard, boring interface to build. But it turns out an AI chatbot presents a few interesting and novel challenges. I’ll mention a few of them here, and hopefully, if you’re looking to build your own chat interface, you can use some of these tips and tricks to help you out. In my role as a UI designer, I like to make big deals about tiny things. Is the hex color for an avatar a shade too dark? I’m definitely complaining. Is the animation on this tooltip not eased properly? Let’s spend the time to track down the right bezier curves. No no, trust me, it’s definitely worth it. Is the font rendering slightly different on the new page? Oh yeah, you’re definitely going to be hearing about it from Ian. So when my team began work on a new automated Support Assistant, we had to decide: Do we pull a library off the shelf to handle the chat interface? Do we develop our own from scratch? For me, I hardly wanted to consider the former. Getting the small things right for our chatbot is a designer’s dream. Let’s do this. 1. Choosing the library So when I said “develop our own from scratch” earlier, I didn’t mean from scratch scratch. Sorry folks, this is 2024 AD, most people don’t develop UI components from scratch anymore. Many developers rely on component libraries to build new things, and at Elastic we’re no exception. Although we are pretty exceptional in one respect: We have our very own Elastic UI component library , and it’s free for anyone to use. EUI currently has no “ChatBot” component, but it does provide the avatars, “panels”, text areas, etc, one might need to create a nice little chat window. If you want to follow along with the rest of this post, feel free to open this sample EUI chat interface I made in another tab and you can give it a spin yourself. Have fun! 2. Animations... 
With some unlikely help After designing & assembling the major building blocks of the chat interface (which you can check out in the sandbox link above), one of our next challenges was how to keep users engaged during the sometimes-lengthy period of time the chatbot took to respond. To make matters worse, the first LLM endpoint we were using (for internal alpha-testing) wasn’t streaming its responses; it simply generated and sent the entire answer back to us in a single HTTP response body. This took forever. Not great. Action From -> to Approx. observed latency Initial request Client -> server 100 - 500ms RAG search Server -> cluster 1 - 2.5s Call to LLM Server -> LLM 1 - 2.5s First streamed byte LLM -> server -> client 3 - 6s Total 5.1 - 11.5 seconds Our first line of defense here was a compelling “loading” animation. I wanted something custom, interesting to look at, but also one that stuck very closely to Elastic’s overall brand guidelines. To that end, I decided to use Elastic’s existing EuiIcon component to display three dots, then use Elastic brand colors and EUI’s default animation bezier curves —those mathematical descriptions of how animations appear to accelerate and decelerate—to keep things feeling “Elastic” as it pulsed, blinked and changed colors. Choreographing bouncing, color-changing and opacity fading in CSS was a bit outside my comfort zone. So instead of spending a whole day guessing at values to use, it occurred to me I had someone I could ask sitting right in front of me. That’s right, I asked for (an early version of) the chatbot to program its own loading animation. It came up with something very nearly perfect on its first try. After a bit of fine-tuning and code refactoring, this was the result: (Bonus points if you can figure out which props to edit in the sandbox link above to see these loading dots yourself. All the code’s there!) This resulted in a pleasing little loading animation that I still enjoy looking at for a few seconds at a time; just what we needed! Now, whether a chatbot programming itself is existentially worrying?... That’s a question I’ll leave to the philosophers. But in my capacity as a web developer, I needed to focus on more practical matters. Like what we should do if the response from the LLM takes too long or drops entirely. 3. Killswitch engage Handling network timeouts and failures is pretty straightforward in most traditional web apps. Just check the error code of the response and handle those appropriately. Any additional handling for timeouts can be caught in a try/catch block or something similar. Generally, a typical HTTP fetch will know how to handle timeouts, which are usually configured to happen after a reasonably short amount of time and happen relatively rarely. The current state of generative AI API endpoints is not quite like that. Yes, occasionally, you’ll get a quick failure response with an error code, but remember that we’re streaming the LLM’s response here. Much more often than not, we receive a 200 OK from the API endpoint quickly, which tells us that the large language model is ready to begin streaming its response... But then it can take an exceedingly long time to receive any data at all. Or, partway through the stream, the trail goes cold, and the connection simply hangs. In either case, we didn’t want to rely on traditional network timeouts to give the user an option to retry their question. 
It is a much better user experience to have a short timeout on a failed attempt and then a quick, successful retry than a successful response that took way, way too long. So, after we found most failed streams will take over one minute to resolve, we went to work finding the shortest amount of time that would guarantee the stream was likely going to fail (or take an excessive amount of time to resolve). We kept cutting it shorter and shorter until we found that after only 10 seconds of radio silence, we could be nearly certain that the stream would either eventually fail or take longer than one minute to pick back up. Here’s some pseudocode illustrating the concept. It’s an example of the kind of code you might find in the primary function that calls a streaming LLM API after a user asks a question. By just being a little clever with AbortController signals and a setTimeout , you can implement a “killswitch” on the fetch() function to quickly return an error to the user if the stream goes dead for more than 10 seconds: So after solving these problems, and probably about a hundred others, it was time to focus on another challenge unique to site-wide generative AI interfaces: Context. 4. Chat history context While chatting with an AI assistant, you expect it to have the context of your previous messages. If you ask it to clarify its answer, for example, it needs to “remember” the question you asked it, as well as its own response. You can’t just send “Can you clarify that?” all by itself to the LLM and expect a useful response. Context, within a conversation, is straightforward to find and send. Just turn all previous chat messages into a JSON object and send it along with the latest question to the LLM endpoint. Although there may be a few smaller considerations to make—like how to serialize and store metadata or RAG results—it is comparatively uncomplicated. Here’s a bit of pseudocode illustrating how to enrich a default prompt with conversational context. But what about other types of context? For example: When you’re reading a support case and see a chat widget on the page, wouldn’t it make sense to ask the AI assistant “how long has this case been open?”. Well, to provide that answer, we’ll need to pass the support case itself as context to the LLM. But what if you’re reading that support case and one of the replies contains a term you don’t understand. It would make sense to ask the assistant to explain this highly technical term. Well for that, we’ll need to send a different context to the LLM (in our case, the results from a search of our knowledge base for that technical term). How do we convey to the user something as complex and unique as context, in order to orient them in the conversation? How can we also let users also choose which context to send? And, maybe hardest of all, how do we do all of this with such a limited number of pixels? After designing and evaluating quite a few options (breadcrumbs? a sticky alert bar within the chat window? tiny little badges??) we settled on a “prepended” element to the text input area. This keeps context right next to the “action item” it describes; context is attached only to your next question, not your last answer! 
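Circling back to the streaming killswitch: the pseudocode itself isn't reproduced here, but a rough reconstruction of the idea — abort the streaming fetch() whenever no bytes arrive for 10 seconds — looks something like this, with a placeholder endpoint and UI hooks:

```typescript
// Rough reconstruction of the killswitch described above — not the original pseudocode.
// "/api/chat" and the two UI hooks are placeholders for whatever your app actually uses.
declare function renderChunk(text: string): void;
declare function showRetryPrompt(): void;

async function askAssistant(question: string): Promise<void> {
  const controller = new AbortController();
  let killswitch = setTimeout(() => controller.abort(), 10_000);
  const resetKillswitch = () => {
    clearTimeout(killswitch);
    killswitch = setTimeout(() => controller.abort(), 10_000);
  };

  try {
    const response = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ question }),
      signal: controller.signal,
    });

    const reader = response.body!.getReader();
    const decoder = new TextDecoder();
    while (true) {
      const { value, done } = await reader.read();
      if (done) break;
      resetKillswitch(); // a chunk arrived: restart the 10-second countdown
      if (value) renderChunk(decoder.decode(value, { stream: true }));
    }
  } catch (err) {
    // An abort (or any network failure) lands here: offer a quick retry instead of hanging.
    showRetryPrompt();
  } finally {
    clearTimeout(killswitch);
  }
}
```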
UI Element Pros Cons Breadcrumbs Small footprint, easy to interact with Better for representing URLs and paths Banner at top Out of the way, allows for long description Not easy to interact with, can get lost Micro-badges Easy to display multiple contexts Difficult to edit context Prepended menu w/number badge Close to input field, easy to interact with Tight squeeze in the space available Additionally, an EUI context menu can be used to allow power users to edit their context. Let’s say you want to ask the Assistant something that would require both the case history and a thorough search of the Elastic knowledge base; those are two very different contexts. “How do I implement the changes the Elastic engineer is asking me to make?” for example. You could then use the context menu to ensure both sources of information are being used for the Assistant’s response. This also gives us more flexibility. If we want the LLM itself to determine context after each question, for example, we’d be able to display that to the user easily, and the little pink notification badge could alert users if there are any updates. These were just a handful of the many smaller problems we needed to solve while developing our GenAI Support Assistant interface. Even though it seems like everyone’s releasing a chatbot these days, I hadn’t seen many breakdowns of real-life problems one might encounter while engineering the interfaces and experiences. Building a frictionless interface with emphasis on making streaming experiences feel snappy, making affordances for unexpected timeouts and designing for complex concepts like chat context with only a few pixels to spare are just a few of the problems we needed to solve. Implementing an AI chatbot naturally puts the bulk of the engineering focus on the LLM and backend services. However, it’s important to keep in mind that the UX/UI components of a new tool will require adequate time and attention as well. Even though we’re building a generation of products that use AI technology, it’s always going to be important to design for humans. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Jump to 1. Choosing the library 2. Animations... With some unlikely help 3. Killswitch engage 4. 
Chat history context Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... for humans - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/genai-elastic-elser-chat-interface","meta_description":"Discover how we designed a GenAI chatbot interface for customer support. Learn about AI chat interfaces​, animations, chat history context, handling timeouts, and more."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. AutoOps How To SF By: Sachin Frayne On November 20, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. There are a number of ways that hotspotting can occur in an Elasticsearch cluster. Some we can control, like noisy neighbors, and some we have less control over, like the shard allocation algorithm in Elasticsearch. The good news is that the new desired_balance cluster.routing.allocation.type algorithm (see shards-rebalancing-heuristics ) is much better at determining which nodes in the cluster should get the new shards. If there is an imbalance present, it will figure out the optimal balance for us. The bad news is that older Elasticsearch clusters are still using the balanced allocation algorithm which has a more limited calculation that is prone to making mistakes when choosing nodes that can lead to imbalanced or hotspotted clusters. In this blog we will explore this old algorithm, how it is supposed to work and when it does not work, and what we can do to address it. We will then go through the new algorithm and how it solves this problem, and finally we will look at how we used AutoOps to highlight this issue for a customer use case. We will however not go into all the causes for hotspotting, nor will we go into all the specific solutions as they are quite numerous. 
Balanced allocation

In Elasticsearch 8.5 and earlier, the following method determined which node should receive a shard; it mostly came down to choosing the node with the fewest shards: https://github.com/elastic/elasticsearch/blob/8.5/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java#L242

node.numShards() : the number of shards allocated to a specific node in the cluster
balancer.avgShardsPerNode() : the mean number of shards across all the nodes in the cluster
node.numShards(index) : the number of shards for a specific index allocated to a specific node in the cluster
balancer.avgShardsPerNode(index) : the mean number of shards for a specific index across all the nodes in the cluster
theta0 : ( cluster.routing.allocation.balance.shard ) weight factor for the total number of shards, defaults to 0.45f; increasing this raises the tendency to equalise the number of shards per node (see Shard balancing heuristics settings)
theta1 : ( cluster.routing.allocation.balance.index ) weight factor for the total number of shards per index, defaults to 0.55f; increasing this raises the tendency to equalise the number of shards per index per node (see Shard balancing heuristics settings)

The target of this algorithm is to pick a node in such a way that the weight across all the nodes in the cluster gets back to 0, or as close to 0 as possible.

Example

Let's explore a situation where we have 2 nodes and 1 index made up of 3 primary shards, and let's assume we have 1 shard on node 1 and 2 shards on node 2. What should happen when we add a new index to the cluster with 1 shard?

weightNode1 = 0.45f(1 - 1.5) + 0.55f(0 - 0) = -0.225
weightNode2 = 0.45f(2 - 1.5) + 0.55f(0 - 0) = 0.225

Since the new index has no shards anywhere else in the cluster, the weightIndex term reduces to 0. As we can see in the next calculation, adding the shard to node 1 brings the balance back to 0, so we choose node 1.

weightNode1 = 0.45f(2 - 2) + 0.55f(0 - 0) = 0
weightNode2 = 0.45f(2 - 2) + 0.55f(0 - 0) = 0

Now let's add another index with 2 shards. The first shard will go randomly to one of the nodes since we are now balanced; assuming node 1 was chosen for the first shard, the second shard will go to node 2.
weightNode1 = 0.45f(3 - 2.5) + 0.55f(1 - 0.5) = 0.5
weightNode2 = 0.45f(2 - 2.5) + 0.55f(0 - 0.5) = -0.5

The new balance will finally be:

weightNode1 = 0.45f(3 - 3) + 0.55f(0 - 0) = 0
weightNode2 = 0.45f(3 - 3) + 0.55f(0 - 0) = 0

This algorithm works well if all indices and shards in the cluster are doing approximately the same amount of work in terms of ingest, search and storage requirements. In reality, most Elasticsearch use cases are not this simple, and the load across shards is not always the same. Imagine the following scenario.

Image 1: Elasticsearch cluster (the exaggerated size of the shards represents how "busy" the shards actually are)

Index 1: a small search use case with a few thousand documents and an incorrect number of shards.
Index 2: a very large index, not actively written to, with occasional searching.
Index 3: light indexing and searching.
Index 4: heavy ingest of application logs.

Let's suppose we have 3 nodes and 4 indices with only primary shards, deliberately in an unbalanced state. To visually understand what is going on, the size of the shards is exaggerated according to how busy they are, where busy could mean write, read, CPU, RAM or storage. Even though node 3 already has the busiest index, new shards will still route to that node. Index lifecycle management (ILM) won't solve this situation for us: when the index is rolled over, the new shards will be placed on node 3. We could manually ease this problem by forcing Elasticsearch to spread the shards evenly using cluster reroute, but this does not scale; our distributed system should take care of this. Still, without a rebalance or other kind of intervention, this situation will remain and potentially get worse. What's more, while this example is contrived, this kind of distribution is inevitable in older Elasticsearch clusters with mixed use cases (i.e., search, logging, security), especially when one or more of the use cases is heavy ingest, and determining when it will occur is not trivial. While the timeframe to predict this issue is complicated, a good solution that works well in some use cases is to keep the shard density across all indices the same; this is achieved by rolling all indices when their shards reach a predetermined size in gigabytes (see size your shards). This does not work in all use cases, as we will see in the cluster caught by AutoOps below.
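To make the arithmetic above concrete, here is a small sketch of the weight function those numbers imply, using the default theta values quoted earlier; it reproduces the first step of the worked example:

```typescript
// A small sketch of the pre-8.6 weight function implied by the arithmetic above,
// using the default theta values quoted earlier (0.45 and 0.55).
function weight(
  nodeShards: number,
  avgShards: number,
  nodeIndexShards: number,
  avgIndexShards: number,
  theta0 = 0.45,
  theta1 = 0.55,
): number {
  return theta0 * (nodeShards - avgShards) + theta1 * (nodeIndexShards - avgIndexShards);
}

// First step of the worked example: 3 shards across 2 nodes, then a new single-shard index.
console.log(weight(1, 1.5, 0, 0)); // -0.225 -> node 1 has the lower weight and gets the shard
console.log(weight(2, 1.5, 0, 0)); //  0.225
```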
Desired balance allocation To address this issue and a few others, a new algorithm that can take into account both write load and disk usage was initially released in 8.6 and underwent some minor, yet meaningful, changes in versions 8.7 and 8.8: https://github.com/elastic/elasticsearch/blob/8.8/server/src/main/java/org/elasticsearch/cluster/routing/allocation/allocator/BalancedShardsAllocator.java#L305 node.writeLoad() : the write or indexing load of a specific node balancer.avgWriteLoadPerNode() : the mean write load across the cluster node.diskUsageInBytes() : the disk usage for a specific node balancer.avgDiskUsageInBytesPerNode() : the mean disk usage across the cluster theta2 : ( cluster.routing.allocation.balance.write_load ) weight factor for the write load, defaults to 10.0f, increasing this raises the tendency to equalise the write load per node, (see Shard balancing heuristics settings ) theta3 : ( cluster.routing.allocation.balance.disk_usage ) weight factor for the disk usage, defaults to 2e-11f, increasing this raises the tendency to equalise the disk usage per node, (see Shard balancing heuristics settings ) I will not go into detail in this blog on the calculations that this algorithm is doing, however the data that is used by Elasticsearch to decide where the shards should live is available via an API: Get desired balance . It is still a best practice to follow our guidance when you size your shards and there are still good reasons to separate out use cases into dedicated Elasticsearch clusters. Yet this algorithm is much better at balancing Elasticsearch, so much so that it resolved the balancing issues for our customer below. (If you are facing the issue described in this blog, I recommend that you upgrade to 8.8). One final thing to note, this algorithm does not take into account search load, this is not trivial to measure and even harder to predict. Adaptive replica selection , introduced in 6.1, goes a long way to addressing search load. In a future blog we will dive deeper into the topic of search performance and specifically how we can use AutoOps to catch our search performance issues before they occur. Detecting hotspotting in AutoOps Not only is the situation described above difficult to predict, but when it occurs it used to be difficult to detect, it takes a lot of internal understanding of Elasticsearch and very specific conditions for our clusters to end up in this state. Now with AutoOps detecting this issue is a cinch. Let’s see a real world example; In this setup there is a queueing mechanism in front of Elasticsearch for spikes in the data, however the use case is near real time logs - sustained lag is not acceptable. We had a situation with sustained lag that we had to troubleshoot. Starting in the cluster view we pick up some useful information, in the image below we learn that there are 3 master nodes, 8 data nodes (and 3 other nodes that are not interesting to the case). We also learn that the cluster is red, (this could be networking or performance issues), the version is 8.5.1 and there are 6355 shards; these last 2 will become important later. Image 2: Cluster Info There is a lot going on in this cluster, it is going red a lot, these are related to the nodes leaving the cluster. The nodes are leaving the cluster around the time that we observe indexing rejections and the rejections are happening shortly after the indexing queues are getting filled up too frequently, the darker the yellow, the more high indexing events in the time block. 
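Circling back to the Get desired balance API mentioned above, the allocator's target placement can be inspected directly; this is a hedged sketch against the internal endpoint, with a placeholder URL and credentials, and the response shape may differ by version:

```typescript
// Hedged sketch: read the allocator's target placement from the internal desired-balance
// endpoint (8.6+). URL and credentials are placeholders, and the response shape may vary.
const res = await fetch("https://localhost:9200/_internal/desired_balance", {
  headers: { Authorization: "ApiKey <api-key>" },
});
const desiredBalance = await res.json();
// Entries describe where each shard should live, alongside the node-level write load and
// disk usage statistics that feed the new weight function.
console.log(JSON.stringify(desiredBalance, null, 2));
```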
Image 3: Timeline of events in the cluster, (importantly let’s highlight Data Node Disconnected) Moving to the node view and focusing in on the timeframe around the last node disconnect we can see that another node, node 9, has a much higher indexing rate than the rest of the nodes, followed by the second highest indexing rate observed on node 4, which has had some disconnects earlier in the month. You will also notice that there is a fairly large drop in indexing rate around the same timeframe, this was in fact also related to intermittent latency in this particular cluster between the compute resources and the storage. Image 4: Data node 9, high indexing rate. AutoOps by default will only report nodes that disconnect for more than 300 seconds, but we know that other nodes including node 9 are frequently leaving the cluster, as can be seen in the image below, the number of shards on the node are growing too fast to be moving shards, therefore they must be re-initialising after a node disconnect/restart. With these pieces of information we can safely conclude that the cluster is experiencing a performance issue, but not only that it is a hotspotting performance issue. Since Elasticsearch works as a cluster, it can only work as fast as its slowest node and since node 9 is being asked to do more work than the other nodes and it can’t keep up, the other nodes are always waiting for it and are occasionally getting disconnected themselves. Image 5: Data node 9, number of shards increasing. We do not need more information at this point, but to further illustrate the power of AutoOps below is another image which shows how much more work node 9 is doing than the other nodes, specifically how much data it is writing to disk. Image 6: Disk write and IOPS. We decided to move all the shards off of node 9, by randomly sending them to the rest of the nodes in the cluster; this was achieved with the following command. After this the indexing performance of the whole cluster improved and the lag disappeared. Now that we have observed, confirmed and circumvented the issue, we need to find a long term solution to the problem, which brings us back to the technical analysis at the beginning of the blog. The best practices were being followed, the shards rolled at a predetermined size and we were even limiting the number of shards for a specific index per node. We hit an edge case that the algorithm could not deal with, heavy index and frequently rolled indices. We thought about whether we could rebalance the cluster manually, but with around 2000 indices made up of 6355 shards, this was not going to be trivial, not to mention, with this level of indexing we would be racing against ILM to rebalance. This is exactly what the new algorithm was designed for and so our final recommendation is to upgrade the cluster. Final thoughts This blog is a summary of a fairly specific but complicated set of circumstances that can cause a problem with Elasticsearch performance. You may even see some of these issues in your cluster today but may never get into a position where your cluster is affected as badly as this user was. This case underscores the importance of keeping up with the latest versions of Elasticsearch to consistently take advantage of the latest innovations in managing data better and it helps to showcase the power of AutoOps in finding/diagnosing and alerting us to issues, before they become full production incidents. 
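The exact command used to drain node 9 isn't reproduced here. One common way to achieve the same effect is an allocation exclude setting, sketched below with a placeholder node name; explicit cluster reroute commands are another option:

```typescript
// Hedged sketch of one way to drain a hot node: exclude it from allocation and let the
// cluster relocate its shards. "node-9" and the credentials are placeholders.
await fetch("https://localhost:9200/_cluster/settings", {
  method: "PUT",
  headers: { "Content-Type": "application/json", Authorization: "ApiKey <api-key>" },
  body: JSON.stringify({
    transient: { "cluster.routing.allocation.exclude._name": "node-9" },
  }),
});
// Once the cluster has rebalanced, clear the exclusion by setting the value back to null.
```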
Thinking about migrating to at least version 8.8 https://www.elastic.co/guide/en/elasticsearch/reference/8.8/migrating-8.8.html Report an issue Related content AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Balanced allocation Example Desired balance allocation Detecting hotspotting in AutoOps Final thoughts Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Hotspotting in Elasticsearch and how to resolve them with AutoOps - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/hotspot-elasticsearch-autoops","meta_description":"Explore hotspotting in Elasticsearch and how to resolve it using AutoOps.\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. Search Relevance IT By: Ioana Tagirta On April 16, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
Search with ES|QL With Elasticsearch 8.18 and 9.0, ES|QL adds a host of new functionalities, including: support for scoring semantic search more configuration options for the match function a new KQL function In this blog, we will review these 8.18 features and other exciting new features that we plan to add to ES|QL, reinforcing our investment in making ES|QL a modern search language ready to fit your needs, whether you are building a search application powered by ES|QL or analyzing your data in Kibana Discover. Introducing scoring In 8.17 we added the ability to filter documents using full text functions. If you are unfamiliar with full text filtering in ES|QL, we suggest reading our original blog post about it. With 8.18 and 9.0 we introduce support for scoring , making it possible to return documents in order of their relevance. To access the score for each document, simply add the metadata _score field to your ES|QL query: We retrieve the same scores we get from the equivalent search API query: Full text search functions such as match , qstr and kql can only be used in the context of a WHERE condition and are the only ones that contribute to the score. The _score column can not only be used to sort documents by relevance, but also in custom scoring formulas. In the next example, we keep only the most relevant results using a score threshold and then add a score boost based on the reader rating: Improving the match function In ES|QL, the match function simply translates to a Query DSL match query . In 8.18 and 9.0, we expanded the match function's capabilities to include all options that are currently available in Query DSL. It is now possible to set well-known match options such as boost, fuzziness and operator in ES|QL too: Enter semantic search The 8.18 release comes with the exciting announcement that semantic search is now generally available. We've expanded the match function to support querying over semantic_text field types. In ES|QL, executing a semantic query is now as simple as performing a full-text query, as shown in this example: In this example, we set semantic_title to use the semantic_text field type. Mapping your index fields as semantic_text is all it takes to set up your index for semantic search. Check our search with semantic text tutorial for more details. Hybrid search with ES|QL ES|QL makes it straightforward to do both semantic and lexical search at the same time. It is also possible to set different boosts, prioritizing results from semantic search or lexical search, depending on your use case: Transitioning from KQL If you are a long-term user of Kibana Discover and use KQL ( Kibana Query Language ) to query and visualize your data and you'd like to try ES|QL but don't know where to start, don't worry, we got you! In 8.18 and 9.0, ES|QL adds a new function which allows you to use KQL inside ES|QL . This is as simple as: ES|QL is already available in Kibana Discover. This way, you get the best of both worlds: you can continue to use KQL and start getting more familiar with ES|QL at your own pace. Check out our getting started with ES|QL guide for more information. Beyond 8.18 and 9.0 In future releases, we'll be adding more and more search capabilities to ES|QL, including vector search, semantic reranking, enhanced score customization options, and additional methods for combining hybrid search results, such as Reciprocal Rank Fusion (RRF). 
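The query listings from this post are not included in this extract. The sketch below gives a rough idea of what the features described above look like, using a hypothetical books index with a title text field, a semantic_title semantic_text field and a year field; the options map passed to match follows the 8.18 syntax as best as it can be reconstructed here, so check the ES|QL reference for the exact form:

```
# Scoring: expose _score via METADATA and sort by relevance.
POST _query
{
  "query": """
    FROM books METADATA _score
    | WHERE match(title, "mountaineering expedition", {"fuzziness": "AUTO"})
    | SORT _score DESC
    | LIMIT 10
  """
}

# Hybrid search: combine semantic and lexical matches with different boosts.
POST _query
{
  "query": """
    FROM books METADATA _score
    | WHERE match(semantic_title, "stories about climbing mountains", {"boost": 0.75})
        OR match(title, "mountaineering", {"boost": 0.25})
    | SORT _score DESC
    | LIMIT 10
  """
}

# KQL inside ES|QL, handy when migrating saved Kibana queries.
POST _query
{
  "query": """
    FROM books
    | WHERE kql("title: mountain* and year >= 2000")
    | LIMIT 10
  """
}
```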
Try it out yourself These changes are available starting with Elasticsearch 8.18, but they are already available in Elasticsearch Serverless. For Elasticsearch Serverless, start a free trial cloud today or try Elastic on your local machine now! Follow the Search and filter in ES|QL tutorial for a hands-on introduction to the features described in this blog post! Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Search with ES|QL Introducing scoring Improving the match function Enter semantic search Hybrid search with ES|QL Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ES|QL, you know, for Search - Introducing scoring and semantic search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-introducing-scoring-semantic-search","meta_description":"With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Elasticsearch in JavaScript the proper way Learn how to create a production-ready Elasticsearch backend in JavaScript, follow best practices, and run the Elasticsearch Node.js client in Serverless environments. Part1 Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Part2 Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch in JavaScript the proper way - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/elasticsearch-in-javascript","meta_description":"Learn how to create a production-ready Elasticsearch backend in JavaScript, follow best practices, and run the Elasticsearch Node.js client in Serverless environments.\n\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Semantic Text: Simpler, better, leaner, stronger Our latest semantic_text iteration brings a host of improvements. In addition to streamlining representation in _source, benefits include reduced verbosity, more efficient disk utilization, and better integration with other Elasticsearch features. You can now use highlighting to retrieve the chunks most relevant to your query. And perhaps best of all, it is now a generally available (GA) feature! Vector Database MP By: Mike Pellegrini On March 13, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. We’ve gone on quite a journey with the semantic_text field type, and this latest iteration promises to make semantic search simpler than ever. In addition to streamlining semantic_text representation in _source , benefits include reduced verbosity, more efficient disk utilization, and better integration with other Elasticsearch features. You can now use highlighting to retrieve the chunks most relevant to your query. And perhaps best of all, it is now a generally available (GA) feature! Semantic search evolution Our approach to semantic search has evolved over the years, with the goal of making it as simple as possible. Prior to the semantic_text field type, performing semantic search required manually: Configuring your mappings to be compatible with your embeddings. Configuring an ingest pipeline to use a ML model to generate embeddings. Using the pipeline to ingest your docs. Using the same ML model at query time to generate embeddings for your query text. We called this setup “easy” at the time , but we knew we could make it far simpler than this. Enter semantic_text . In the beginning We introduced semantic_text in Elasticsearch 8.15 with the goal of simplifying semantic search. If semantic_text is new to you, we suggest reading our original blog post about it first for background about our approach. 
We released semantic_text as a beta feature first for a good reason. It’s a well-known truism in software development that making something simple can be quite difficult, and semantic_text is no exception. There are a lot of moving pieces behind the scenes that enable the magical semantic_text experience. We wanted to take the time to make sure we had those pieces right before moving the feature out of beta. That time was well spent: We iterated on our original approach, adding features and streamlining storage to create a simpler, leaner version of semantic_text that is more supportable in the long-term. Our original implementation relied on modifying _source to store inference results. This meant that semantic_text fields had a relatively complex subfield structure: This structure created a few problems: It was needlessly verbose. In addition to the original value, it contained metadata and chunk information, which made API responses hard to read and larger than necessary. It increased index sizes on disk. Embeddings, which can be quite large, were effectively being stored twice: once in the Lucene index for retrieval purposes and again in _source . This significantly impacted the ability to use semantic_text at scale for larger datasets. It was unintuitive to manage. The original value provided was under the text subfield, which meant special handling was required to get this value for follow-up actions. This meant that semantic_text field values didn’t act like other field values in the text family, which had numerous knock-on effects that complicated our efforts to integrate it into higher-level workflows. Semantic text as text Our revised implementation elegantly improves on those friction points with a focused simplification in approach to how we represent semantic_text in _source . Instead of using a complex subfield structure to store metadata and chunk information directly within the semantic_text field, we use a hidden metafield for this purpose. This means we no longer need to modify _source to store inference results. In practical terms, it means that the document _source that you provide to us for indexing is the same _source that you will get back upon document retrieval. Notice that there are no longer subfields like text or inference in the _source representation. Instead, the _source is as you provided it. So much simpler! 🚨 Note that if you parse semantic_text field values returned in search results or Get APIs , this is a breaking change. That is to say, if you parse the infer_field.text subfield value, you will need to update your code to instead parse the infer_field value. We try our best to avoid breaking changes, but this was an unavoidable side-effect of removing the subfield structure from _source . There are numerous benefits to this _source representation simplification: Simpler to work with. You no longer need to parse a complex subfield structure to get the original text value, you can just take the field value as the original value. Less verbose. Metadata and chunk information does not clutter up API responses. More efficient disk utilization. Embeddings are no longer stored in _source . Better integration. It allows semantic_text to integrate better with other Elasticsearch features, such as multi-fields, partial document updates, and reindexing. Let’s expand on that last point a bit because it covers a few areas. 
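(Before expanding on those integrations, here is a minimal sketch of the GA behaviour just described; the index name, field name and inference endpoint id are placeholders, and you would substitute an inference endpoint that exists in your cluster.)

```
PUT my-index
{
  "mappings": {
    "properties": {
      "infer_field": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}

PUT my-index/_doc/1
{
  "infer_field": "These are not the droids you're looking for."
}

# The document comes back exactly as it was provided: no text or inference
# subfields are added to _source.
GET my-index/_doc/1
```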
With this simplification, semantic_text fields can now be used as the source and target of multi-fields : Semantic_text fields now also support partial document updates through the Bulk API : And you can now reindex into a semantic_text field that uses a different inference_id : Semantic highlighting One of the most requested semantic_text features is the ability to retrieve the most relevant chunks within a semantic_text field. This functionality is critical for RAG use cases. Up until now, we have (unofficially) accommodated this with some hacky workarounds involving inner_hits . However, we are retiring inner_hits in favor of a more streamlined solution: highlighting. Highlighting is a well-known lexical search technique one can apply to text fields. As a member of the text field family, it only makes sense to adapt the technique for semantic_text . To this end, we have added a semantic highlighter that you can use to retrieve the chunks that are most relevant to your query: See the semantic_text documentation for more information about how to use highlighting. Ready for primetime With the _source representation change in place, we are now officially announcing that semantic_text is a generally available (GA) feature 🎉! This means that we are committed to not making any more breaking changes to the feature and supporting it in production environments. As a customer, you should feel comfortable integrating semantic_text into your production workflows knowing that Elastic is committed to supporting you and providing long-term continuity. Migrating from beta To enable an orderly migration from the beta implementation, all indices with semantic_text fields created in Elasticsearch 8.15 to 8.17 or created in Serverless prior to January 30th will continue to operate as they do today. That is to say, they will continue to use the beta _source representation . We recommend migrating to the GA _source representation at your earliest convenience. You can do so by reindexing into a new index: Note the use of the script param to account for the _source representation change. The script is taking the value from the text subfield and assigning it directly to the semantic_text field value. Try it out yourself These changes will be available in stack hosted Elasticsearch 8.18+, but if you want to try them today, they are already available in Serverless. They also pair well with semantic search simplifications we are rolling out at the same time. Use both to take semantic search to the next level! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Semantic search evolution In the beginning Semantic text as text Semantic highlighting Ready for primetime Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Semantic Text: Simpler, better, leaner, stronger - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/semantic-text-ga","meta_description":"he Elasticsearch semantic_text field type is now GA. Explore the latest improvements: semantic highlighting, simpler representation in _source and more."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. Vector Database AL By: Andre Luiz On May 13, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. The use of embeddings to improve the relevance and accuracy of information retrieval has grown significantly over the years. Tools like Elasticsearch have evolved to support this type of data through specialized field types such as dense vectors, sparse vectors, and semantic text. However, to achieve good results, it is essential to understand how to properly map embeddings to the available Elasticsearch field types: semantic_text, dense_vector, and sparse_vector. In this article, we will discuss these field types, when to use each one, and how they relate to embedding generation and usage strategies, both during indexing and querying. Dense vector type The dense_vector field type in Elasticsearch is used to store dense vectors, which are numerical representations of text where almost all dimensions are relevant. These vectors are generated by language models, such as OpenAI, Cohere, and Hugging Face, and are designed to capture the overall semantic meaning of a text, even when it does not share exact terms with other documents. In Elasticsearch, dense vectors can have up to 4096 dimensions depending on the model used. For example, the all-MiniLM-L6-v2 model generates vectors with 384 dimensions, while OpenAI’s text-embedding-ada-002 produces vectors with 1536 dimensions. 
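As a rough illustration of the points above, mapping a dense_vector field sized for all-MiniLM-L6-v2 embeddings and running a kNN search over it might look like the sketch below; the index and field names are placeholders, and in practice the query vector would be produced by the same model that generated the document vectors:

```
PUT movies
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "title_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}

POST movies/_search
{
  "knn": {
    "field": "title_vector",
    // Truncated for readability: the vector must contain all 384 values.
    "query_vector": [0.12, -0.45, 0.07],
    "k": 10,
    "num_candidates": 100
  },
  "_source": ["title"]
}
```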
The dense_vector field is commonly adopted as the default type for storing this kind of embedding when greater control is needed, such as using pre-generated vectors, applying custom similarity functions, or integrating with external models. When and why to use dense_vector type? Dense vectors are excellent for capturing semantic similarity between sentences, paragraphs, or entire documents. They work very well when the goal is to compare the overall meaning of texts, even if they do not share the same terms. The dense vector field is ideal when you already have an external embedding generation pipeline using models such as OpenAI, Cohere, or Hugging Face and only want to store and query these vectors manually. This type of field offers high compatibility with embedding models and full flexibility in generation and querying, allowing you to control how the vectors are produced, indexed, and used during search. In addition, it supports different forms of semantic search, with queries such as KNN or script_score for cases where it is necessary to adjust the ranking logic. These possibilities make the dense vector ideal for applications such as RAG (Retrieval-Augmented Generation), recommendation systems, and personalized searches based on similarity. Finally, the field allows you to customize the relevance logic, using functions such as cosineSimilarity, dotProduct or l2norm to adapt the ranking according to the needs of your use case. Dense vectors remain the best option for those who need flexibility, customization, and compatibility with advanced use cases like the ones mentioned above. How to use the query for dense vector type? Searches on fields defined as dense_vector use the k-nearest neighbor query. This query is responsible for finding documents whose dense vector is closest to the query vector. Below is an example of how to apply a Knn query to a dense vector field: In addition to the Knn query, if there is a need to customize the document scoring, it is also possible to use the script_score query, combining it with vector comparison functions such as cosineSimilarity, dotProduct, or l2norm to calculate relevance in a more controlled way. See the example: If you want to dive deeper, I recommend exploring the article How to set up vector search in Elasticsearch. Sparse vector type The sparse_vector field type is used to store sparse vectors, which are numerical representations where most values are zero and only a few terms have significant weights. This type of vector is common in term-based models such as SPLADE or ELSER (Elastic Learned Sparse EncodeR). When and why to use sparse vector type? Sparse vectors are ideal when you need a more precise search in lexical terms, without sacrificing semantic intelligence. They represent the text as token/value pairs, highlighting only the most relevant terms with associated weights, which provides clarity, control and efficiency. This type of field is especially useful when you generate vectors based on terms, such as in the ELSER or SPLADE models, which assign different weights to each token based on its relative importance in the text. For the occasions when you want to control the influence of specific words in the query, sparse vector types allow you to manually adjust the weight of the terms to optimize the ranking of the results. 
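To make the token/weight representation concrete, the sketch below maps a sparse_vector field, indexes a document with invented token weights, and queries it both with explicit token/value pairs and via an inference endpoint; all names and weights are placeholders for illustration:

```
PUT technical-articles
{
  "mappings": {
    "properties": {
      "content": { "type": "text" },
      "content_embedding": { "type": "sparse_vector" }
    }
  }
}

# Only tokens with a non-zero weight are stored.
PUT technical-articles/_doc/1
{
  "content": "How to tune garbage collection in the JVM",
  "content_embedding": {
    "jvm": 2.1,
    "garbage": 1.8,
    "collection": 1.7,
    "memory": 0.9,
    "tuning": 0.8
  }
}

# Query with explicit token/weight pairs...
POST technical-articles/_search
{
  "query": {
    "sparse_vector": {
      "field": "content_embedding",
      "query_vector": { "jvm": 1.5, "heap": 1.2, "memory": 0.6 }
    }
  }
}

# ...or let an inference endpoint (for example ELSER) expand the query text.
POST technical-articles/_search
{
  "query": {
    "sparse_vector": {
      "field": "content_embedding",
      "inference_id": "my-elser-endpoint",
      "query": "jvm memory tuning"
    }
  }
}
```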
Among the main benefits are transparency in the search since it is possible to clearly understand why a document was considered relevant, and storage efficiency since only tokens with a non-zero value are saved, unlike dense vectors that store all dimensions. Furthermore, sparse vectors are the ideal complement in hybrid search strategies, and can even be combined with dense vectors to combine lexical precision with semantic understanding. How to use the query for sparse vector type? The sparse_vector query allows you to search for documents based on a query vector in token/value format. See an example of the query below: If you prefer to use a trained model, it is possible to use an inference endpoint that automatically transforms the query text into a sparse vector: To explore this topic further, I suggest reading Understanding sparse vector embeddings with trained ML models . Semantic text type The semantic_text field type is the simplest and most straightforward way to use semantic search in Elasticsearch. It automatically handles embedding generation, both at indexing and query time, through an inference endpoint. This means you don’t have to worry about generating or storing vectors manually. When and why to use semantic text? The semantic_text field is is ideal for those who want to get started with minimal technical effort and without having to handle vectors manually. This field automates steps like embedding generation and vector search mapping, making the setup faster and more convenient. You should consider using semantic_text when you value simplicity and abstraction , as it eliminates the complexity of manually configuring mappings, embedding generation, and ingestion pipelines . Just select the inference model, and Elasticsearch takes care of the rest. Key advantages include automatic embedding generation, performed during both indexing and querying, and ready-to-use mapping , which comes preconfigured to support the selected inference model. In addition, the field offers native support for automatic splitting of long texts (text chunking) , allowing large texts to be divided into smaller passages, each with its own embedding, which improves search precision. This greatly boosts productivity, especially for teams that want to deliver value quickly without dealing with the underlying engineering of semantic search. However, while semantic_text provides speed and simplicity, this approach has some limitations. It allows the use of market standard models, as long as they are available as inference endpoints in Elasticsearch. But it does not support externally generated embeddings , as is possible with the dense_vector field. If you need more control over how vectors are generated, want to use your own embeddings, or need to combine multiple fields for advanced strategies, the dense_vector and sparse_vector fields provide the flexibility required for more customized or domain-specific scenarios. How to use the query for semantic text type Before semantic_text , it was necessary to use a different query depending on the type of embedding (dense or sparse). A sparse_vector query was used for sparse fields, while dense_vector fields required KNN queries. With the semantic text type, the search is performed using the semantic query , which automatically generates the query vector and compares it with the embeddings of the indexed documents. 
The semantic_text type allows you to define an inference endpoint to embed the query, but if none is specified, the same endpoint used during indexing will be applied to the query. To learn more, I suggest reading the article Elasticsearch new semantic_text mapping: Simplifying semantic search . Conclusion When choosing how to map embeddings in Elasticsearch, it is essential to understand how you want to generate the vectors and what level of control you need over them. If you are looking for simplicity, the semantic text field enables automatic and scalable semantic search, making it ideal for many initial use cases. When more control, fine-tuned performance, or integration with custom models is required, the dense vector and sparse vector fields provide the necessary flexibility. The ideal field type depends on your use case, available infrastructure, and the maturity of your machine learning stack. Most importantly, Elastic offers the tools to build modern and highly adaptable search systems. References Semantic text field type Sparse vector field type Dense vector field type Semantic query Sparse vector query kNN search Elasticsearch new semantic_text mapping: Simplifying semantic search Understanding sparse vector embeddings with trained ML models Report an issue Related content Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Jump to Dense vector type When and why to use dense_vector type? How to use the query for dense vector type? Sparse vector type When and why to use sparse vector type? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/mapping-embeddings-to-elasticsearch-field-types","meta_description":"Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Improving information retrieval in the Elastic Stack This series explores steps to improve search relevance, benchmarking passage retrieval, ELSER, and hybrid retrieval. Part1 Generative AI July 13, 2023 Improving information retrieval in the Elastic Stack: Steps to improve search relevance In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval. GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey Part2 Generative AI July 13, 2023 Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models. GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey Part3 ML Research June 21, 2023 Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model Learn about the Elastic Learned Sparse Encoder (ELSER), its retrieval performance, architecture, and training process. TV QH By: Thomas Veasey and Quentin Herreros Part4 Generative AI July 20, 2023 Improving information retrieval in the Elastic Stack: Hybrid retrieval In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores. QH TV By: Quentin Herreros and Thomas Veasey Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Improving information retrieval in the Elastic Stack - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/improving-information-retrieval-in-the-elastic-stack","meta_description":"This series explores steps to improve search relevance, benchmarking passage retrieval, ELSER, and hybrid retrieval.\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. Vector Database How To SF JG By: Sachin Frayne and Jessica Garson On April 23, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search provides the foundation when implementing semantic search for text or similarity search for images, videos, or audio. With vector search, the vectors are mathematical representations of data that can be huge and sometimes sluggish. Better Binary Quantization (hereafter referred to as BBQ) works as a compression method for vectors. It allows you to find the right matches while shrinking the vectors to make them faster to search and process. This article will cover BBQ and rescore_vector, a field only available for quantized indices that automatically rescores vectors. All complete queries and outputs mentioned in this article can be found in our Elasticsearch Labs code repository . Why would you implement this in your use case? Note: for an in-depth understanding of how the mathematics behind BBQ works, please check out the “Further learning” section below. For the purposes of this blog, the focus is on the implementation. While the mathematics is intriguing and it is crucial if you want to fully grasp why your vector searches remain precise. Ultimately, this is all about compression since it turns out that with the current vector search algorithms you are limited by the read speed of data. Therefore if you can fit all of that data into memory, you get a significant speed boost when compared to reading from storage ( memory is approximately 200x faster than SSDs ). There are a few things to keep in mind: Graph-based indices like HNSW (Hierarchical Navigable Small World) are the fastest for vector retrieval. HNSW: An approximate nearest neighbor search algorithm that builds a multi-layer graph structure to enable efficient high-dimensional similarity searches. HNSW is fundamentally limited in speed by data read speed from memory, or in the worst case, from storage. Ideally, you want to be able to load all of your stored vectors into memory. Embedding models generally produce vectors with float32 precision, 4 bytes per floating point number. And finally, depending on how many vectors and/or dimensions you have, you can very quickly run out of memory to keep all of your vectors in. Taking this for granted, you see that a problem arises quickly once you start ingesting millions or even billions of vectors, each with potentially hundreds or even thousands of dimensions. The section entitled “ Approximate numbers on the compression ratios ” provides some rough numbers. What do you need to get started? To get started, you will need the following: If you are using Elastic Cloud or on-prem, you will need a version of Elasticsearch higher than 8.18. 
While BBQ was introduced in 8.16, in this article, you will use vector_rescore , which was introduced in 8.18. Additionally, you will also need to ensure there is a machine learning (ML) node in your cluster. (Note: an ML node with a minimum of 4GB is needed to load the model, but you will likely need much larger nodes for full production workloads.) If you are using Serverless, you will need to select an instance that is optimized for vectors. You will also need a base level of knowledge regarding vector databases. If you aren’t already familiar with vector search concepts in Elastic, you may want to first check out the following resources: Navigating an Elastic Vector Database The big ideas behind retrieval augmented generation Implementation To keep this blog simple, you will use built-in functions when they are available. In this case, you have the .multilingual-e5-small vector embedding model that will run directly inside Elasticsearch on a machine learning node. Note that you can replace the text_embedding model with the embedder of your choosing ( OpenAI , Google AI Studio , Cohere and plenty more. If your preferred model is not yet integrated, you can also bring your own dense vector embeddings .) First, you will need to create an inference endpoint to generate vectors for a given piece of text. You will run all of these commands from the Kibana Dev Tools Console . This command will download the .multilingual-e5-small . If it does not already exist, then it will set up your endpoint; this may take a minute to run. You can see the expected output in the file 01-create-an-inference-endpoint-output.json in the Outputs folder. Once this has returned, your model will be set up and you can test that the model works as expected with the following command. You can see the expected output in the file 02-embed-text-output.json in the Outputs folder. If you run into issues around your trained model not being allocated to any nodes, you may need to start your model manually. Now let's create a new mapping with 2 properties, a standard text field ( my_field ) and a dense vector field ( my_vector ) with 384 dimensions to match the output from the embedding model. You will also override the index_options.type to bbq_hnsw . You can see the expected output in the file 03-create-byte-qauntized-index-output.json in the Outputs folder. To ensure Elasticsearch generates your vectors, you can make use of an Ingest Pipeline . This pipeline will require 3 things: the endpoint, ( model_id ), the input_field that you want to create vectors for and the output_field to store those vectors in. The first command below will create an inference ingest pipeline, which uses the inference service under the hood, and the second will test that the pipeline is working correctly. You can see the expected output in the file 04-create-and-simulate-ingest-pipeline-output.json in the Outputs folder. You are now ready to add some documents with the first 2 commands below and to test that your searches work with the 3rd command. You can check out the expected output in the file 05-bbq-index-output.json in the Outputs folder. As recommended in this post , rescoring and oversampling are advised when you scale to non-trivial amounts of data because they help maintain high recall accuracy while benefiting from the compression advantages. From Elasticsearch version 8.18, you can do it this way using rescore_vector . The expected output is in the file 06-bbq-search-8-18-output.json in the Outputs folder. 
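The console commands themselves live in the linked repository rather than in this text. A condensed sketch of the flow described above is shown below; the endpoint, pipeline, index and field names are placeholders, and the request bodies in the repository may differ in detail:

```
# 1. Create an inference endpoint backed by the built-in E5 model.
PUT _inference/text_embedding/my-e5-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".multilingual-e5-small",
    "num_allocations": 1,
    "num_threads": 1
  }
}

# 2. Check that the endpoint produces embeddings.
POST _inference/text_embedding/my-e5-endpoint
{
  "input": "BBQ compresses vectors without giving up relevance"
}

# 3. Map a text field plus a 384-dimension dense_vector that uses BBQ.
PUT bbq-demo
{
  "mappings": {
    "properties": {
      "my_field": { "type": "text" },
      "my_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine",
        "index_options": { "type": "bbq_hnsw" }
      }
    }
  }
}

# 4. An ingest pipeline that fills my_vector from my_field at index time.
PUT _ingest/pipeline/my-e5-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "my-e5-endpoint",
        "input_output": [
          { "input_field": "my_field", "output_field": "my_vector" }
        ]
      }
    }
  ]
}

# 5. Search with oversampling and rescoring on the quantized vectors (8.18+).
POST bbq-demo/_search
{
  "knn": {
    "field": "my_vector",
    "k": 10,
    "num_candidates": 100,
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "my-e5-endpoint",
        "model_text": "how do I keep vector search fast at scale?"
      }
    },
    "rescore_vector": { "oversample": 3 }
  }
}
```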
How do these scores compare to those you would get for raw data? If you do everything above again but with index_options.type: hnsw , you will see that the scores are very comparable. You can see the expected output in the file 07-raw-vector-output.json in the Outputs folder. Approximate numbers on the compression ratios Storage and memory requirements can quickly become a significant challenge when working with vector search. The following breakdown illustrates how different quantization techniques dramatically reduce the memory footprint of vector data. Vectors (V) Dimensions (D) raw (V x D x 4) int8 (V x (D x 1 + 4)) int4 (V x (D x 0.5 + 4)) bbq (V x (D x 0.125 + 4)) 10,000,000 384 14.31GB 3.61GB 1.83GB 0.58GB 50,000,000 384 71.53GB 18.07GB 9.13GB 2.89GB 100,00,0000 384 143.05GB 36.14GB 18.25GB 5.77GB Conclusion BBQ is an optimization you can apply to your vector data for compression without sacrificing accuracy. It works by converting vectors into bits, allowing you to search the data effectively and empowering you to scale your AI workflows to accelerate searches and optimize data storage. Further learning If you are interested in learning more about BBQ, be sure to check out the following resources: Binary Quantization (BBQ) in Lucene and Elasticsearch Better Binary Quantization (BBQ) vs. Product Quantization Optimized Scalar Quantization: Even Better Binary Quantization Better Binary Quantization (BBQ): From Bytes to BBQ, The Secret to Better Vector Search by Ben Trent Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Why would you implement this in your use case? What do you need to get started? Implementation Approximate numbers on the compression ratios Conclusion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to implement Better Binary Quantization (BBQ) into your use case and why you should - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/bbq-implementation-into-use-case","meta_description":"Learn how to implement Better Binary Quantization (BBQ) into your use case and why you should."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model Learn about the Elastic Learned Sparse Encoder (ELSER), its retrieval performance, architecture, and training process. ML Research TV QH By: Thomas Veasey and Quentin Herreros On June 21, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this blog, we discuss the work we've been doing to augment Elastic's out-of-the-box retrieval with a pre-trained language model: the Elastic Learned Sparse Encoder (ELSER). In our previous blog post in this series, we discussed some of the challenges applying dense models to retrieval in a zero-shot setting. This is well known and was highlighted by the BEIR benchmark, which assembled diverse retrieval tasks as a proxy to the performance one might expect from a model applied to an unseen data set. Good retrieval in a zero-shot setting is exactly what we want to achieve, namely a one-click experience that enables textual fields to be searched using a pre-trained model. This new capability fits into the Elasticsearch _search endpoint as just another query clause, a text_expansion query. This is attractive because it allows search engineers to continue to tune queries with all the tools Elasticsearch already provides. Furthermore, to truly achieve a one-click experience, we've integrated it with the new Elasticsearch Relevance Engine . However, rather than focus on the integration, this blog digs a little into ELSER's model architecture and the work we did to train it. We had another goal at the outset of this project. The natural language processing (NLP) field is fast moving, and new architectures and training methodologies are being introduced rapidly. While some of our users will keep on top of the latest developments and want full control over the models they deploy, others simply want to consume a high quality search product. By developing our own training pipeline, we have a playground for implementing and evaluating the latest ideas, such as new retrieval relevant pre-training tasks or more effective distillation tasks , and making the best ones available to our users. Finally, it is worth mentioning that we view this feature as complementary to the existing model deployment and vector search capabilities in the Elastic Stack, which are needed for those more custom use cases like cross-modal retrieval. ELSER performance results Before looking at some of the details of the architecture and how we trained our model, the Elastic Learned Sparse Encoder (ELSER) , it's interesting to review the results we get, as ultimately the proof of the pudding is in the eating. 
As we discussed before , we use a subset of BEIR to evaluate our performance. While this is by no means perfect, and won't necessarily represent how the model behaves on your own data, we at least found it challenging to make significant improvements on this benchmark. So we feel confident that improvements we get on this translate to real improvements in the model. Since absolute performance numbers on benchmarks by themselves aren't particularly informative, it is nice to be able to compare with other strong baselines, which we do below. The table below shows the performance of Elastic Learned Sparse Encoder compared to Elasticsearch's BM25 with an English analyzer broken down by the 12 data sets we evaluated. We have 10 wins, 1 draw, and 1 loss and an average improvement in NDCG@10 of 17%. NDCG@10 for BEIR data sets for BM25 and Elastic Learned Sparse Encoder (referred to as “ELSER” above, note higher is better) In the following table, we compare our average performance to some other strong baselines. The Vespa results are based on a linear combination of BM25 and their implementation of ColBERT as reported here , the Instructor results are from this paper, the SPLADEv2 results are taken from this paper and the OpenAI results are reported here . Note that we've separated out the OpenAI results because they use a different subset of the BEIR suite. Specifically, they average over ArguAna, Climate FEVER, DBPedia, FEVER, FiQA, HotpotQA, NFCorpus, QuoraRetrieval, SciFact, TREC COVID and Touche. If you follow that link, you will notice they also report NDCG@10 expressed as a percentage. We refer the reader to the links above for more information on these approaches. Average NDCG@10 for BEIR data sets vs. various high quality baselines (higher is better). Note: OpenAI chose a different subset, and we report our results on this set separately. Finally, we note it has been widely observed that an ensemble of statistical (a la BM25) and model based retrieval, or hybrid search, tends to outperform either in a zero-shot setting. Already in 8.8, Elastic allows one to do this for text_expansion with linear boosting and this works well if you calibrate to your data set. We are also working on Reciprocal Rank Fusion (RRF), which performs well without calibration. Stay tuned for our next blog in this series, which will discuss hybrid search. Having seen how ELSER performs, we next discuss its architecture and some aspects of how it is trained. What are learned sparse models and why are they attractive? We showed in our previous blog post that, while very effective if fine-tuned, dense retrieval tends not to perform well in a zero-shot setting. By contrast cross-encoder architectures , which don't scale well for retrieval, tend to learn robust query and document representations and work well on most text. It has been suggested that part of the reason for this difference is the bottleneck of the query and document interacting only via a relatively low dimensional vector “dot product.” Based on this observation, a couple of model architectures have been recently proposed that try to reduce this bottleneck — these are ColBERT and SPLADE. From our perspective, SPLADE has some additional advantages: Compared to ColBERT, it is extremely storage efficient. Indeed, we find that document passages expand to about 100 tokens on average and we've seen approximate size parity with normal text indices. 
With some caveats , retrieval can make use of inverted indices for which we already have very mature implementations in Lucene. Compared to ANN, these use memory extremely efficiently. It provides natural controls as it is being trained that allow us to trade retrieval quality for retrieval latency. In particular, the FLOPS regularizer, which we discuss below, allows one to add a term to the loss for the expected retrieval cost. We plan to take advantage of this as we move toward GA. One last clear advantage compared to dense retrieval is that SPLADE allows one a simple and compute efficient route to highlight words generating a match. This simplifies surfacing relevant passages in long documents and helps users better understand how the retriever is working. Taken together, we felt that these provided a compelling case for adopting the SPLADE architecture for our initial release of this feature. There are multiple good detailed descriptions of this architecture — if you are interested in diving in, this , for example, is a nice write up by the team that created the model. In very brief outline, the idea is rather than use a distributed representation, say averaging BERT token output embeddings, instead use the token logits, or log-odds the tokens are predicted, for masked word prediction. When language models are used to predict masked words, they achieve this by predicting a probability distribution over the tokens of their vocabulary. The BERT vocabulary, for WordPiece, contains many common real words such as cat, house, and so on. It also contains common word endings — things like ##ing (with the ## simply denoting it is a continuation). Since words can't be arbitrarily exchanged, relatively few tokens will be predicted for any given mask position. SPLADE takes as a starting point for its representation of a piece of text the tokens most strongly predicted by masking each word of that text. As noted, this is a naturally disentangled or sparse representation of that text. It is reasonable to think of these token probabilities for word prediction as roughly capturing contextual synonyms. This has led people to view learned sparse representations, such as SPLADE, as something close to automatic synonym expansion of text, and we see this in multiple online explanations of the model. In our view, this is at best an oversimplification and at worst misleading. SPLADE takes as the starting point for fine-tuning the maximum token logits for a piece of text, but it then trains on a relevance prediction task, which crucially accounts for the interaction between all shared tokens in a query and document. This process somewhat re-entangles the tokens, which start to behave more like components of a vector representation (albeit in a very high dimensional vector space). We explored this a little as we worked on this project. We saw as we tried removing low score and apparently unrelated tokens in the expansion post hoc that it reduced all quality metrics, including precision(!), in our benchmark suite. This would be explained if they were behaving more like a distributed vector representation, where zeroing individual components is clearly nonsensical. We also observed that we can simply remove large parts of BERT's vocabulary at random and still train highly effective models as the figure below illustrates. In this context, parts of the vocabulary must be being repurposed to account for the missing words. 
Margin MSE validation loss for student models with different vocabulary sizes Finally, we note that unlike say generative tasks where size really does matter a great deal, retrieval doesn't as clearly benefit from having huge models. We saw in the result section that this approach is able to achieve near state-of-the-art performance with only 100M parameters, as compared to hundreds of millions or even billions of parameters in some of the larger generative models. Typical search applications have fairly stringent requirements on query latency and throughput, so this is a real advantage. Exploring the training design space for ELSER In our first blog , we introduced some of the ideas around training dense retrieval models. In practice, this is a multi stage process and one typically picks up a model that has already been pre-trained. This pre-training task can be rather important for achieving the best possible results on specific downstream tasks. We don't discuss this further because to date this hasn't been our focus, but note in passing that like many current effective retrieval models, we start from a co-condenser pre-trained model . There are many potential avenues to explore when designing training pipelines. We explored quite a few, and suffice to say, we found making consistent improvements on our benchmark was challenging. Multiple ideas that looked promising on paper didn't provide compelling improvements. To avoid this blog becoming too long, we first give a quick overview of the key ingredients of the training task and focus on one novelty we introduced, which provided the most significant improvements. Independent of specific ingredients, we also made some qualitative and quantitative observations regarding the role of the FLOPS regularization, which we will discuss at the end. When training models for retrieval, there are two common paradigms: contrastive approaches and distillation approaches. We adopted the distillation approach because this was shown to be very effective for training SPLADE in this paper. The distillation approach is slightly different from the common paradigm, which informs the name, of shrinking a large model to a small, but almost as accurate, “copy.” Instead the idea is to distill the ranking information present in a cross-encoder architecture. This poses a small technical challenge: since the representation is different, it isn't immediately clear how one should mimic the behavior of the cross-encoder with the model being trained. The standard idea we used is to present both models with triplets of the form (query, relevant document, irrelevant document). The teacher model computes a score margin, namely s c o r e ( q u e r y , r e l e v a n t d o c u m e n t ) − s c o r e ( q u e r y i r r e l e v a n t d o c u m e n t ) score(query, relevant\\;document) - score(query\\;irrelevant\\;document) score ( q u ery , re l e v an t d oc u m e n t ) − score ( q u ery i rre l e v an t d oc u m e n t ) , and we train the student model to reproduce this score margin using MSE to penalize the errors it makes. Let's think a little about what this process does since it motivates the training detail we wish to discuss. If we recall that the interaction between a query and document using the SPLADE architecture is computed using the dot product between two sparse vectors, of non-negative weights for each token, then we can think about this operation as wanting to increase the similarity between the query and the higher scored document weight vectors. 
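In code, the distillation objective just described is compact. A minimal sketch, assuming the student scores a (query, document) pair as the dot product of non-negative sparse weight vectors and the teacher scores are precomputed for each triplet:

```python
import torch
import torch.nn.functional as F

def margin_mse_loss(q, d_pos, d_neg, t_pos, t_neg):
    """q, d_pos, d_neg: (batch, vocab) student weight vectors.
    t_pos, t_neg: (batch,) teacher (cross-encoder) scores for the triplets."""
    s_pos = (q * d_pos).sum(dim=-1)    # student score for the relevant doc
    s_neg = (q * d_neg).sum(dim=-1)    # student score for the irrelevant doc
    # Train the student to reproduce the teacher's score margin.
    return F.mse_loss(s_pos - s_neg, t_pos - t_neg)
```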
It is not 100% accurate, but not misleading, to think of this as something like “rotating” the query in the plane spanned by the two documents' weight vectors toward the more relevant one. Over many batches, this process gradually adjusts the weight vectors starting positions so the distances between queries and documents captures the relevance score provided by the teacher model. This leads to an observation regarding the feasibility of reproducing the teacher scores. In normal distillation, one knows that given enough capacity the student would be able to reduce the training loss to zero. This is not the case for cross-encoder distillation because the student scores are constrained by the properties of a metric space induced by the dot product on their weight vectors. The cross-encoder has no such constraint. It is quite possible that for particular training queries q 1 q_1 q 1 ​ and q 2 q_2 q 2 ​ and documents d 1 d_1 d 1 ​ and d 2 d_2 d 2 ​ we have to simultaneously arrange for q 1 q_1 q 1 ​ to be close to d 1 d_1 d 1 ​ and d 2 d_2 d 2 ​ , and q 2 q_2 q 2 ​ to be close to d 1 d_1 d 1 ​ but far from d 2 d_2 d 2 ​ . This is not necessarily possible, and since we penalize the MSE in the scores, one effect is an arbitrary reweighting of the training triplets associated with these queries and documents by the minimum margin we can achieve. One of the observations we had while working on training ELSER was the teacher was far from infallible. We initially observed this by manually investigating query-relevant document pairs that were assigned unusually low scores. In the process, we found objectively misscored query-document pairs. Aside from manual intervention in the scoring process, we also decided to explore introducing a better teacher. Following the literature, we were using MiniLM L-6 from the SBERT family for our initial teacher. While this shows strong performance in multiple settings, there are better teachers, based on their ranking quality. One example is a ranker based on a large generative model: monot5 3b . In the figure below, we compare the query-document score pair distribution of these two models. The monot5 3b distribution is clearly much less uniform, and we found when we tried to train our student model using its raw scores the performance saturated significantly below using MiniLM L-6 as our teacher. As before, we postulated that this was down to many important score differences in the peak around zero getting lost with training worried instead about unfixable problems related to the long lower tail. Monot5 3b and MiniLM L-6 score distributions on a matched scale for a random sample of query-document pairs from the NQ data set. Note: the X-axis does not show the actual scores returned by either of the models. It is clear that all rankers are of equivalent quality up to monotonic transforms of their scores. Specifically, it doesn't matter if we use s c o r e ( q u e r y , d o c u m e n t ) score(query, document) score ( q u ery , d oc u m e n t ) or f ( s c o r e ( q u e r y , d o c u m e n t ) ) f(score(query, document)) f ( score ( q u ery , d oc u m e n t )) provided f ( ⋅ ) f(\\cdot) f ( ⋅ ) is a monotonic increasing function; any ranking quality measure will be the same. However, not all such functions are equivalently effective teachers. We used this fact to smooth out the distribution of monot5 3b scores, and suddenly our student model trained and started to beat the previous best model. In the end, we used a weighted ensemble of our two teachers. 
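The post does not spell out which monotonic transform was applied to the monot5 3b scores, so the sketch below is only one plausible choice: a rank-based transform that flattens the peaked score distribution while preserving the ordering, followed by a weighted combination of the two teachers. The mixing weight is an assumption for illustration.

```python
import numpy as np
from scipy.stats import rankdata

def smooth(scores: np.ndarray) -> np.ndarray:
    """Monotonic, rank-based transform onto (0, 1): the ordering is preserved,
    but the peaked raw score distribution is spread out evenly."""
    return rankdata(scores) / (len(scores) + 1)

def ensemble_teacher(minilm_scores, monot5_scores, w=0.5):
    # Hypothetical weighted ensemble of the two (transformed) teachers.
    return (w * smooth(np.asarray(minilm_scores))
            + (1 - w) * smooth(np.asarray(monot5_scores)))
```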
Before closing out this section, we want to briefly mention the FLOPS regularizer. This is a key ingredient of the improved SPLADE v2 training process. It was proposed in this paper as a means of penalizing a metric directly related to the compute cost for retrieval from an inverted index. In particular, it encourages tokens that provide little information for ranking to be dropped from the query and document representations based on their impact on the cost for retrieving from an inverted index. We had three observations: Our first observation was that the great majority of tokens are actually dropped while the regularizer is still warming up. In our training recipe, the regularizer uses quadratic warm up for the first 50,000 batches. This means that in the first 10,000 batches, it is no more than 1/25th its terminal value, and indeed we see that the contribution to the loss from MSE in the score margin is orders of magnitude larger than the regularization loss at this point. However, during this period the number of query and document tokens per batch activated by our training data drops from around 4k and 14k on average to around 50 and 300, respectively. In fact, 99% of all token pruning happens in this phase and seems largely driven by removing tokens which actually hurt ranking performance. Our second observation was that we found it contributes to ELSER's generalization performance for retrieval. Both turning down the amount of regularization and substituting regularizers that induce more sparseness, such as the sum of absolute weight values, reduced average ranking performance across our benchmark. Our final observation was that larger batches and diverse batches both positively impacted retrieval quality; we tried by contrast query clustering with in-batch negatives. So why could this be, since it is primarily aimed at optimizing retrieval cost? The FLOPS regularizer is defined as follows: it first averages the weights for each token in the batch across all the queries and separately the documents it contains, it then sums the squares of these average weights. If we consider that the batch typically contains a diverse set of queries and documents, this acts like a penalty that encourages something analogous to stop word removal. Tokens that appear for many distinct queries and documents will dominate the loss, since the contribution from rarely activated tokens is divided by the square of the batch size. We postulate that this is actually helping the model to find better representations for retrieval. From this perspective, the fact that the regularizer term only gets to observe the token weights of queries and documents in the batch is undesirable. This is an area we'd like to revisit. Conclusion We have given a brief overview of the model, the Elastic Learned Sparse Encoder (ELSER), its rationale, and some aspects of the training process behind the feature we're releasing in a technical preview for the new text_expansion query and integrating with the new Elasticsearch Relevance Engine . To date, we have focused on retrieval quality in a zero-shot setting and demonstrated good results against a variety of strong baselines. As we move toward GA, we plan to do more work on operationalizing this model and in particular around improving inference and retrieval performance. Stay tuned for the next blog post in this series, where we'll look at combining various retrieval methods using hybrid retrieval as we continue to explore exciting new retrieval methods using Elasticsearch. 
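As a footnote to the FLOPS discussion above, the regularizer itself is very compact: average each token's weight over the batch, then sum the squares of those averages. A minimal PyTorch sketch, applied separately to the query side and the document side of the batch (how the penalty is weighted into the total loss is not shown):

```python
import torch

def flops_regularizer(weights: torch.Tensor) -> torch.Tensor:
    """weights: (batch, vocab) non-negative token weights for the queries
    (or, in a separate call, the documents) in the batch."""
    return (weights.mean(dim=0) ** 2).sum()

# Tokens activated across many items in the batch dominate this penalty,
# while rarely activated tokens contribute little, which is the
# stop-word-removal-like behavior described above.
```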
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-learned-sparse-encoder-elser-retrieval-performance","meta_description":"using sparse vectors in Elasticsearch, with a foundation model based on SPLADE concepts"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. ML Research TV By: Thomas Veasey On December 19, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. At Elastic we continue to work on innovations that improve the cost and performance of our vector indices. In this post we’re going to discuss in detail the implementation and intuition for a new version of scalar quantization we’ve been working on which we’re calling optimized scalar quantization or OSQ. This provides a further enhancement to our BBQ index format . Indeed, we set new state-of-the-art performance for 32 × \\times × compression. It also allows us to get very significant improvements in accuracy compared to naive scalar quantization approaches while retaining similar index compression and query performance. We plan to roll out 2, 4 and 7 bit quantization schemes and indeed unify all our index compression options to use this same underlying technique. Furthermore, we anticipate that with 7 bit quantization we should be able to discard the raw floating point vectors and plan to evaluate this thoroughly. Measuring success For any compression scheme we need to worry about its impact on the query behavior. For nearest neighbor queries there is a metric that much prior art focuses on. This is the recall at k k k . Specifically, one measures the fraction of true nearest neighbor vectors that a query returns. In fact, we already exploit the relaxation that we don't find the true nearest neighbors, even when we search with uncompressed vectors in Lucene. Linear complexity is unavoidable if one wants to find the true nearest neighbors in a high dimensional vector space. The data structure that Lucene uses to index vector data is an HNSW graph , which allows it to run approximate nearest neighbor queries significantly faster than a linear scan. In this blog we study a metric that we’ll abbreviate as recall@ k ∣ n k|n k ∣ n , which is the recall at k k k retrieving n ≥ k n\\geq k n ≥ k candidates and reranking them using the uncompressed vector distance calculation. Note that if n = k n=k n = k then it's simply equal to the recall at k k k . We also look at the quality of the approximation of the uncompressed vector distances. We’ll discuss all this in more detail when we discuss our results. Motivation Scalar quantization, as we’ve discussed before , is an approach which allows one to represent floating point vectors by integer vectors. There are two reasons to do this: It allows you to reduce the amount of memory needed to represent a vector, by using fewer than 32 bits per integer, and It allows you to accelerate the distance calculation between vectors, which is the bottleneck for performing nearest neighbor queries in vector databases. 
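As a rough illustration of the first point, assuming a 768-dimensional float32 embedding and a naive one-byte-per-component quantization:

```python
import numpy as np

x = np.random.default_rng(0).normal(size=768).astype(np.float32)

# Naive 8-bit scalar quantization over the vector's own min/max range.
a, b = float(x.min()), float(x.max())
q = np.round((x - a) / (b - a) * 255).astype(np.uint8)

print(x.nbytes)  # 3072 bytes as float32
print(q.nbytes)  # 768 bytes as uint8, 4x smaller before using fewer bits
```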
Recently, a paper , which we’ve discussed before , proposed an approach called RaBitQ that is able to achieve good recall using only 1 bit per component. This is exciting because 32 × \\times × compression is competitive with Product Quantization (PQ), which was the previous state-of-the-art approach when compression was paramount. A key advantage of RaBitQ is that it’s relatively straightforward to accelerate distance calculations with SIMD operations. Certainly, when it is compared to PQ that uses lookups to compute distance, for which it is much harder to exploit hardware parallelism. The authors performed extensive experiments and showed they were able to achieve consistently higher recall as a function of query latency than PQ using the same compression ratio with an IVF index. Although the RaBitQ approach is conceptually rather different to scalar quantization, we were inspired to re-evaluate whether similar performance could be unlocked for scalar quantization. In our companion piece we will discuss the result of integrating OSQ with HNSW and specifically how it compares to the baseline BBQ quantization scheme, which is inspired by RaBitQ. As an incentive to keep reading this blog we note that we were able to achieve systematically higher recall in this setting, sometimes by as much as 30%. Another advantage of OSQ is it immediately generalizes to using n n n bits per component. For example, we show below that we’re able to achieve excellent recall in all cases we tested using minimal or even no reranking with only a few bits per component. This post is somewhat involved. We will take you through step-by-step the innovations we’ve introduced and explain some of the intuition at each step. Dot products are enough In the following we discuss only the dot product, which would translate to a MIP query in Elasticsearch. In fact, this is sufficient because well known reductions exist to convert the other metrics Elasticsearch supports, Euclidean and cosine, to computing a dot product. For the Euclidean distance we note that ∥ y − x ∥ 2 = ∥ x ∥ 2 + ∥ y ∥ 2 − 2 x t y \\|y-x\\|^2 = \\|x\\|^2 + \\|y\\|^2 - 2x^ty ∥ y − x ∥ 2 = ∥ x ∥ 2 + ∥ y ∥ 2 − 2 x t y and also that ∥ x ∥ \\|x\\| ∥ x ∥ and ∥ y ∥ \\|y\\| ∥ y ∥ can be precomputed and cached, so we need only compute the dot product y t x y^t x y t x in order to compare two vectors. For cosine we simply need to normalize vectors and then y t x ∥ y ∥ ∥ x ∥ = y t x \\frac{y^t x}{\\|y\\|\\|x\\|} = y^t x ∥ y ∥∥ x ∥ y t x ​ = y t x Scalar quantization refresher Scalar quantization is typically defined by the follow componentwise equation x ^ i = round ( 2 n − 1 b − a ( clamp ( x i , a , b ) − a ) ) \\hat{x}_i = \\text{round}\\left(\\frac{2^n-1}{b-a} \\left(\\text{clamp}(x_i, a, b) - a\\right)\\right) x ^ i ​ = round ( b − a 2 n − 1 ​ ( clamp ( x i ​ , a , b ) − a ) ) Here, clamp ( ⋅ , a , b ) = max ⁡ ( min ⁡ ( ⋅ , b ) , a ) \\text{clamp}(\\cdot, a, b) = \\max(\\min(\\cdot, b), a) clamp ( ⋅ , a , b ) = max ( min ( ⋅ , b ) , a ) . The quantized vector x ^ \\hat{x} x ^ is integer valued with values less than or equal to 2 n − 1 2^n-1 2 n − 1 , that is we’re using n n n bits to represent each component. The interval [ a , b ] [a,b] [ a , b ] is called the quantization interval. We also define the quantized approximation of the original floating point vector, which is our best estimate of the floating point vector after we’ve applied quantization. We never directly work with this quantity, but it is convenient to describe the overall approach. 
This is defined as follows x ˉ = a 1 + b − a 2 n − 1 x ^ \\bar{x} = a 1 + \\frac{b-a}{2^n-1} \\hat{x} x ˉ = a 1 + 2 n − 1 b − a ​ x ^ Here, 1 1 1 denotes a vector whose components are all equal to one. Note that we abuse notation here and below by using 1 1 1 to denote both a vector and scalar and we rely on the context to make the meaning clear. The need for speed A key requirement we identified for scalar quantization is that we can perform distance comparisons directly on integer vectors. Integer arithmetic operations are more compute efficient than floating point ones. Furthermore, higher throughput specialized hardware instructions exist for performing them in parallel. We’ve discussed how to achieve this in the context of scalar quantization before. Below we show that as long as we cache a couple of corrective factors we can use a different quantization interval and number of bits for each vector we quantize and still compute the dot product using integer vectors. This is the initial step towards achieving OSQ. First of all observe that our best estimate of the dot product between two quantized vectors is y t x = ( a y 1 + b y − a y 2 n y − 1 y ^ ) t ( a x 1 + b x − a x 2 n x − 1 x ^ ) y^t x = \\left(a_y 1 + \\frac{b_y-a_y}{2^{n_y} - 1}\\hat{y}\\right)^t \\left(a_x 1 + \\frac{b_x-a_x}{2^{n_x} - 1}\\hat{x}\\right) y t x = ( a y ​ 1 + 2 n y ​ − 1 b y ​ − a y ​ ​ y ^ ​ ) t ( a x ​ 1 + 2 n x ​ − 1 b x ​ − a x ​ ​ x ^ ) Expanding we obtain y ^ t x ^ = a y a x 1 t 1 + b y − a y 2 n y − 1 a x 1 t y ^ + b x − a x 2 n x − 1 a y 1 t x ^ + b y − a y 2 n y − 1 b x − a x 2 n x − 1 y ^ t x ^ \\hat{y}^t \\hat{x} = a_y a_x 1^t 1 + \\frac{b_y-a_y}{2^{n_y} - 1} a_x 1^t \\hat{y} + \\frac{b_x-a_x}{2^{n_x} - 1} a_y 1^t \\hat{x} + \\frac{b_y-a_y}{2^{n_y} - 1} \\frac{b_x-a_x}{2^{n_x} - 1} \\hat{y}^t \\hat{x} y ^ ​ t x ^ = a y ​ a x ​ 1 t 1 + 2 n y ​ − 1 b y ​ − a y ​ ​ a x ​ 1 t y ^ ​ + 2 n x ​ − 1 b x ​ − a x ​ ​ a y ​ 1 t x ^ + 2 n y ​ − 1 b y ​ − a y ​ ​ 2 n x ​ − 1 b x ​ − a x ​ ​ y ^ ​ t x ^ Focusing on the vector dot products, since these dominate the compute cost, we observe that 1 t 1 1^t 1 1 t 1 is just equal to the vector dimension, and 1 t y ^ 1^t\\hat{y} 1 t y ^ ​ and 1 t x ^ 1^t\\hat{x} 1 t x ^ are the sums of the quantized query and document vector components, respectively. For the query this can be computed once upfront and for the documents these can be computed at index time and stored with the quantized vector. Therefore, we need only compute the integer vector dot product y ^ t x ^ \\hat{y}^t\\hat{x} y ^ ​ t x ^ per comparison. The geometry of scalar quantization To build a bit more intuition about OSQ we digress to understand more about how scalar quantization represents a vector. Observe that the set of all possible quantized vectors lie inside a cube centered on the point a + b 2 1 \\frac{a+b}{2}1 2 a + b ​ 1 with side length b − a b−a b − a . If only 1 bit is being used then the possible vectors lie at the 2 d 2^d 2 d corners of the cube. Otherwise, the possible vectors lie on a regular grid with 2 n d 2^{nd} 2 n d points. By changing a a a and b b b , we are only able to expand or contract the cube or slide it along the line spanned by the 1 1 1 vector. In particular, suppose the vectors in the corpus are centered around some point m m m . 
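A small numpy sketch of the bookkeeping just described: each quantized vector stores its interval, bit count, and the sum of its integer components, and the only per-comparison work on the hot path is the integer dot product. The vectors and bit counts below are made up for illustration.

```python
import numpy as np

def quantize(x, a, b, n_bits):
    steps = (1 << n_bits) - 1
    return np.round((np.clip(x, a, b) - a) * steps / (b - a)).astype(np.int64)

def estimated_dot(qy, ay, by, ny, qx, ax, bx, nx, dim):
    sy = (by - ay) / ((1 << ny) - 1)
    sx = (bx - ax) / ((1 << nx) - 1)
    # 1^t y_hat and 1^t x_hat can be cached alongside each quantized vector,
    # so only the integer dot product qy @ qx is computed per comparison.
    return ay * ax * dim + sy * ax * qy.sum() + sx * ay * qx.sum() + sy * sx * (qy @ qx)

rng = np.random.default_rng(0)
y, x = rng.normal(size=128), rng.normal(size=128)
qy = quantize(y, y.min(), y.max(), 7)   # 7-bit "query"
qx = quantize(x, x.min(), x.max(), 4)   # 4-bit "document"
print(float(y @ x),
      estimated_dot(qy, y.min(), y.max(), 7, qx, x.min(), x.max(), 4, 128))
```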
We can decompose this into a component parallel to the 1 1 1 vector and a component perpendicular to it as follows m = 1 1 t d m + ( I − 1 1 t d ) m m = \\frac{1 1^t}{d} m + \\left(I - \\frac{1 1^t}{d}\\right) m m = d 1 1 t ​ m + ( I − d 1 1 t ​ ) m If we center the cube on the point at 1 t m d 1 \\frac{1^t m}{d}1 d 1 t m ​ 1 then it must still expand it to encompass the offset m − 1 t m d 1 m-\\frac{1^t m}{d}1 m − d 1 t m ​ 1 before it covers even the center of the data distribution. This is illustrated in the figure below for 2 dimensions. Since the quantization errors will be proportional on average to the side length of the cube, this suggests that we want to minimize ∥ m − 1 t m d 1 ∥ \\| m-\\frac{1^t m}{d}1 \\| ∥ m − d 1 t m ​ 1∥ . An easy way to do this is to center the query and document vectors before quantizing. We show below that if we do this we can still recover the dot product between the original vectors. Note that y t x = ( y − m + m ) t ( x − m + m ) = ( y − m ) t ( x − m ) + m t y + m t x − m t m \\begin{align*} y^t x &= (y - m + m)^t (x - m + m) \\\\ &= (y - m)^t (x - m) + m^t y + m^t x - m^t m \\end{align*} y t x ​ = ( y − m + m ) t ( x − m + m ) = ( y − m ) t ( x − m ) + m t y + m t x − m t m ​ The values m t y m^t y m t y and m t x m^t x m t x , which are the inner product between the query and document vector and the centroid of the vectors in the corpus, can be precomputed and in the case of the document vector cached with its quantized representation. The quantity m t m m^t m m t m is a global constant. This means we only need to estimate ( y − m ) t ( x − m ) (y−m)^t(x−m) ( y − m ) t ( x − m ) when comparing a query and document vector. In other words, we can quantize the centered vectors and recover an estimate of the actual dot product. The distribution of centered vectors So far we’ve shown that we can use a different bit count and a different quantization interval per vector. We next show how to exploit this to significantly improve the accuracy of scalar quantization. We propose an effective criterion and procedure for optimizing the choice of the constants a a a and b b b . However, before discussing this it is useful to see some examples of the component distribution in real centered embedding vectors. We observe that the values are all fairly normally distributed and will use this observation to choose our initial quantization interval. Looking at these plots one might guess it would be beneficial to scale components. Specifically, it seems natural to standardize the component distributions of the Cohere v2 and the gte-base-en-v1.5 embeddings. In general, this would amount to applying a scaling matrix to the document vectors before quantization as follows Diag ( σ ) − 1 ( x − m ) \\text{Diag}(\\sigma)^{-1} (x - m) Diag ( σ ) − 1 ( x − m ) Here, Diag ( σ ) \\text{Diag}(\\sigma) Diag ( σ ) is a diagonal matrix whose diagonal entries are the standard deviations of the components for the corpus as a whole. We can apply this operation and still compute the dot product efficiently because we simply need to apply the inverse transformation to the query vector before quantizing it ( y − m ) t ( x − m ) = ( Diag ( σ ) ( y − m ) ) t ( Diag ( σ ) − 1 ( x − m ) ) (y - m)^t (x - m) = \\left(\\text{Diag}(\\sigma) (y - m)\\right)^t \\left(\\text{Diag}(\\sigma)^{-1} (x - m)\\right) ( y − m ) t ( x − m ) = ( Diag ( σ ) ( y − m ) ) t ( Diag ( σ ) − 1 ( x − m ) ) The effect is not symmetric because, as we’ll discuss below, we use a higher bit count for the query. 
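Both of the bookkeeping identities above are easy to sanity-check numerically; the corpus and vectors below are synthetic stand-ins:

```python
import numpy as np

rng = np.random.default_rng(0)
corpus = rng.normal(loc=0.3, scale=1.0, size=(1000, 64))
m = corpus.mean(axis=0)                         # corpus centroid
sigma = corpus.std(axis=0)                      # per-component standard deviations

x, y = corpus[0], rng.normal(loc=0.3, size=64)  # a "document" and a "query"

# Centering: the raw dot product is recovered from the centered one plus
# terms that are precomputable per vector (m.y, m.x) or global (m.m).
lhs = y @ x
rhs = (y - m) @ (x - m) + m @ y + m @ x - m @ m
print(np.allclose(lhs, rhs))                    # True

# Per-component scaling: applying Diag(sigma)^-1 to the document and
# Diag(sigma) to the query leaves the centered dot product unchanged.
scaled = (sigma * (y - m)) @ ((x - m) / sigma)
print(np.allclose((y - m) @ (x - m), scaled))   # True
```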
Referring back to our geometric picture this would amount to stretching the different edges of the cube. We tried this but found it didn’t measurably improve effectiveness once we optimize the interval directly for each vector so avoid the extra complexity. Initializing the quantization interval Let’s consider a natural criterion for setting a global quantization interval for normally distributed data. If we pick a vector, X X X , at random from the corpus then the quantization error is given by ∥ X − x ( a , b ) ∥ 2 = ∑ i = 1 d ( X i − ( a + b − a 2 n − 1 x ^ i ) ) 2 \\| X - x(a,b)\\|^2 = \\sum_{i=1}^d \\left(X_i - \\left(a+\\frac{b-a}{2^n-1}\\hat{x}_i\\right)\\right)^2 ∥ X − x ( a , b ) ∥ 2 = i = 1 ∑ d ​ ( X i ​ − ( a + 2 n − 1 b − a ​ x ^ i ​ ) ) 2 By assumption, and as we showed empirically is often the case, each component of X X X is normally distributed or X i ∼ N ( 0 , σ i ) X_i \\sim N(0,\\sigma_i) X i ​ ∼ N ( 0 , σ i ​ ) . In such a scenario it is reasonable to minimize the expected square error. Specifically, we seek a ∗ , b ∗ = a r g min ⁡ a , b E X [ ∑ i = 1 d ( X i − ( a + b − a 2 n − 1 x ^ i ) ) 2 ] a^*,b^* = arg\\min_{a,b} \\; \\mathbb{E}_{X} \\left[ \\sum_{i=1}^d \\left(X_i - \\left(a+\\frac{b-a}{2^n-1}\\hat{x}_i\\right)\\right)^2 \\right] a ∗ , b ∗ = a r g a , b min ​ E X ​ [ i = 1 ∑ d ​ ( X i ​ − ( a + 2 n − 1 b − a ​ x ^ i ​ ) ) 2 ] Since the expectation is a linear operator it distributes over the summation and we can focus on a single term. Without loss of generality we can assume that X i X_i X i ​ ​ is a unit normal since we can always rescale the interval by the data standard deviation. To compute this expectation we make use of the following quantity I ( x , c ) = 1 2 π ∫ x ( t − c ) 2 e − t 2 / 2 d t = 1 2 ( c 2 + 1 ) erf ( x 2 ) + 1 2 π e − x 2 / 2 ( 2 c − x ) + constant \\begin{align*} I(x,c) &= \\frac{1}{\\sqrt{2\\pi}} \\int^x (t-c)^2 e^{-t^2 / 2} dt \\\\ &= \\frac{1}{2}\\left(c^2+1\\right) \\text{erf}\\left(\\frac{x}{\\sqrt{2}}\\right) + \\frac{1}{\\sqrt{2\\pi}} e^{-x^2/2} (2c-x) + \\text{constant} \\end{align*} I ( x , c ) ​ = 2 π ​ 1 ​ ∫ x ( t − c ) 2 e − t 2 /2 d t = 2 1 ​ ( c 2 + 1 ) erf ( 2 ​ x ​ ) + 2 π ​ 1 ​ e − x 2 /2 ( 2 c − x ) + constant ​ This is the expectation of the square error when rounding a normally distributed value to a fixed point c c c expressed as an indefinite integral, alternatively, before we’ve determined the range the value can take. In order to minimize the expected quantization error we should snap floating point values to their nearest grid point. This means we can express the expected square quantization error as follows E r r o r e ( a , b | n ) = I ( a + s 2 , a ) − I ( − ∞ , a ) + ∑ i = 1 2 n − 2 I ( a + 2 i + 1 2 s , a + i s ) − I ( a + 2 i − 1 2 s , a + i s ) + I ( ∞ , b ) − I ( a + 2 n + 1 − 3 2 s , b ) \\begin{align*} Error_e\\left(a,b\\; \\middle| \\; n\\right) =& \\; I(a+\\frac{s}{2}, a) - I(-\\infty, a)\\;+ \\\\ & \\sum_{i=1}^{2^n-2} I\\left(a+\\frac{2i+1}{2}s, a+is\\right) - I\\left(a+\\frac{2i-1}{2}s, a+is\\right) + \\\\ & I(\\infty, b) - I\\left(a+\\frac{2^{n+1}-3}{2}s, b\\right) \\end{align*} E rro r e ​ ( a , b ∣ n ) = ​ I ( a + 2 s ​ , a ) − I ( − ∞ , a ) + i = 1 ∑ 2 n − 2 ​ I ( a + 2 2 i + 1 ​ s , a + i s ) − I ( a + 2 2 i − 1 ​ s , a + i s ) + I ( ∞ , b ) − I ( a + 2 2 n + 1 − 3 ​ s , b ) ​ Where we defined s = b − a 2 n − 1 s=\\frac{b−a}{2^n−1} s = 2 n − 1 b − a ​ ​. The integration limits are determined by the condition that we snap to the nearest grid point. 
We now have a function, in terms of the interval endpoints a a a and b b b , for the expected square quantization error using a reasonable assumption about the vector components’ distribution. It is relatively straightforward to show that this quantity is minimized by an interval centered on the origin. This means that we need to search for the value of a single variable z z z which minimizes E r r o r e ( − z , z | n ) Error_e\\left(-z,z\\; \\middle| \\; n\\right) E rro r e ​ ( − z , z ∣ n ) . The figure below shows the error as a function of z z z for various bit counts. Optimizing this function numerically for various choices of n n n gives the following quantization intervals [ a ( 1 ) , b ( 1 ) ] = [ − 0.798 , 0.798 ] [ a ( 2 ) , b ( 2 ) ] = [ − 1.493 , 1.493 ] [ a ( 3 ) , b ( 3 ) ] = [ − 2.051 , 2.051 ] [ a ( 4 ) , b ( 4 ) ] = [ − 2.514 , 2.514 ] [ a ( 7 ) , b ( 7 ) ] = [ − 3.611 , 3.611 ] \\begin{align*} \\left[a_{(1)}, b_{(1)}\\right] &= [-0.798, 0.798] \\\\ \\left[a_{(2)}, b_{(2)}\\right] &= [-1.493, 1.493] \\\\ \\left[a_{(3)}, b_{(3)}\\right] &= [-2.051, 2.051] \\\\ \\left[a_{(4)}, b_{(4)}\\right] &= [-2.514, 2.514] \\\\ \\left[a_{(7)}, b_{(7)}\\right] &= [-3.611, 3.611] \\\\ \\end{align*} [ a ( 1 ) ​ , b ( 1 ) ​ ] [ a ( 2 ) ​ , b ( 2 ) ​ ] [ a ( 3 ) ​ , b ( 3 ) ​ ] [ a ( 4 ) ​ , b ( 4 ) ​ ] [ a ( 7 ) ​ , b ( 7 ) ​ ] ​ = [ − 0.798 , 0.798 ] = [ − 1.493 , 1.493 ] = [ − 2.051 , 2.051 ] = [ − 2.514 , 2.514 ] = [ − 3.611 , 3.611 ] ​ We’re denoting the interval for n n n bits [ a ( n ) , b ( n ) ] \\left[a_{(n)}, b_{(n)}\\right] [ a ( n ) ​ , b ( n ) ​ ] . Finally, we need to map these fixed intervals to the specific interval to use for a vector x x x . To do this we shift by the mean of its components m x m_x m x ​ ​ and scale by their standard deviation σ x \\sigma_x σ x ​ ​. It’s clear that we should always choose a ≥ x m i n a\\geq x_{min} a ≥ x min ​ ​ and b ≤ x m a x b\\leq x_{max} b ≤ x ma x ​ . Therefore, our initial estimate for the quantization interval for a vector x x x using n n n bits per component is [ max ⁡ ( m x + a ( n ) σ x , x m i n ) , min ⁡ ( m x + b ( n ) σ x , x m a x ) ] \\left[\\max\\left(m_x+a_{(n)}\\sigma_x, x_{min}\\right), \\min\\left(m_x+b_{(n)}\\sigma_x, x_{max}\\right)\\right] [ max ( m x ​ + a ( n ) ​ σ x ​ , x min ​ ) , min ( m x ​ + b ( n ) ​ σ x ​ , x ma x ​ ) ] Refining the quantization interval The initialization scheme actually works surprisingly well. We present the results of quantizing using this approach alone as one of the ablation studies when we examine the performance of OSQ. However, we can do better. It has been noted in the context of PQ that targeting minimum square quantization error is not actually the best criterion when you care about recall. In particular, you know you’re going to be running nearest neighbor queries on the corpus and the nearest neighbors of a query are very likely to be fairly parallel to the query vector. Let’s consider what this means. Suppose we have query vector y ∣ ∣ y_{||} y ∣∣ ​ for which the document is relevant. 
Ignoring the quantization of the query vector, we can decompose the square of the error in the dot product into a component that is parallel to and a component which is perpendicular to the document vector as follows ( y ∣ ∣ t ( x − x ˉ ) ) 2 = ( ( x x t ∥ x ∥ 2 y ∣ ∣ ) t x x t ∥ x ∥ 2 ( x − x ˉ ) + ( ( I − x x t ∥ x ∥ 2 ) y ∣ ∣ ) t ( I − x x t ∥ x ∥ 2 ) ( x − x ˉ ) ) 2 \\left(y_{||}^t (x - \\bar{x})\\right)^2 = \\left(\\left(\\frac{x x^t}{\\|x\\|^2}y_{||}\\right)^t \\frac{x x^t}{{\\|x\\|^2}}(x - \\bar{x}) + \\left(\\left(I - \\frac{x x^t}{\\|x\\|^2}\\right) y_{||}\\right)^t \\left(I - \\frac{x x^t}{\\|x\\|^2}\\right) (x - \\bar{x}) \\right)^2 ( y ∣∣ t ​ ( x − x ˉ ) ) 2 = ( ( ∥ x ∥ 2 x x t ​ y ∣∣ ​ ) t ∥ x ∥ 2 x x t ​ ( x − x ˉ ) + ( ( I − ∥ x ∥ 2 x x t ​ ) y ∣∣ ​ ) t ( I − ∥ x ∥ 2 x x t ​ ) ( x − x ˉ ) ) 2 Now we expect ∥ ( I − x x t ∥ x ∥ 2 ) y ∣ ∣ ∥ ≪ ∥ x x t ∥ x ∥ 2 y ∣ ∣ ∥ \\left\\|\\left(I-\\frac{x x^t}{\\|x\\|^2}\\right) y_{||}\\right\\| \\ll \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\| ​ ( I − ∥ x ∥ 2 x x t ​ ) y ∣∣ ​ ​ ≪ ​ ∥ x ∥ 2 x x t ​ y ∣∣ ​ ​ . Furthermore, we would like to minimize the error in the dot product in this specific case, since this is when the document is relevant and our query should return it. Let ∥ ( I − x x t ∥ x ∥ 2 ) y ∣ ∣ ∥ = λ ∥ x x t ∥ x ∥ 2 y ∣ ∣ ∥ \\left\\|\\left(I-\\frac{x x^t}{\\|x\\|^2}\\right) y_{||}\\right\\| = \\sqrt{\\lambda} \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\| ​ ( I − ∥ x ∥ 2 x x t ​ ) y ∣∣ ​ ​ = λ ​ ​ ∥ x ∥ 2 x x t ​ y ∣∣ ​ ​ for some λ ≪ 1 \\lambda \\ll 1 λ ≪ 1 . We can bound our quantization error in the dot product as follows ( y ∣ ∣ t ( x − x ˉ ) ) 2 ≤ ∥ x x t ∥ x ∥ 2 y ∣ ∣ ∥ 2 ∥ x x t ∥ x ∥ 2 ( x − x ˉ ) + λ ( I − x x t ∥ x ∥ 2 ) ( x − x ˉ ) ∥ 2 \\left(y_{||}^t (x - \\bar{x})\\right)^2 \\leq \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\|^2 \\left\\|\\frac{x x^t}{\\|x\\|^2}(x - \\bar{x})+\\sqrt{\\lambda} \\left( I - \\frac{x x^t}{\\|x\\|^2}\\right)(x - \\bar{x}) \\right\\|^2 ( y ∣∣ t ​ ( x − x ˉ ) ) 2 ≤ ​ ∥ x ∥ 2 x x t ​ y ∣∣ ​ ​ 2 ​ ∥ x ∥ 2 x x t ​ ( x − x ˉ ) + λ ​ ( I − ∥ x ∥ 2 x x t ​ ) ( x − x ˉ ) ​ 2 Whatever way we quantize we can’t affect the quantity ∥ x x t ∥ x ∥ 2 y ∣ ∣ ∥ \\left\\|\\frac{x x^t}{\\|x\\|^2}y_{||}\\right\\| ​ ∥ x ∥ 2 x x t ​ y ∣∣ ​ ​ so all we care about is minimizing the second factor. A little linear algebra shows that this is equal to E r r o r ( a , b | λ ) = ( x ˉ ( a , b ) − x ) t ( x x t ∥ x ∥ 2 + λ ( I − x x t ∥ x ∥ 2 ) ) ( x ˉ ( a , b ) − x ) Error \\left(a, b\\; \\middle| \\; \\lambda \\right) = \\left(\\bar{x}(a,b) - x\\right)^t \\left( \\frac{x x^t}{\\|x\\|^2} + \\lambda \\left(I-\\frac{x x^t}{\\|x\\|^2}\\right)\\right) \\left(\\bar{x}(a,b) - x\\right) E rror ( a , b ∣ λ ) = ( x ˉ ( a , b ) − x ) t ( ∥ x ∥ 2 x x t ​ + λ ( I − ∥ x ∥ 2 x x t ​ ) ) ( x ˉ ( a , b ) − x ) Here, we’ve made the dependence of this expression on the quantization interval explicit. So far we’ve proposed a natural quantity to minimize in order to reduce the impact of quantization on MIP recall. We now turn our attention to how to efficiently minimize this quantity with respect to the interval [ a , b ] [a,b] [ a , b ] . There is a complication because the assignment of components of the vector to grid points depends on a a a and b b b while the optimal choice for a a a and b b b depends on this assignment. 
We use a coordinate descent approach, which alternates between computing the quantized vector x ^ \\hat{x} x ^ while holding a a a and b b b fixed, and optimizing the quantization interval while holding x ^ \\hat{x} x ^ fixed. This is described schematically as follows. 1 \\;\\; a 0 , b 0 ← max ⁡ ( m x + a ( n ) σ x , x m i n ) , min ⁡ ( m x + b ( n ) σ x , x m a x ) a_0,b_0 \\leftarrow \\max\\left(m_x+a_{(n)}\\sigma_x, x_{min}\\right), \\min\\left(m_x+b_{(n)}\\sigma_x, x_{max}\\right) a 0 ​ , b 0 ​ ← max ( m x ​ + a ( n ) ​ σ x ​ , x min ​ ) , min ( m x ​ + b ( n ) ​ σ x ​ , x ma x ​ ) 2 \\;\\; for k ∈ { 1 , 2 , . . . , r o u n d s } k \\in \\{1,2,...,rounds\\} k ∈ { 1 , 2 , ... , ro u n d s } do 3 \\;\\;\\;\\; compute x ^ ( k ) \\hat{x}_{(k)} x ^ ( k ) ​ from a k − 1 a_{k-1} a k − 1 ​ and b k − 1 b_{k-1} b k − 1 ​ 4 \\;\\;\\;\\; a k , b k ← a r g min ⁡ a , b E r r o r ( a , b | λ , x ^ ( k ) ) a_k,b_k \\leftarrow arg\\min_{a,b}\\; Error\\left(a,b\\; \\middle| \\; \\lambda, \\hat{x}_{(k)} \\right) a k ​ , b k ​ ← a r g min a , b ​ E rror ( a , b ​ λ , x ^ ( k ) ​ ) 5 \\;\\;\\;\\; if E r r o r ( a k , b k | λ , x ^ ( k ) ) > E r r o r ( a k − 1 , b k − 1 | λ , x ^ ( k − 1 ) ) Error\\left(a_k,b_k\\; \\middle| \\; \\lambda, \\hat{x}_{(k)} \\right) > Error\\left(a_{k-1},b_{k-1}\\; \\middle| \\; \\lambda, \\hat{x}_{(k-1)} \\right) E rror ( a k ​ , b k ​ ​ λ , x ^ ( k ) ​ ) > E rror ( a k − 1 ​ , b k − 1 ​ ​ λ , x ^ ( k − 1 ) ​ ) then break First we’ll focus on line 3. The simplest approach to compute x ^ \\hat{x} x ^ uses standard scalar quantization x ^ ( k ) , i = round ( 2 n − 1 b k − 1 − a k − 1 ( clamp ( x i , a k − 1 , b k − 1 ) − a k − 1 ) ) \\hat{x}_{(k), i} = \\text{round}\\left(\\frac{2^n-1}{b_{k-1}-a_{k-1}} \\left(\\text{clamp}(x_i, a_{k-1}, b_{k-1}) - a_{k-1}\\right)\\right) x ^ ( k ) , i ​ = round ( b k − 1 ​ − a k − 1 ​ 2 n − 1 ​ ( clamp ( x i ​ , a k − 1 ​ , b k − 1 ​ ) − a k − 1 ​ ) ) Specifically, this would amount to snapping each component of x x x to the nearest grid point. In practice this does not minimize E r r o r ( a , b | λ ) Error\\left(a,b\\; \\middle| \\; \\lambda\\right) E rror ( a , b ∣ λ ) as we illustrate below. Unfortunately, we can’t just enumerate grid points and find the minimum error one since there are 2 n d 2^{nd} 2 n d candidates; therefore, we tried the following heuristic: Snap to the nearest grid point, Coordinate-wise check if rounding in the other direction reduces the error. This isn’t guaranteed to find the global optimum, which in fact isn’t guaranteed to be one of the corners of the grid square containing the floating point vector. However, in practice we found it meant the error almost never increased in an iteration of the loop over k k k . By contrast, when snapping to the nearest grid point the loop frequently exits due to this condition. The heuristic yields a small but systematic improvement in the brute force recall. On average it amounted to +0.3% compared to just snapping to the nearest grid point. Given the impact is so small we decided it wasn’t worth the extra complexity and increased runtime. We now turn our attention to line 4. 
This expression decomposes as follows E r r o r ( a , b | λ , x ^ ) = 1 − λ ∥ x ∥ 2 ( x t ( x ˉ ( a , b ) − x ) ) 2 + λ ( x ˉ ( a , b ) − x ) t ( x ˉ ( a , b ) − x ) Error\\left(a,b\\; \\middle| \\; \\lambda, \\hat{x} \\right) = \\frac{1-\\lambda}{\\|x\\|^2} \\left(x^t(\\bar{x}(a,b)-x)\\right)^2 + \\lambda \\left(\\bar{x}(a,b)-x\\right)^t \\left(\\bar{x}(a,b)-x\\right) E rror ( a , b ∣ λ , x ^ ) = ∥ x ∥ 2 1 − λ ​ ( x t ( x ˉ ( a , b ) − x ) ) 2 + λ ( x ˉ ( a , b ) − x ) t ( x ˉ ( a , b ) − x ) It’s fairly easy to see that this is a convex quadratic form of a a a and b b b . This means it has a unique minimum where the partial derivatives w.r.t. a a a and b b b vanish. We won’t show the full calculation but give a flavor. For example, we can use the chain rule to help evaluate the first term ∂ ∂ a ( x t ( x ˉ ( a , b ) − x ) ) 2 = 2 x t ( x ˉ ( a , b ) − x ) ∂ x t x ˉ ( a , b ) ∂ a \\frac{\\partial}{\\partial a} \\left(x^t(\\bar{x}(a,b)-x)\\right)^2 = 2 x^t(\\bar{x}(a,b)-x) \\frac{\\partial x^t\\bar{x}(a,b)}{\\partial a} ∂ a ∂ ​ ( x t ( x ˉ ( a , b ) − x ) ) 2 = 2 x t ( x ˉ ( a , b ) − x ) ∂ a ∂ x t x ˉ ( a , b ) ​ then ∂ x t x ˉ ( a , b ) ∂ a = ∂ ∂ a ∑ i = 1 d x i ( a + b − a 2 n − 1 x ^ i ) = ∑ i = 1 d x i ( 1 − 1 2 n − 1 x ^ i ) = ∑ i = 1 d x i ( 1 − s i ) \\frac{\\partial x^t\\bar{x}(a,b)}{\\partial a} = \\frac{\\partial}{\\partial a} \\sum_{i=1}^d x_i \\left(a + \\frac{b-a}{2^n-1}\\hat{x}_i\\right) = \\sum_{i=1}^d x_i \\left(1-\\frac{1}{2^n-1}\\hat{x}_i\\right) = \\sum_{i=1}^d x_i (1-s_i) ∂ a ∂ x t x ˉ ( a , b ) ​ = ∂ a ∂ ​ i = 1 ∑ d ​ x i ​ ( a + 2 n − 1 b − a ​ x ^ i ​ ) = i = 1 ∑ d ​ x i ​ ( 1 − 2 n − 1 1 ​ x ^ i ​ ) = i = 1 ∑ d ​ x i ​ ( 1 − s i ​ ) Where we’ve defined s i = 1 2 n − 1 x ^ i s_i = \\frac{1}{2^n-1}\\hat{x}_i s i ​ = 2 n − 1 1 ​ x ^ i ​ . The final result is that the optimal interval satisfies [ 1 − λ ∥ x ∥ 2 ( ∑ i x i ( 1 − s i ) ) 2 + λ ∑ i ( 1 − s i ) 2 1 − λ ∥ x ∥ 2 ( ∑ i x i ( 1 − s i ) ) ∑ i x i s i + λ ∑ i ( 1 − s i ) s i 1 − λ ∥ x ∥ 2 ( ∑ i x i ( 1 − s i ) ) ∑ i x i s i + λ ∑ i ( 1 − s i ) s i 1 − λ ∥ x ∥ 2 ( ∑ i x i s i ) 2 + λ ∑ i s i 2 ] [ a b ] = [ ∑ i x i ( 1 − s i ) ∑ i x i s i ] \\footnotesize \\left[ \\begin{matrix} \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i(1-s_i)\\right)^2+\\lambda \\sum_i (1-s_i)^2 & \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i(1-s_i)\\right)\\sum_i x_i s_i+\\lambda \\sum_i(1-s_i)s_i \\\\ \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i(1-s_i)\\right)\\sum_i x_i s_i+\\lambda \\sum_i(1-s_i)s_i & \\frac{1-\\lambda}{\\|x\\|^2}\\left(\\sum_i x_i s_i\\right)^2 + \\lambda \\sum_i s_i^2 \\end{matrix} \\right] \\left[\\begin{matrix}a \\\\ b\\end{matrix}\\right]= \\left[\\begin{matrix}\\sum_i x_i(1-s_i) \\\\ \\sum_i x_i s_i \\end{matrix}\\right] [ ∥ x ∥ 2 1 − λ ​ ( ∑ i ​ x i ​ ( 1 − s i ​ ) ) 2 + λ ∑ i ​ ( 1 − s i ​ ) 2 ∥ x ∥ 2 1 − λ ​ ( ∑ i ​ x i ​ ( 1 − s i ​ ) ) ∑ i ​ x i ​ s i ​ + λ ∑ i ​ ( 1 − s i ​ ) s i ​ ​ ∥ x ∥ 2 1 − λ ​ ( ∑ i ​ x i ​ ( 1 − s i ​ ) ) ∑ i ​ x i ​ s i ​ + λ ∑ i ​ ( 1 − s i ​ ) s i ​ ∥ x ∥ 2 1 − λ ​ ( ∑ i ​ x i ​ s i ​ ) 2 + λ ∑ i ​ s i 2 ​ ​ ] [ a b ​ ] = [ ∑ i ​ x i ​ ( 1 − s i ​ ) ∑ i ​ x i ​ s i ​ ​ ] which is trivial to solve for a a a ​ and b b b ​. Taken together with the preprocessing and interval initialization steps this defines OSQ. The query and document vectors are different We noted already that each vector could choose its bit count. Whilst there are certain technical disadvantages to using different bit counts for different document vectors, the query and the document vectors are different. 
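Pulling the initialization and refinement steps together, here is a compact numpy sketch of the per-vector interval optimization described above. It uses the tabulated unit-normal endpoints for initialization, snaps to the nearest grid point when quantizing (the simple variant the text settles on), and solves the 2x2 system in closed form at each round. The value λ = 0.1 matches the choice reported in the experiments below; the round count and the synthetic input vector are assumptions for illustration.

```python
import numpy as np

# Fixed interval endpoints for unit-normal components, from the table above.
Z = {1: 0.798, 2: 1.493, 3: 2.051, 4: 2.514, 7: 3.611}

def osq_interval(x, n_bits=1, lam=0.1, rounds=5):
    """Per-vector interval optimization sketch. `x` is assumed to already be
    centered on the corpus centroid; `rounds` is an illustrative choice."""
    steps = (1 << n_bits) - 1
    norm2 = float(x @ x)

    def quantize(a, b):
        # Snap each component to the nearest grid point (the simple variant).
        return np.round((np.clip(x, a, b) - a) * steps / (b - a))

    def error(a, b, x_hat):
        e = a + (b - a) / steps * x_hat - x
        return (1 - lam) / norm2 * (x @ e) ** 2 + lam * (e @ e)

    # Initialization: scale the unit-normal interval by this vector's stats.
    a = max(x.mean() - Z[n_bits] * x.std(), x.min())
    b = min(x.mean() + Z[n_bits] * x.std(), x.max())
    prev = np.inf
    for _ in range(rounds):
        x_hat = quantize(a, b)                 # line 3 of the loop above
        s = x_hat / steps                      # s_i in [0, 1]
        u, v = 1.0 - s, s
        c = (1.0 - lam) / norm2
        # Line 4: the optimal (a, b) solve the 2x2 linear system above.
        M = np.array([
            [c * (x @ u) ** 2 + lam * (u @ u), c * (x @ u) * (x @ v) + lam * (u @ v)],
            [c * (x @ u) * (x @ v) + lam * (u @ v), c * (x @ v) ** 2 + lam * (v @ v)],
        ])
        rhs = np.array([x @ u, x @ v])
        a, b = np.linalg.solve(M, rhs)
        cur = error(a, b, x_hat)
        if cur > prev:                         # line 5: stop if the error grows
            break
        prev = cur
    return a, b, quantize(a, b)

x = np.random.default_rng(0).normal(size=384)  # stand-in for a centered vector
a, b, x_hat = osq_interval(x, n_bits=1)
```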
In particular, the compression factor the quantization scheme yields only depends on the number of bits used to represent the document vectors. There are side effects from using more bits to represent the query, principally that it can affect the dot product performance. Also there’s a limit to what one can gain based on the document quantization error. However, we get large recall gains by using asymmetric quantization for high compression factors and this translates to significant net wins in terms of recall as a function of query latency. Therefore, we always quantize the query using at least 4 bits. How does it perform? In this section we compare the brute force recall@ k ∣ n k|n k ∣ n and the correlation between the estimated and actual dot products for OSQ and our baseline BBQ quantization scheme, which was an adaptation of RaBitQ to use with an HNSW index. We have previously evaluated this baseline scheme and its accuracy is commensurate with RaBitQ. The authors of RaBitQ did extensive comparisons with alternative methods showing its superiority; we therefore consider it sufficient to simply compare to this very strong baseline. We also perform a couple of ablation studies: against a global interval and against the per vector intervals calculated by our initialization scheme. To compute a global interval we use the OSQ initialization strategy, but compute the mean and variance of the corpus as a whole. For low bit counts, this is significantly better than the usual fixed interval quantization scheme, which tends to use something like the 99th percentile centered confidence interval. High confidence intervals in such cases often completely fail because the grid points are far from the majority of the data. We evaluate against a variety of embeddings e5-small-v2 arctic-embed-m gte-base-en-v1.5 Cohere v2 Gist descriptors and datasets Quora FiQA A sample of around 1M passages from the English portion of wikipedia-22-12 GIST1M The embedding dimensions vary from 384 to 960. E5, Arctic and GTE use cosine similarity, and Cohere and GIST using MIP. The datasets vary from around 100k to 1M vectors. In all our experiments we set λ = 0.1 \\lambda=0.1 λ = 0.1 , which we found to be an effective choice. First off we study 1 bit quantization. We report brute force recall in these experiments to take any effects of the indexing choice out of the picture. As such we do not compare recall vs latency curves, which are very strongly affected by both the indexing data structure and the dot product implementation. Instead we focus on the recall at 10 reranking the top n hits. In our next blog we’ll study how OSQ behaves when it is integrated with Lucene's HNSW index implementation and turn our attention to query latency. The figures below show our brute force recall@ 10 ∣ n 10|n 10∣ n for n ∈ { 10 , 20 , 30 , 40 , 50 } n\\in\\{10,20,30,40,50\\} n ∈ { 10 , 20 , 30 , 40 , 50 } . Rolling these up into the average recall@ 10 ∣ n 10|n 10∣ n for the 8 retrieval experiments we tested we get the table below. n Baseline average recall@10|n OSQ average recall@10|n Global average recall@10|n Initialization average recall@10|n 10 0.71 0.74 0.54 0.65 20 0.88 0.90 0.69 0.81 30 0.93 0.94 0.76 0.86 40 0.95 0.96 0.80 0.89 50 0.96 0.97 0.82 0.90 Compared to the baseline we gain 2% on average in recall. As for the ablation study, compared to using a global quantization interval we gain 26% and compared to our initial per vector quantization intervals we gain 10%. 
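For reference, the recall@k|n metric used throughout these results is straightforward to compute by brute force. A sketch, where `exact` and `approx` are the per-document similarity scores of a single query under the float and quantized representations respectively:

```python
import numpy as np

def recall_at_k_given_n(exact, approx, k=10, n=50):
    """Retrieve n candidates with the quantized scores, rerank them with the
    exact scores, keep the top k, and measure overlap with the exact top k."""
    exact_top_k = set(np.argsort(-exact)[:k])
    candidates = np.argsort(-approx)[:n]
    reranked = candidates[np.argsort(-exact[candidates])][:k]
    return len(exact_top_k & set(reranked)) / k
```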
The figure below shows the floating point dot product values versus the corresponding 1 bit quantized dot product values for a sample of 2000 gte-base-en-v1.5 embeddings. Visually the correlation is high. We can quantify this by computing the R 2 R^2 R 2 between the floating point and quantized dot products. For each dataset and model combination we computed the average R 2 R^2 R 2 for every query against the full corpus. We see small but systematic improvement in R 2 R^2 R 2 comparing OSQ to the baseline. The table below shows the R 2 R^2 R 2 values broken down by the dataset and model combinations we tested. Dataset Model Baseline R2 OSQ R2 FiQA e5-small 0.849 0.865 FiQA arctic 0.850 0.863 FiQA gte 0.925 0.930 Quora e5-small 0.868 0.881 Quora arctic 0.817 0.838 Quora gte 0.887 0.897 Wiki Cohere v2 0.884 0.898 GIST1M - 0.953 0.974 Interestingly, when we integrated OSQ with HNSW we got substantially larger improvements in recall than we see for brute force search, as we’ll show in our next blog. One hypothesis we have is that the improvements we see in correlation with the true floating point dot products are more beneficial for graph search than brute force search. For many queries the tail of high scoring documents are well separated from the bulk of the score distribution and are less prone to being reordered by quantization errors. By contrast we have to navigate through regions of low scoring documents as we traverse the HNSW graph. Here any gains in accuracy can be important. Finally, the table below compares the average recall for 1 and 2 bit OSQ for the same 8 retrieval experiments. With 2 bits we reach 95% recall reranking between 10 and 20 candidates. The average R 2 R^2 R 2 rises from 0.893 for 1 bit to 0.968 for 2 bit. n 1 bit OSQ average recall@10|n 2 bit OSQ average recall@10|n 10 0.74 0.84 20 0.90 0.97 30 0.94 0.99 40 0.96 0.995 50 0.97 0.997 Conclusion We’ve presented an improved automatic scalar quantization scheme which allows us to achieve high recall with relatively modest reranking depth. Avoiding deep reranking has significant system advantages. For 1 bit quantization we compared it to a very strong baseline and showed it was able to achieve systematic improvements in both recall and the accuracy with which it approximates the floating point vector distance calculation. Therefore, we feel comfortable saying that it sets new state-of-the-art performance for at 32 × \\times × compression of the raw vectors. It also allows one to simply trade compression for retrieval quality using the same underlying approach and achieves significant performance improvements compared to standard scalar quantization techniques. We are working on integrating this new approach into Elasticsearch. In our next blog we will discuss how it is able to enhance the performance of our existing BBQ scalar quantization index formats. Report an issue Related content Vector Database Lucene December 4, 2024 Smokin' fast BBQ with hardware accelerated SIMD instructions How we optimized vector comparisons in BBQ with hardware accelerated SIMD (Single Instruction Multiple Data) instructions. CH By: Chris Hegarty Lucene Vector Database November 11, 2024 Better Binary Quantization (BBQ) in Lucene and Elasticsearch How Better Binary Quantization (BBQ) works in Lucene and Elasticsearch. BT By: Benjamin Trent Vector Database Lucene +1 October 22, 2024 RaBitQ binary quantization 101 Understand the most critical components of RaBitQ binary quantization, how it works and its benefits. 
This guide also covers the math behind the quantization and examples. JW By: John Wagster Vector Database Lucene +1 November 18, 2024 Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. BT By: Benjamin Trent Lucene ML Research November 11, 2023 Understanding scalar quantization in Lucene Explore how Elastic introduced scalar quantization into Lucene, including automatic byte quantization, quantization per segment & performance insights. BT By: Benjamin Trent Jump to Measuring success Motivation Dot products are enough Scalar quantization refresher The need for speed Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Understanding optimized scalar quantization - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/scalar-quantization-optimization","meta_description":"Get a refresher of scalar quantization and learn about optimized scalar quantization, a new form of scalar quantization we've developed at Elastic."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Accessing machine learning models in Elastic Explore the machine learning (ML) models supported in Elastic, the Eland library for loading models and how to apply transformers & NLP in Elastic. Integrations BS JD By: Bernhard Suhm and Josh Devins On June 21, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This blog explores the machine learning (ML) models supported in Elastic, including built-in, third-party, and custom models. It also discusses the Eland library for loading models and explains how to apply transformers and NLP in Elastic, which is a common use case in the context of search applications. Elastic supports the machine learning models you need Elastic lets you apply the machine learning (ML) that’s appropriate for your use case and level of ML expertise. You have multiple options: Leverage the models that come built-in. Aside from models targeting specific security threats and types of system issues in our observability and security solution, you can use our proprietary Elastic Learned Sparse Encoder model out of the box, as well as a language identification — useful if you’re working with non-English text data. Access third-party PyTorch models from anywhere including the HuggingFace model hub. Load a model you trained yourself — primarily NLP transformers at this point. 
Using built-in models gets you value out of the box, without requiring any ML expertise from you, yet you have the flexibility to try out different models and determine what performs best on your data We designed our model management to be scalable across multiple nodes in a cluster, while also ensuring good inference performance for both high throughput and low latency workloads. That’s in part by empowering ingest pipelines to run inference and by using dedicated nodes for the computationally demanding model inference — during the ingestion phase, as well as data analysis and search. Read on to learn more about the Eland library that lets you load models into Elastic and how that plays out for the various types of machine learning you might use within Elasticsearch — from the latest transformer and natural language processing (NLP) models to boosted tree models for regression. Eland lets you load ML models into Elastic Our Eland library provides an easy interface to load ML models into Elasticsearch — provided they were trained using PyTorch. Using the native library libtorch, and expecting models that have been exported or saved as a TorchScript representation, Elasticsearch avoids running a Python interpreter while performing model inference. By integrating with one of the most popular formats for building NLP models in PyTorch, Elasticsearch can provide a platform that works with a large variety of NLP tasks and use cases. We’ll get more into that in the section on transformers that follows. You have three options for using Eland to upload a model: command-line, Docker, and from within your own Python code. Docker is less complex because it does not require a local installation of Eland and all of its dependencies. Once you have access to Eland, the code sample below shows how to upload a DistilBERT NER model, as an example: Further below we’ll walk through each of the arguments of eland_import_hub_model. And you can issue the same command from a Docker container. Once uploaded, Kibana’s ML Model Management user interface lets you manage the models on an Elasticsearch cluster, including increasing allocations for additional throughput, and stop/resume models while (re)configuring your system. Which models Elastic support? Elastic supports a variety of transformer models, as well as the most popular supervised learning libraries: NLP and embedding models: All transformers that conform to the standard BERT model interface and use the WordPiece tokenization algorithm. View a complete list of supported model architectures. Supervised learning: Trained models from scikit-learn, XGBoost, and LightGBM libraries to be serialized and used as an inference model in Elasticsearch. Our documentation provides an example for training an XGBoost classify on data in Elastic . You can also export and import supervised models trained in Elastic with our data frame analytics. Generative AI: You can use the API provided for the LLM to pass queries — potentially enriched with context retrieved from Elastic — and process the results returned. For further instructions, refer to this blog , which links to a GitHub repository with example code for communicating via ChatGPT’s API. Below we provide more information for the type of model you’re most likely to use in the context of search applications: NLP transformers. How to apply transformers and NLP in Elastic, with ease! 
Let us walk you through the steps to load and use an NLP model, for example a popular NER model from Hugging Face, going over the arguments identified in below code snippet. Specify the Elastic Cloud identifier. Alternatively, use --url . Provide authentication details to access your cluster. You can look up available authentication methods . Specify the identifier for the model in the Hugging Face model hub. Specify the type of NLP task. Supported values are fill_mask, ner, text_classification, text_embedding, and zero_shot_classification. Once you’ve loaded the model, next you need to deploy it. You accomplish this on the Model Management screen of the Machine Learning tab in Kibana. Then you’d typically test the model to ensure it’s working properly. Now you’re ready to use the deployed model for inference. For example to extract named entities, you call the _infer endpoint on the loaded NER model: The model identifies two entities: the person \"Josh\" and the location \"Berlin.\" For additional steps, like using this model in an inference pipeline and tuning the deployment, read the blog that describes this example. Want to see how to apply semantic search — for example, how to create embeddings for text and then apply vector search to find related documents? This blog lays that out step-by-step, including validating model performance. Don’t know which type of task for which model? This table should help you get started. Hugging Face Model task-type Named entity recognition ner Text embedding text_embedding Text classification text_classification Zero shot classification zero_shot_classification Question answering question_answering Elastic also supports comparing how similar two pieces of text are to each other as text_similarity task-type — this is useful for ranking document text when comparing it to another provided text input, and it’s sometimes referred to as cross-encoding. Check these resources for more details Support for PyTorch transformers, including design considerations for Eland Steps for loading transformers into Elastic and using them in inference Blog describing how to query your proprietary data using ChatGPT Adapt a pre trained transformer to a text classification task, and load the custom model into Elastic Built-in language identification that lets you identify non-English text before passing into models that support only English Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Jump to Elastic supports the machine learning models you need Eland lets you load ML models into Elastic Which models Elastic support? How to apply transformers and NLP in Elastic, with ease! Check these resources for more details Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Accessing machine learning models in Elastic - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-machine-learning-models","meta_description":"Explore the machine learning (ML) models supported in Elastic, the Eland library for loading models and how to apply transformers & NLP in Elastic."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving the ES|QL editor experience in Kibana With the new ES|QL language becoming GA, a new editor experience has been developed in Kibana to help users write faster and better queries. Features like live validation, improved autocomplete and quick fixes will streamline the ES|QL experience. ES|QL ML By: Marco Liberati On December 31, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We’re going to cover the guiding principles behind improving the ES|QL editor experience in Kibana and what we did to achieve that goal. We’ll cover features like live validation, improved autocomplete and quick fixes which all streamline the ES|QL experience. ES|QL background Since Elastic 8.11, a technical preview is now available of Elastic’s new piped query language, ES|QL (Elasticsearch Query Language), which transforms, enriches, and simplifies data investigations. Powered by a new query engine, ES|QL delivers advanced search capabilities with concurrent processing, improving speed and efficiency, irrespective of data source and structure. 
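As a point of reference for the kind of query the editor helps you write, here is a small, hedged example of a piped ES|QL query issued from the Python client; the index and field names are invented for illustration, and the esql.query helper assumes a reasonably recent client against an 8.11+ cluster.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# A typical piped ES|QL query: filter, aggregate, sort, limit.
# `logs-web` and its fields are hypothetical example names.
response = es.esql.query(
    query="""
    FROM logs-web
    | WHERE http.response.status_code >= 500
    | STATS errors = COUNT(*) BY host.name
    | SORT errors DESC
    | LIMIT 10
    """
)

for row in response["values"]:
    print(row)
```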
Accelerate resolution by creating aggregations and visualizations from one screen, delivering an iterative, uninterrupted workflow. As a developer, learning a new language can be both an interesting challenge and a frustrating scenario. For a query language having nice syntax, lots of documentation and examples makes it click, but then moving from the walled garden of the documentation examples into the real world queries can be challenging. When adopting a new language as a developer, I’m interested in quickly iterating and jumping from a trial and error environment to the documentation to check more in-depth topics about syntax, limits and caveats. Writing a correct ES|QL query should be easy With ES|QL, we want to provide the best possible experience for developers to push on all of the possibilities that modern web editors can provide. As such, the ES|QL editor in Kibana has a critical role as it is one of the main mediums for users to approach the new language. Improving its user experience is of high importance to us. In order to improve the user experience in the editor, these four principles have been identified: The user should not need to memorize all of the knowledge regarding indices/fields/policies/functions etc… It should take seconds, not minutes, to understand what’s wrong with a query. Autocomplete should make it easy for users to build the right queries. The user should not be blamed for errors, rather the editor should help to fix them. Catching ES|QL errors early on (and fix them) with Kibana editor In 8.13 ES|QL in Discover offers a complete client side validation engine, making it easy to catch potential errors before submitting a query to Elasticsearch. The validation runs while typing and offers immediate feedback for the incorrect parts of the query: (When expanded it is possible to inspect specific errors in the ES|QL editor with cursor hovering) The validation has some resiliency to syntax errors and can still provide useful information to the user with incomplete queries. (The ES|QL can validate the entire query at multiple points: errors are collected and fully reported on demand) As a developer that is comfortable using IDEs in my daily coding environment, I’m used to the quick fix menu that provides suggestions on how to address common problems like spelling errors or using the wrong quotes. Kibana uses the Monaco editor under the hood, which is a smaller version of the VSCode editor, and that provides an interface to deliver a similar feature also on the web. An initial quick fix feature has been developed and some basic suggestions are already supported: (The new ES|QL will leverage internal knowledge to propose a quick fix with existing indexes) Current list of supported quick fixes includes: Wrong field quoting Wrong literals quoting Index, field (and metafields), function, policies typos …and more will be added in subsequent versions The quick fixes function is still in its initial development and we are looking for feedback and enhancement requests. Better ES|QL auto-complete in Kibana editor Since its’ release ES|QL has been shipped with a basic auto-complete feature in the Kibana editor, which already provides some useful suggestions for first time users. In 8.13 the autocomplete logic has been refactored with improved feedback for the user, leveraging all field and function types, together with a deep knowledge of its ES|QL implementation in Elasticsearch. 
In simple terms this means that from 8.13 on autocomplete will only suggest the right “thing” in many scenarios that were before uncovered. A list (not necessarily complete) of covered features are: Propose the right function, even when used within another function: (The ES|QL autocomplete knows what functions are compatible with each other, even when nested) Propose the right argument for a function, either filtering out fields by type or proposing the right constants (Autocomplete can help with special enums for particular functions, listing all of them directly) Know when to quote or not a field/index name The new autocomplete attempts to reduce the amount of information the user has to keep in mind in order to build a query, with both the application of many contextual type filters and leveraging some deep knowledge of the new language syntax. Provide more contextual help in ES|QL Kibana editor The new autocomplete contains the hidden gem of providing a full contextual help for any proposed suggestion, in particular for functions or commands with examples. (Autocomplete can provide full inline documentation with examples on demand for commands and functions) Another useful way to get more information within the editor is hover over specific parts of the query, like the policy name to gather more metadata information about it. (Contextual tooltips helps with quick summaries of enrich policies with same basic informations) Make the most of your data with the ES|QL Kibana editor In this post, we showcased some of the new ES|QL Kibana editor features. In summary, the list of features are as follows: Users can get immediate feedback when typing a query about syntax and/or invalid query statement Users can quickly get fix suggestions on some specific errors Index, fields and policies are automatically suggested to the users in the right place Help is provided inline with full documentation and examples. Elastic invites SREs and developers to experience this editor feature firsthand and unlock new horizons in their data tasks. Try it today at https://ela.st/free-trial now. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau ES|QL Ruby +1 October 24, 2024 How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. FB By: Fernando Briano ES|QL Python +1 September 5, 2024 From ES|QL to native Pandas dataframes in Python Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples. QP By: Quentin Pradet ES|QL Python August 20, 2024 An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. 
BA By: Baha Azarmi Jump to ES|QL background Writing a correct ES|QL query should be easy Catching ES|QL errors early on (and fix them) with Kibana editor Better ES|QL auto-complete in Kibana editor Provide more contextual help in ES|QL Kibana editor Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Improving the ES|QL editor experience in Kibana - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/improving-esql-editor-experience-in-kibana","meta_description":"Learn about the ES|QL editor in Kibana and its features, such as live validation, improved autocomplete, and quick fixes."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch vs. OpenSearch: Vector Search Performance Comparison Elasticsearch is out-of-the-box 2x–12x faster than OpenSearch for vector search Vector Database Lucene US By: Ugo Sangiorgi On June 26, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. TLDR: Elasticsearch is up to 12x faster - We at Elastic have received numerous requests from our community to clarify the performance differences between Elasticsearch and OpenSearch, particularly in the realm of Semantic Search / Vector Search, so we have undertaken this performance testing to provide a clear, data-driven comparison — no ambiguity, just straightforward facts to inform our users. The results show that Elasticsearch is up to 12x faster than OpenSearch for vector search and therefore requires fewer computational resources. This reflects Elastic's focus on consolidating Lucene as the best vector database for search and retrieval use cases. Vector search is revolutionizing the way we conduct similarity searches, particularly in fields like AI and machine learning. With the increasing adoption of vector embedding models, the ability to efficiently search through millions of high-dimension vectors becomes critical. When it comes to powering vector databases, Elastic and OpenSearch have taken notably different approaches. Elastic has invested heavily in optimizing Apache Lucene together with Elasticsearch to elevate them as the top-tier choice for vector search applications. In contrast, OpenSearch has broadened its focus, integrating other vector search implementations and exploring beyond Lucene's scope. Our focus on Lucene is strategic, enabling us to provide highly integrated support in our version of Elasticsearch, resulting in an enhanced feature set where each component complements and amplifies the capabilities of the other. 
This blog presents a detailed comparison between Elasticsearch 8.14 and OpenSearch 2.14 accounting for different configurations and vector engines. In this performance analysis, Elasticsearch proved to be the superior platform for vector search operations, and upcoming features will widen the differences even more significantly. When pitted against OpenSearch, it excelled in every benchmark track — offering 2x to 12x faster performance on average. This was across scenarios using varying vector amounts and dimensions including so_vector (2M vectors, 768D), openai_vector (2.5M vectors, 1536D), and dense_vector (10M vectors, 96D), all available in this repository alongside the Terraform scripts for provisioning all the required infrastructure on Google Cloud and Kubernetes manifests for running the tests. The results detailed in this blog complement the results from a previously published and third-party validated study that shows Elasticsearch is 40%–140% faster than OpenSearch for the most common search analytics operations: Text Querying, Sort, Range, Date Histogram and Terms filtering. Now we can add another differentiator: Vector Search. Up to 12x faster out-of-the-box Our focused benchmarks across the four vector data sets involved both Approximate KNN and Exact KNN searches, considering different sizes, dimensions and configurations, totaling 40,189,820 uncached search requests. The results: Elasticsearch is up to 12x faster than OpenSearch for vector search and therefore requires fewer computational resources. Figure 1: Grouped tasks for ANN and Exact KNN across different combinations in Elasticsearch and OpenSearch. Groups like knn-10-100 mean a KNN search with k:10 and n:100. In HNSW vector search, k determines the number of nearest neighbors to retrieve for a query vector. It specifies how many similar vectors to find as a result. n sets the number of candidate vectors to retrieve at each segment. More candidates can enhance accuracy but require greater computational resources. We also tested with different quantization techniques and leveraged engine-specific optimizations; the detailed results for each track, task and vector engine are available below. Exact KNN and Approximate KNN When dealing with varying data sets and use cases, the right approach for vector search will differ. In this blog all tasks stated as knn-* like knn-10-100 use Approximate KNN and script-score-* refer to Exact KNN , but what is the difference between them, and why are they important? In essence, if you're handling more substantial data sets, the preferred method is the Approximate K-Nearest Neighbor (ANN) due to its superior scalability. For more modest data sets that may require a filtration process, the Exact KNN method is ideal. Exact KNN uses a brute-force method, calculating the distance between one vector and every other vector in the data set. It then ranks these distances to find the k nearest neighbors. While this method ensures an exact match, it suffers from scalability challenges for large, high-dimensional data sets. However, there are many cases in which Exact KNN is needed: Rescoring : In scenarios involving lexical or semantic searches followed by vector-based rescoring, Exact KNN is essential. For example, in a product search engine, initial search results can be filtered based on textual queries (e.g., keywords, categories), and then vectors associated with the filtered items are used for a more accurate similarity assessment.
Personalization : When dealing with a large number of users, each represented by a relatively small number (like 1 million) of distinct vectors, sorting the index by user-specific metadata (e.g., user_id) and brute-force scoring with vectors becomes efficient. This approach allows for personalized recommendations or content delivery based on precise vector comparisons tailored to individual user preferences. Exact KNN therefore ensures that the final ranking and recommendations based on vector similarity are precise and tailored to user preferences. Approximate KNN (or ANN) on the other hand employs methods to make data searching faster and more efficient than Exact KNN, especially in large, high-dimensional data sets. Instead of a brute-force approach, which measures the exact nearest distance between a query and all points leading to computation and scaling challenges, ANN uses certain techniques to efficiently restructure the indexes and dimensions of searchable vectors in the data set. While this may cause a slight inaccuracy, it significantly boosts the speed of the search process, making it an effective alternative for dealing with large data sets. In this blog all tasks stated as knn-* like knn-10-100 use Approximate KNN and script-score-* refer to Exact KNN . Testing methodology While Elasticsearch and OpenSearch are similar in terms of API for BM25 search operations, since the latter is a fork of the former, it is not the case for Vector Search, which was introduced after the fork. OpenSearch took a different approach than Elasticsearch when it comes to algorithms, by introducing two other engines — nmslib and faiss — apart from lucene , each with their specific configurations and limitations (e.g., nmslib in OpenSearch does not allow for filters, an essential feature for many use cases). All three engines use the Hierarchical Navigable Small World (HNSW) algorithm, which is efficient for approximate nearest neighbor search, and especially powerful when dealing with high-dimensional data. It's important to note that faiss also supports a second algorithm, ivf , but since it requires pre-training on the data set, we are going to focus solely on HNSW. The core idea of HNSW is to organize the data into multiple layers of connected graphs, with each layer representing a different granularity of the data set. The search begins at the top layer with the coarsest view and progresses down to finer and finer layers until reaching the base level. Both search engines were tested under identical conditions in a controlled environment to ensure fair testing grounds. The method applied is similar to this previously published performance comparison , with dedicated node pools for Elasticsearch, OpenSearch, and Rally. The terraform script is available (alongside all sources) to provision a Kubernetes cluster with: 1 Node pool for Elasticsearch with 3 e2-standard-32 machines (128GB RAM and 32 CPUs) 1 Node pool for OpenSearch with 3 e2-standard-32 machines (128GB RAM and 32 CPUs) 1 Node pool for Rally with 2 t2a-standard-16 machines (64GB RAM and 16 CPUs) Each \"track\" (or test) ran for 10 times for each configuration, which included different engines, different configurations and different vector types. The tracks have tasks that repeat between 1000 and 10000 times, depending on the track. If one of the tasks in a track failed for instance due to a network timeout, then all tasks were discarded, so all results represent tracks that started and finished without problems. 
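To connect the knn-k-n task names to an actual request, here is a hedged sketch of an approximate kNN search issued through the Python client; in Elasticsearch the n from the task names corresponds to num_candidates, and the index name, field name, and vector values are invented for illustration.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Truncated example vector; a real query vector has the full dimensionality.
query_vector = [0.12, -0.03, 0.87]

# Roughly what a "knn-10-100" task issues: the 10 nearest neighbors,
# gathered from 100 candidates before final ranking.
response = es.search(
    index="openai_vector",             # hypothetical index name
    knn={
        "field": "embedding",          # hypothetical dense_vector field
        "query_vector": query_vector,
        "k": 10,
        "num_candidates": 100,
    },
    source=False,
)

print([hit["_id"] for hit in response["hits"]["hits"]])
```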
All test results are statistically validated, ensuring that improvements aren’t coincidental. Detailed findings Why compare using the 99th percentile and not the average latency? Consider a hypothetical example of average house prices in a certain neighborhood. The average price may indicate an expensive area, but on closer inspection, it may turn out that most homes are valued much lower, with only a few luxury properties inflating the average figure. This illustrates how the average price can fail to accurately represent the full spectrum of house values in the area. This is akin to examining response times, where the average can conceal critical issues. Tasks Approximate KNN with k:10 n:50 Approximate KNN with k:10 n:100 Approximate KNN with k:100 n:1000 Approximate KNN with k:10 n:50 and keyword filters Approximate KNN with k:10 n:100 and keyword filters Approximate KNN with k:100 n:1000 and keyword filters Approximate KNN with k:10 n:100 in conjunction with indexing Exact KNN (script score) Vector engines lucene in Elasticsearch and OpenSearch, both on version 9.10 faiss in OpenSearch nmslib in OpenSearch Vector types hnsw in Elasticsearch and OpenSearch int8_hnsw in Elasticsearch (HNSW with automatic 8 bit quantization: link ) sq_fp16 hnsw in OpenSearch (HNSW with automatic 16 bit quantization: link ) Out-of-the-box and Concurrent Segment Search As you probably know, Lucene is a highly performant text search engine library written in Java that serves as the backbone for many search platforms like Elasticsearch, OpenSearch, and Solr. At its core, Lucene organizes data into segments, which are essentially self-contained indices that allow Lucene to execute searches more efficiently. So when you issue a search to any Lucene-based search engine, your search will end up being executed in those segments, either sequentially or in parallel. OpenSearch introduced concurrent segment search as an optional flag, and does not use it by default, you must enable it using a special index setting index.search.concurrent_segment_search.enabled as detailed here , with some limitations . Elasticsearch on the other hand searches on segments concurrently out-of-the-box , therefore the comparisons we make in this blog will take into consideration, on top of the different vector engines and vector types, also the different configurations: Elasticsearch ootb: Elasticsearch out-of-the-box, with concurrent segment search; OpenSearch ootb: without concurrent segment search enabled; OpenSearch css: with concurrent segment search enabled Now, let’s dive into some detailed results for each vector data set tested: 2.5 million vectors, 1536 dimensions (openai_vector) Starting with the simplest track, but also the largest in terms of dimensions, openai_vector - which uses the NQ data set enriched with embeddings generated using OpenAI's text-embedding-ada-002 model . It is the simplest since it tests only Approximate KNN and has only 5 tasks. It tests in standalone (without indexing) as well as alongside indexing, and using a single client and 8 simultaneous clients. 
Tasks standalone-search-knn-10-100-multiple-clients : searching on 2.5 million vectors with 8 clients simultaneously, k: 10 and n:100 standalone-search-knn-100-1000-multiple-clients : searching on 2.5 million vectors with 8 clients simultaneously, k: 100 and n:1000 standalone-search-knn-10-100-single-client : searching on 2.5 million vectors with a single client, k: 10 and n:100 standalone-search-knn-100-1000-single-client : searching on 2.5 million vectors with a single client, k: 100 and n:1000 parallel-documents-indexing-search-knn-10-100 : searching on 2.5 million vectors while also indexing an additional 100,000 documents, k:10 and n:100 The averaged p99 performance is outlined below: Here we observed that Elasticsearch is between 3x-8x faster than OpenSearch when performing vector search alongside indexing (i.e. read+write) with k:10 and n:100, and 2x-3x faster without indexing for the same k and n. For k:100 and n:1000 ( standalone-search-knn-100-1000-single-client and standalone-search-knn-100-1000-multiple-clients ), Elasticsearch is 2x to 7x faster than OpenSearch, on average. The detailed results show the exact cases and vector engines compared: Recall knn-recall-10-100 knn-recall-100-1000 Elasticsearch-8.14.0@lucene-hnsw 0.969485 0.995138 Elasticsearch-8.14.0@lucene-int8_hnsw 0.781445 0.784817 OpenSearch-2.14.0@lucene-hnsw 0.96519 0.995422 OpenSearch-2.14.0@faiss 0.984154 0.98049 OpenSearch-2.14.0@faiss-sq_fp16 0.980012 0.97721 OpenSearch-2.14.0@nmslib 0.982532 0.99832 10 million vectors, 96 dimensions (dense_vector) Next up is dense_vector , with 10M vectors and 96 dimensions. It is based on the Yandex DEEP1B image data set. The data set is created from the first 10 million vectors of the \"sample data\" file called learn.350M.fbin . The search operations use vectors from the \"query data\" file query.public.10K.fbin . Both Elasticsearch and OpenSearch perform very well on this data set, especially after a force merge, which is usually done on read-only indices and is similar to defragmenting the index to have a single \"table\" to search on. Tasks Each task warms up for 100 requests and then 1000 requests are measured knn-search-10-100 : searching on 10 million vectors, k: 10 and n:100 knn-search-100-1000 : searching on 10 million vectors, k: 100 and n:1000 knn-search-10-100-force-merge : searching on 10 million vectors after a force merge, k: 10 and n:100 knn-search-100-1000-force-merge : searching on 10 million vectors after a force merge, k: 100 and n:1000 knn-search-100-1000-concurrent-with-indexing : searching on 10 million vectors while also updating 5% of the data set , k: 100 and n:1000 script-score-query : Exact KNN search of 2000 specific vectors . Both Elasticsearch and OpenSearch performed well for Approximate KNN. When the index is merged (i.e. has just a single segment) in knn-search-100-1000-force-merge and knn-search-10-100-force-merge , OpenSearch performs better than the others when using nmslib and faiss , even though they are all around 15ms and all very close. However, when the index has multiple segments (a typical situation where an index receives updates to its documents) in knn-search-10-100 and knn-search-100-1000 , Elasticsearch keeps the latency at about ~7ms and ~16ms, while all other OpenSearch engines are slower.
Also when the index is being searched and written to at the same time ( knn-search-100-1000-concurrent-with-indexing ), Elasticsearch maintains the latency below 15ms (at 13.8ms), being almost 4x faster than OpenSearch out-of-the-box (49.3ms) and still faster when concurrent segment search is enabled (17.9ms), although that gap is too small to be significant. As for Exact KNN, the difference is much larger: Elasticsearch is 6x faster than OpenSearch (~260ms vs ~1600ms). Recall knn-recall-10-100 knn-recall-100-1000 Elasticsearch-8.14.0@lucene-hnsw 0.969843 0.996577 Elasticsearch-8.14.0@lucene-int8_hnsw 0.775458 0.840254 OpenSearch-2.14.0@lucene-hnsw 0.971333 0.996747 OpenSearch-2.14.0@faiss 0.9704 0.914755 OpenSearch-2.14.0@faiss-sq_fp16 0.968025 0.913862 OpenSearch-2.14.0@nmslib 0.9674 0.910303 2 million vectors, 768 dimensions (so_vector) This track , so_vector , is derived from a dump of StackOverflow posts downloaded on April 21st, 2022. It only contains question documents — all documents representing answers have been removed. Each question title was encoded into a vector using the sentence transformer model multi-qa-mpnet-base-cos-v1 . This data set contains the first 2 million questions. Unlike the previous track, each document here contains other fields besides vectors to support testing features like Approximate KNN with filtering and hybrid search. nmslib for OpenSearch is notably absent in this test since it does not support filters . Tasks Each task warms up for 100 requests and then 100 requests are measured. Note the tasks were grouped for the sake of simplicity, since the test contains 16 search types * 2 different k values * 3 different n values. knn-10-50 : searching on 2 million vectors without filters, k:10 and n:50 knn-10-50-filtered : searching on 2 million vectors with filters , k:10 and n:50 knn-10-50-after-force-merge : searching on 2 million vectors with filters and after a force merge, k:10 and n:50 knn-10-100 : searching on 2 million vectors without filters, k:10 and n:100 knn-10-100-filtered : searching on 2 million vectors with filters , k:10 and n:100 knn-10-100-after-force-merge : searching on 2 million vectors with filters and after a force merge, k:10 and n:100 knn-100-1000 : searching on 2 million vectors without filters, k:100 and n:1000 knn-100-1000-filtered : searching on 2 million vectors with filters , k:100 and n:1000 knn-100-1000-after-force-merge : searching on 2 million vectors with filters and after a force merge, k:100 and n:1000 exact-knn : Exact KNN search with and without filters . Elasticsearch is consistently faster than OpenSearch out-of-the-box on this test; in only two cases is OpenSearch faster, and not by much ( knn-10-100 and knn-100-1000 ). Tasks involving knn-10-50 , knn-10-100 and knn-100-1000 in combination with filters show a difference of up to 7x (112ms vs 803ms). The performance of both solutions seems to even out after a \"force merge\", understandably, as evidenced by knn-10-50-after-force-merge , knn-10-100-after-force-merge and knn-100-1000-after-force-merge. On those tasks faiss is faster. The performance for Exact KNN is once again very different, with Elasticsearch being 13 times faster than OpenSearch this time (~385ms vs ~5262ms).
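For comparison with the approximate case, a minimal sketch of the exact (brute-force) kNN pattern benchmarked here is shown below, using a script_score query from the Python client; the index and vector field names are invented, and the real track's query bodies may differ.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

query_vector = [0.12, -0.03, 0.87]  # truncated example vector

# Exact KNN: score every matching document against the query vector.
# Adding 1.0 keeps scores non-negative, since cosine similarity can be negative.
response = es.search(
    index="so_vector",  # hypothetical index name
    query={
        "script_score": {
            "query": {"match_all": {}},  # or a filter to restrict the candidate set
            "script": {
                "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    },
    size=10,
)

print([hit["_id"] for hit in response["hits"]["hits"]])
```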
Recall knn-recall-10-100 knn-recall-100-1000 knn-recall-10-50 Elasticsearch-8.14.0@lucene-hnsw 1 1 1 Elasticsearch-8.14.0@lucene-int8_hnsw 1 0.986667 1 OpenSearch-2.14.0@lucene-hnsw 1 1 1 OpenSearch-2.14.0@faiss 1 1 1 OpenSearch-2.14.0@faiss-sq_fp16 1 1 1 OpenSearch-2.14.0@nmslib 0.9674 0.910303 0.976394 Elasticsearch and Lucene as clear victors At Elastic, we are relentlessly innovating Apache Lucene and Elasticsearch to ensure we are able to provide the premier vector database for search and retrieval use cases, including RAG (Retrieval Augmented Generation). Our recent advancements have dramatically increased performance, making vector search faster and more space efficient than before, building upon the gains from Lucene 9.10. This blog presented a study that shows when comparing up-to-date versions Elasticsearch is up to 12 times faster than OpenSearch. It's worth noting both products use the same version of Lucene ( Elasticsearch 8.14 Release Notes and OpenSearch 2.14 Release Notes ). The pace of innovation at Elastic will deliver even more not only for our on-premises and Elastic Cloud customers but those using our stateless platform . Features like support for scalar quantization to int4 will be offered with rigorous testing to ensure customers can utilize these techniques without a significant drop in recall, similar to our testing for int8 . Vector search efficiency is becoming a non-negotiable feature in modern search engines due to the proliferation of AI and machine learning applications. For organizations looking for a powerful search engine capable of keeping up with the demands of high-volume, high-complexity vector data, Elasticsearch is the definitive answer. Whether expanding an established platform or initiating new projects, integrating Elasticsearch for vector search needs is a strategic move that will yield tangible, long-term benefits. With its proven performance advantage, Elasticsearch is poised to underpin the next wave of innovations in search. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Up to 12x faster out-of-the-box Exact KNN and Approximate KNN Testing methodology Detailed findings Tasks Show more Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch vs. OpenSearch: Vector Search Performance Comparison - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-opensearch-vector-search-performance-comparison","meta_description":"Elasticsearch is out-of-the-box 2x–12x faster than OpenSearch for vector search"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Making Elasticsearch and Lucene the best vector database: up to 8x faster and 32x efficient Discover the recent enhancements and optimizations that notably improve vector search performance in Elasticsearch & Lucene vector database. Vector Database Generative AI MS BT JF By: Mayya Sharipova , Benjamin Trent and Jim Ferenczi On April 26, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch and Lucene report card: noteworthy speed and efficiency investments Our mission at Elastic is to make Apache Lucene the best vector database out there, and to continue to make Elasticsearch the best retrieval platform out there for search and RAG. Our investments into Lucene are key to ensure that every release of Elasticsearch brings increasing faster performance and scale. Customers are already building the next generation of AI enabled search applications with Elastic’s vector database and vector search technology. Roboflow is used by over 500,000 engineers to create datasets, train models, and deploy computer vision models to production. Roboflow uses Elastic vector database to store and search billions of vector embeddings. In this blog we summarize recent enhancements and optimisations that significantly improve vector search performance in Elasticsearch and Apache Lucene, over and above performance gains delivered with Lucene 9.9 and Elasticsearch 8.12.x. The integration of vector search into Elasticsearch relies on Apache Lucene, the layer that orchestrates data storage and retrieval. Lucene's architecture organizes data into segments, immutable units that undergo periodic merging. This structure allows for efficient management of inverted indices, essential for text search. With vector search, Lucene extends its capabilities to handle multi-dimensional points, employing the hierarchical navigable small world (HNSW) algorithm to index vectors. This approach facilitates scalability, enabling data sets to exceed available RAM size while maintaining performance. 
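For context on where those HNSW graphs come from, this is roughly how a vector field is declared; a hedged sketch using the Python client, where the index name, dimension count, and HNSW parameters are illustrative rather than recommendations.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint

# Hypothetical index: each document carries a 768-dimensional embedding that
# Lucene indexes into an HNSW graph (here with int8 scalar quantization).
es.indices.create(
    index="passages",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {
                    "type": "int8_hnsw",  # "hnsw" keeps the raw float32 vectors
                    "m": 16,
                    "ef_construction": 100,
                },
            },
        }
    },
)
```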
Additionally, Lucene's segment-based approach offers lock-free search operations, supporting incremental changes and ensuring visibility consistency across various data structures. The integration however comes with its own engineering challenges. Merging segments requires recomputing HNSW graphs, incurring index-time overhead. Searches must cover multiple segments, leading to possible latency overhead. Moreover, optimal performance requires scaling RAM as data grows, which may raise resource management concerns. Lucene's integration into Elasticsearch comes with the benefit of robust vector search capabilities. This ranges from aggregations, document-level security, geo-spatial queries and pre-filtering to full compatibility with various Elasticsearch features. Imagine running vector searches using a geo bounding box; this is an example use case enabled by Elasticsearch and Lucene. Lucene's architecture lays a solid foundation for efficient and versatile vector search within Elasticsearch. Let’s explore optimization strategies and enhancements we have implemented to integrate vector search into Lucene, which delivers high performance and a comprehensive feature set for developers. Harnessing Lucene's architecture for multi-threaded search Lucene's segmented architecture enables the implementation of multi-threaded search capabilities. Elasticsearch’s performance gains come from efficiently searching multiple segments simultaneously. Latency of individual searches is significantly reduced by using the processing power of all available CPU cores. While it may not directly improve overall throughput, this enhancement prioritizes minimizing response times, ensuring that users receive their search results as swiftly as possible. Furthermore, this optimization is particularly beneficial for Hierarchical Navigable Small World (HNSW) searches, as each graph is independent of the others and can be searched in parallel, maximizing efficiency and speeding up retrieval times even further. The advantage of having multiple independent segments extends to the architectural level, especially in serverless environments. In this new architecture, the indexing tier is responsible for creating new segments, each containing its own HNSW graph. The search tier can simply replicate these segments without incurring the CPU cost of indexation. This separation allows a significant portion of compute resources to be dedicated to searches, optimizing overall system performance and responsiveness. Accelerating multi-graph vector search In spite of gains achieved with parallelization, each segment's searches would remain independent, unaware of progress made by other segment searches. So our focus shifted towards optimizing the efficiency of concurrent searches across multiple segments. The graph shows that the number of queries per second increased from 104 queries/sec to 219 queries/sec. Recognizing the potential for further speedups, we leveraged our insights from optimizing lexical search to enable information exchange among segment searches, allowing for better coordination and efficiency in vector search. Our strategy for accelerating multi-graph vector search revolves around balancing exploration and exploitation within the proximity graph. By adjusting the size of the expanded match set, we control the trade-off between runtime and recall, crucial for achieving optimal search performance across multiple graphs.
In multi-graph search scenarios, the challenge lies in efficiently navigating individual graphs, while ensuring comprehensive exploration to avoid local minima. While searching multiple graphs independently yields higher recall, it incurs increased runtime due to redundant exploration efforts. To mitigate this, we devised a strategy to intelligently share state between searches, enabling informed traversal decisions based on global and local competitive thresholds. This approach involves maintaining shared global and local queues of distances to closest vectors, dynamically adapting search parameters based on the competitiveness of each graph's local search. By synchronizing information exchange and adjusting search strategies accordingly, we achieve significant improvements in search latency while preserving recall rates comparable to single-graph searches. The impact of these optimizations is evident in our benchmark results. In concurrent search and indexing scenarios, we notice up to 60% reduction in query latencies! Even for queries conducted outside of indexing operations, we observed notable speedups and a dramatic decrease in the number of vector operations required. These enhancements, integrated into Lucene 9.10 and subsequently Elasticsearch 8.13, mark significant strides towards enhancing vector database performance for search while maintaining excellent recall rates. Harnessing Java's latest advancements for ludicrous speed In the area of Java development, automatic vectorization has been a boon, optimizing scalar operations into SIMD (Single Instruction Multiple Data) instructions through the HotSpot C2 compiler. While this automatic optimization has been beneficial, it has its limitations, particularly in scenarios where explicit control over code shape yields superior performance. Enter Project Panama Vector API, a recent addition to the JDK offering an API for expressing computations reliably compiled to SIMD instructions at runtime. Lucene's vector search implementation relies on fundamental operations like dot product, square, and cosine distance, both in floating point and binary variants. Traditionally, these operations were backed by scalar implementations, leaving performance enhancements to the JIT compiler. However, recent advancements introduce a paradigm shift, enabling developers to express these operations explicitly for optimal performance. Consider the dot product operation, a fundamental vector computation. Traditionally implemented in Java with scalar arithmetic, recent innovations leverage the Panama Vector API to express dot product computations in a manner conducive to SIMD instructions. This revised implementation iterates over input arrays, multiplying and accumulating elements in batches, aligning with the underlying hardware capabilities. By harnessing Panama Vector API, Java code now interfaces seamlessly with SIMD instructions, unlocking the potential for significant performance gains. The compiled code, when executed on compatible CPUs, leverages advanced vector instructions like AVX2 or AVX 512, resulting in accelerated computations. Disassembling the compiled code reveals optimized instructions tailored to the underlying hardware architecture. Microbenchmarks comparing traditional Java implementations to those leveraging Panama Vector API illustrate dramatic performance improvements. 
Across various vector operations and dimension sizes, the optimized implementations outperform their predecessors by significant margins, offering a glimpse into the transformative power of SIMD instructions. Micro-benchmark comparing dot product with the new Panama API (dotProductNew) and the scalar implementation (dotProductOld). Beyond microbenchmarks, the real-world impact of these optimizations is quite exciting to think about. Vector search benchmarks, such as SO Vector, demonstrate notable enhancements in indexing throughput, merge times, and query latencies. Elasticsearch, embracing these advancements, incorporates the faster implementations by default, ensuring users reap the performance benefits seamlessly. The graph shows indexing throughput increased from about 900 documents/sec to about 1300 documents/sec. Despite the incubating status of Panama Vector API, its quality and potential benefits are undeniable. Lucene's pragmatic approach allows for selective adoption of non-final JDK APIs, balancing the promise of performance improvements with maintenance considerations. With Lucene and Elasticsearch, users can leverage these advancements effortlessly, with performance gains translating directly to real-world workloads. The integration of Panama Vector API into Java development yields a new era of performance optimization, particularly in vector search scenarios. By embracing hardware-accelerated SIMD instructions, developers can unlock efficiency gains, visible both in microbenchmarks and macro-level benchmarks. As Java continues to evolve, leveraging its latest features promises to propel performance to new heights, enriching user experiences across diverse applications. Maximizing memory efficiency with scalar quantization Memory consumption has long been a concern for efficient vector database operations, particularly for searching large datasets. Lucene introduces a breakthrough optimization technique - scalar quantization - aimed at significantly reducing memory requirements without sacrificing search performance. Consider a scenario where querying millions of float32 vectors of high dimensions demands substantial memory, leading to significant costs. By embracing byte quantization, Lucene slashes memory usage by approximately 75%, offering a viable solution to the memory-intensive nature of vector search operations. For quantizing floats to bytes, Lucene implements Scalar quantization a lossy compression technique that transforms raw data into a compressed form, sacrificing some information for space efficiency. Lucene's implementation of scalar quantization achieves remarkable space savings with minimal impact on recall, making it an ideal solution for memory-constrained environments. Lucene's architecture, consisting of nodes, shards, and segments, which facilitates efficient distribution and management of documents for search. Each segment stores raw vectors, quantized vectors, and metadata, ensuring optimized storage and retrieval mechanisms. Lucene's vector quantization adapts dynamically over time, adjusting quantiles during segment merge operations to maintain optimal recall. By intelligently handling quantization updates and re-quantization when necessary, Lucene ensures consistent performance while accommodating changes in data distribution. Example of merged quantiles where segments A and B have 1000 documents and C only has 100. Experimental results demonstrate the efficacy of scalar quantization in reducing memory footprint while maintaining search performance. 
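To make the general idea concrete, here is a small, self-contained sketch of min/max scalar quantization of float32 vectors down to single bytes; this is a toy illustration of the technique, not Lucene's implementation, which derives quantiles per segment and revisits them during merges.

```python
import numpy as np


def scalar_quantize(vectors: np.ndarray, lower_pct: float = 0.0, upper_pct: float = 100.0):
    """Toy scalar quantization: map float values onto the int8 range."""
    lo = np.percentile(vectors, lower_pct)   # lower quantile (min by default)
    hi = np.percentile(vectors, upper_pct)   # upper quantile (max by default)
    scale = 255.0 / (hi - lo)
    quantized = np.clip(np.round((vectors - lo) * scale) - 128, -128, 127).astype(np.int8)
    return quantized, lo, scale


rng = np.random.default_rng(0)
floats = rng.normal(size=(1000, 768)).astype(np.float32)

bytes_q, lo, scale = scalar_quantize(floats)
# float32 storage vs int8 storage: roughly a 4x (about 75%) reduction.
print(floats.nbytes, bytes_q.nbytes)
```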
Despite minor differences in recall compared to raw vectors, Lucene's quantized vectors offer significant speed improvements and recall recovery with minimal additional vectors. Recall@10 for quantized vectors vs raw vectors. The search performance of quantized vectors is significantly faster than raw, and recall is quickly recoverable by gathering just 5 more vectors; visible by quantized@15. Lucene's scalar quantization presents a revolutionary approach to memory optimization in vector search operations. With no need for training or optimization steps, Lucene seamlessly integrates quantization into its indexing process, automatically adapting to changes in data distribution over time. As Lucene and Elasticsearch continue to evolve, widespread adoption of scalar quantization will revolutionize memory efficiency for vector database applications, paving the way for enhanced search performance at scale. Achieving seamless compression with minimal impact on recall To make compression even better, we aimed to reduce each dimension from 7 bits to just 4 bits. Our main goal was to compress data further while still keeping search results accurate. By making some improvements, we managed to compress data by a factor of 8 without making search results worse. Here's how we did it. We focused on keeping search results accurate while making data smaller. By making sure we didn't lose important information during compression, we could still find things well even with less detailed data. To make sure we didn't lose any important information, we added a smart error correction system. We checked our compression improvements by testing them with different types of data and real search situations. This helped us see how well our searches worked with different compression levels and what we might lose in accuracy by compressing more. Comparison of int4 dot product values to the corresponding float values for a random sample of 100 documents and their 10 nearest neighbors. These compression features were created to easily work with existing vector search systems. They help organizations and users save space without needing to change much in their setup. With this simple compression, organizations can expand their search systems without wasting resources. In short, moving to 4 bits per dimension for scalar quantization was a big step in making compression more efficient. It lets users compress their original vectors by 8 times. By optimizing carefully, adding error correction, testing with real data, and offering scalable deployment, organizations could save a lot of storage space without making search results worse. This opens up new chances for efficient and scalable search applications. Paving the way for binary quantization The optimization to reduce each dimension to 4 bits not only delivers significant compression gains but also lays the groundwork for further advancements in compression efficiency. Specifically, future advancements like binary quantization into Lucene, a development that has the potential to revolutionize vector storage and retrieval. In an ongoing effort to push the boundaries of compression in vector search, we are actively working on integrating binary quantization into Lucene using the same techniques and principles that underpin our existing optimization strategies. The goal is to achieve binary quantization of vector dimensions, thereby reducing the size of the vector representation by a factor of 32 compared to the original floating-point format. 
Through our iterations and experiments, we want to deliver the full potential of vector search while maximizing resource utilization and scalability. Stay tuned for further updates on our progress towards integrating binary quantization into Lucene and Elasticsearch, and the transformative impact it will have on vector database storage and retrieval. Multi-vector integration in Lucene and Elasticsearch Several real world applications rely on text embedding models and large text inputs. Most embedding models have token limits, which necessitate chunking of longer text into passages. Therefore, instead of a single document, multiple passages and embeddings must be managed, potentially complicating metadata preservation. Now instead of having a single piece of metadata indicating, for example the first chapter of the book “Little Women”, you have to index that information data for every sentence. Lucene's \"join\" functionality, integral to Elasticsearch's nested field type, offers a solution. This feature enables multiple nested documents within a top-level document, allowing searches across nested documents and subsequent joins with their parent documents. So, how do we deliver support for vectors in nested fields with Elasticsearch? The key lies in how Lucene joins back to parent documents when searching child vector passages. The parallel concept here is the debate around pre-filtering versus post-filtering in kNN methods, as the timing of joining significantly impacts result quality and quantity. To address this, recent enhancements to Lucene enable pre-joining against parent documents while searching the HNSW graph. Practically, pre-joining ensures that when retrieving the k nearest neighbors of a query vector, the algorithm returns the k nearest documents instead of passages. This approach diversifies results without complicating the HNSW algorithm, requiring only a minimal additional memory overhead per stored vector. Efficiency is improved by leveraging certain restrictions, such as disjoint sets of parent and child documents and the monotonicity of document IDs. These restrictions allow for optimizations using bit sets, providing rapid identification of parent document IDs. Searching through a vast number of documents efficiently required investing in nested fields and joins in Lucene. This work helps storage and search for dense vectors that represent passages within long texts, making document searches in Lucene more effective. Overall, these advancements represent an exciting step forward in the area of vector database retrieval within Lucene. Wrapping up (for now) We're dedicated to making Elasticsearch and Lucene the best vector database with every release. Our goal is to make it easier for people to search for things. With some of the investments we discuss in this blog, there is significant progress, but we're not done! To say that the gen AI ecosystem is rapidly evolving is an understatement. At Elastic, we want to give developers the most flexible and open tools to keep up with all the innovation—with features available across recent releases until 8.13 and serverless Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Elasticsearch and Lucene report card: noteworthy speed and efficiency investments Harnessing Lucene's architecture for multi-threaded search Accelerating multi-graph vector search Harnessing Java's latest advancements for ludicrous speed Maximizing memory efficiency with scalar quantization Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Making Elasticsearch and Lucene the best vector database: up to 8x faster and 32x efficient - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-lucene-vector-database-gains","meta_description":"Discover the recent enhancements and optimizations that notably improve vector search performance in Elasticsearch & Lucene vector database."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. Vector Database US By: Ugo Sangiorgi On April 15, 2025 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search with binary quantization: Elasticsearch with BBQ is 5x faster than OpenSearch with FAISS . Elastic has received requests from our community to clarify performance differences between Elasticsearch and OpenSearch, particularly in the realm of Semantic Search/Vector Search, so we conducted these performance tests to provide clear, data-driven comparisons. 
Binary quantization showdown Storing high-dimensional vectors in their original form can be memory-intensive. Quantization techniques compress these vectors into a compact representation, drastically reducing the memory footprint. The search then operates in the compressed space, which reduces the computational complexity and makes searches faster, especially in large datasets. Elastic is committed to making Lucene a top-performing Vector Engine. We introduced Better Binary Quantization (BBQ) in Elasticsearch 8.16 on top of Lucene and evolved it further in 8.18 and 9.0. BBQ is built on a new approach in scalar quantization that reduces float32 dimensions to bits, delivering ~95% memory reduction while maintaining high ranking quality. OpenSearch on the other hand uses multiple vector engines: nmslib (now deprecated), Lucene and FAISS. In a previous blog , we compared Elasticsearch and OpenSearch for vector search. We used three different datasets and tested different combinations of engines and configurations on both products. This blog focuses on the binary quantization algorithms currently available in both products. We tested Elasticsearch with BBQ and OpenSearch with FAISS’s Binary Quantization using the openai_vector Rally track. The main objective was to evaluate the performance of both solutions under the same level of recall. What does recall mean? Recall is a metric that measures how many of the relevant results are successfully retrieved by a search system. In this evaluation, recall@k is particularly important, where k represents the number of top results considered. Recall@10 , Recall@50 and Recall@100 therefore measure how many of the true relevant results appear in the top 10, 50 and 100 retrieved items, respectively. Recall is expressed on a scale from 0 to 1 (or 0% to 100% precision). And that is important because we are talking about Approximate KNN (ANN) and not Exact KNN, where recall is always 1 (100%). For each value of k we also specified n, which is the number of candidates considered before applying the final ranking. This means that for Recall@10, Recall@50, and Recall@100, the system first retrieves n candidates using the binary quantization algorithm and then ranks them to determine whether the top k results contain the expected relevant items. By controlling n , we can analyze the trade-off between efficiency and accuracy. A higher n typically increases recall, as more candidates are available for ranking, but it also increases latency and decreases throughput. Conversely, a lower n speeds up retrieval but may reduce recall if too few relevant candidates are included in the initial set. In this comparison, Elasticsearch demonstrated lower latency and higher throughput than OpenSearch on identical setups. Methodology The full configuration, alongside Terraform scripts, Kubernetes manifests and the specific Rally track is available in this repository under openai_vector_bq . As with previous benchmarks, we used a Kubernetes cluster composed of: 1 Node pool for Elasticsearch 9.0 with 3 e2-standard-32 machines (128GB RAM and 32 CPUs) 1 Node pool for OpenSearch 2.19 with 3 e2-standard-32 machines (128GB RAM and 32 CPUs) 1 Node pool for Rally with 2 e2-standard-4 machines (16GB RAM and 4 CPUs) We set up one Elasticsearch cluster version 9.0 and one OpenSearch cluster version 2.19. 
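As a side note, recall@k itself is simple to compute once you have the exact (brute-force) top-k neighbors for each query; a minimal sketch of the metric used throughout these results:

```javascript
// recall@k: the fraction of the true (exact kNN) top-k neighbors that the
// approximate search actually returned. Both arguments are arrays of doc IDs.
function recallAtK(exactTopK, annTopK, k) {
  const truth = new Set(exactTopK.slice(0, k));
  const hits = annTopK.slice(0, k).filter((id) => truth.has(id)).length;
  return hits / k;
}

console.log(recallAtK(['d1', 'd2', 'd3', 'd4'], ['d2', 'd9', 'd1', 'd4'], 4)); // 0.75
```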
Both Elasticsearch and OpenSearch were tested with the exact same setup: we used openai_vector Rally track with some modifications - which uses 2.5 million documents from the NQ data set enriched with embeddings generated using OpenAI's text-embedding-ada-002 model . The results report on measured latency and throughput at different recall levels (recall@10, recall@50 and recall@100) using 8 simultaneous clients for performing search operations. We used a single shard and no replicas. We ran the following combinations of k-n-rescore, e.g. 10-2000-2000, or k:10 , n:2000 and rescore:2000 would retrieve the top k (10) over n candidates (2000) applying a rescore over 2000 results (which is equivalent of an “oversample factor” of 1). Each search ran for 10.000 times with 1000 searches as warmup: Recall@10 10-40-40 10-50-50 10-100-100 10-200-200 10-500-500 10-750-750 10-1000-1000 10-1500-1500 10-2000-2000 Recall@50 50-150-150 50-200-200 50-250-250 50-500-500 50-750-750 50-1000-1000 50-1200-1200 50-1500-1500 50-2000-2000 Recall@100 100-200-200 100-250-250 100-300-300 100-500-500 100-750-750 100-1000-1000 100-1200-1200 100-1500-1500 100-2000-2000 To replicate the benchmark, the Kubernetes manifests for both rally-elasticsearch and rally-opensearch have all the relevant variables externalized in a ConfigMap, available here (ES) and here (OS). The search_ops parameter can be customized to test any combination of k, n and rescore. OpenSearch Rally configuration /k8s/rally-openai_vector-os-bq.yml Opensearch index configuration The variables from the ConfigMap are then used on the index configuration, some parameters are left unchanged. 1-bit quantization in OpenSearch is configured by setting the compression level to “32x” . index-vectors-only-mapping-with-docid-mapping.json Elasticsearch Rally configuration /k8s/rally-openai_vector-es-bq.yml Elasticsearch index configuration index-vectors-only-mapping-with-docid-mapping.json Results There are multiple ways to interpret the results. For both latency and throughput, we plotted a simplified and a detailed chart at each level of recall. It’s easy to see differences if we consider “higher is better” for each metric. However, latency is a negative one (lower is actually better), while throughput is a positive one. For the simplified charts, we used (recall / latency) * 10000 (called simply “speed”) and recall * throughput , so both metrics mean more speed and more throughput are better. Let’s get to it. Recall @ 10 - simplified At that level of recall Elasticsearch BBQ is up to 5x faster (3.9x faster on average) and has 3.2x more throughput on average than OpenSearch FAISS. 
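Before digging into the detailed numbers, here is roughly what the Elasticsearch side of such a setup can look like with the JavaScript client. This is an illustrative sketch, not the exact Rally track configuration: the index name, dimensions, and the BBQ index option should be verified against the Elasticsearch version you run, and the rescoring step used in the benchmark is omitted.

```javascript
import { Client } from '@elastic/elasticsearch';

// Illustrative sketch only — index name, dims and option names are assumptions
// to verify against the Elasticsearch version in use; the benchmark's extra
// rescoring step is omitted here.
const client = new Client({
  node: process.env.ES_URL,
  auth: { apiKey: process.env.ES_API_KEY },
});

await client.indices.create({
  index: 'openai_vector_bq',
  mappings: {
    properties: {
      docid: { type: 'keyword' },
      emb: {
        type: 'dense_vector',
        dims: 1536, // text-embedding-ada-002 output size
        index: true,
        similarity: 'dot_product',
        index_options: { type: 'bbq_hnsw' }, // BBQ-backed HNSW index
      },
    },
  },
});

// A "10-100" style search: top k=10 out of 100 quantized candidates.
const queryEmbedding = await embed('example query'); // hypothetical embedding helper
const res = await client.search({
  index: 'openai_vector_bq',
  knn: { field: 'emb', query_vector: queryEmbedding, k: 10, num_candidates: 100 },
});
console.log(res.hits.hits.map((h) => h._id));
```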
Recall @ 10 - Detailed task latency.mean throughput.mean avg_recall Elasticsearch-9.0-BBQ 10-100-100 11.70 513.58 0.89 Elasticsearch-9.0-BBQ 10-1000-100 27.33 250.55 0.95 Elasticsearch-9.0-BBQ 10-1500-1500 35.93 197.26 0.95 Elasticsearch-9.0-BBQ 10-200-200 13.33 456.16 0.92 Elasticsearch-9.0-BBQ 10-2000-2000 44.27 161.40 0.95 Elasticsearch-9.0-BBQ 10-40-40 10.97 539.94 0.84 Elasticsearch-9.0-BBQ 10-50-50 11.00 535.73 0.85 Elasticsearch-9.0-BBQ 10-500-500 19.52 341.45 0.93 Elasticsearch-9.0-BBQ 10-750-750 22.94 295.19 0.94 OpenSearch-2.19-faiss 10-100-100 35.59 200.61 0.94 OpenSearch-2.19-faiss 10-1000-1000 156.81 58.30 0.96 OpenSearch-2.19-faiss 10-1500-1500 181.79 42.97 0.96 OpenSearch-2.19-faiss 10-200-200 47.91 155.16 0.95 OpenSearch-2.19-faiss 10-2000-2000 232.14 31.84 0.96 OpenSearch-2.19-faiss 10-40-40 27.55 249.25 0.92 OpenSearch-2.19-faiss 10-50-50 28.78 245.14 0.92 OpenSearch-2.19-faiss 10-500-500 79.44 97.06 0.96 OpenSearch-2.19-faiss 10-750-750 104.19 75.49 0.96 Recall @ 50 - simplified At that level of recall Elasticsearch BBQ is up to 5x faster (4.2x faster on average) and has 3.9x more throughput on average than OpenSearch FAISS. Detailed Results - Recall @ 50 Task Latency Mean Throughput Mean Avg Recall Elasticsearch-9.0-BBQ 50-1000-1000 25.71 246.44 0.95 Elasticsearch-9.0-BBQ 50-1200-1200 28.81 227.85 0.95 Elasticsearch-9.0-BBQ 50-150-150 13.43 362.90 0.90 Elasticsearch-9.0-BBQ 50-1500-1500 33.38 202.37 0.95 Elasticsearch-9.0-BBQ 50-200-200 12.99 406.30 0.91 Elasticsearch-9.0-BBQ 50-2000-2000 42.63 163.68 0.95 Elasticsearch-9.0-BBQ 50-250-250 14.41 373.21 0.92 Elasticsearch-9.0-BBQ 50-500-500 17.15 341.04 0.93 Elasticsearch-9.0-BBQ 50-750-750 31.25 248.60 0.94 OpenSearch-2.19-faiss 50-1000-1000 125.35 62.53 0.96 OpenSearch-2.19-faiss 50-1200-1200 143.87 54.75 0.96 OpenSearch-2.19-faiss 50-150-150 43.64 130.01 0.89 OpenSearch-2.19-faiss 50-1500-1500 169.45 46.35 0.96 OpenSearch-2.19-faiss 50-200-200 48.05 156.07 0.91 OpenSearch-2.19-faiss 50-2000-2000 216.73 36.38 0.96 OpenSearch-2.19-faiss 50-250-250 53.52 142.44 0.93 OpenSearch-2.19-faiss 50-500-500 78.98 97.82 0.95 OpenSearch-2.19-faiss 50-750-750 103.20 75.86 0.96 Recall @ 100 At that level of recall Elasticsearch BBQ is up to 5x faster (average 4.6x faster) and has 3.9x more throughput on average than OpenSearch FAISS. Detailed Results - Recall @ 100 task latency.mean throughput.mean avg_recall Elasticsearch-9.0-BBQ 100-1000-1000 27.82 243.22 0.95 Elasticsearch-9.0-BBQ 100-1200-1200 31.14 224.04 0.95 Elasticsearch-9.0-BBQ 100-1500-1500 35.98 193.99 0.95 Elasticsearch-9.0-BBQ 100-200-200 14.18 403.86 0.88 Elasticsearch-9.0-BBQ 100-2000-2000 45.36 159.88 0.95 Elasticsearch-9.0-BBQ 100-250-250 14.77 433.06 0.90 Elasticsearch-9.0-BBQ 100-300-300 14.61 375.54 0.91 Elasticsearch-9.0-BBQ 100-500-500 18.88 340.37 0.93 Elasticsearch-9.0-BBQ 100-750-750 23.59 285.79 0.94 OpenSearch-2.19-faiss 100-1000-1000 142.90 58.48 0.95 OpenSearch-2.19-faiss 100-1200-1200 153.03 51.04 0.95 OpenSearch-2.19-faiss 100-1500-1500 181.79 43.20 0.96 OpenSearch-2.19-faiss 100-200-200 50.94 131.62 0.83 OpenSearch-2.19-faiss 100-2000-2000 232.53 33.67 0.96 OpenSearch-2.19-faiss 100-250-250 57.08 131.23 0.87 OpenSearch-2.19-faiss 100-300-300 62.76 120.10 0.89 OpenSearch-2.19-faiss 100-500-500 84.36 91.54 0.93 OpenSearch-2.19-faiss 100-750-750 111.33 69.95 0.94 Improvements on BBQ BBQ has come a long way since its first release. 
On Elasticsearch 8.16, for the sake of comparison, we included a benchmark run from 8.16 alongside the current one, and we can see how recall and latency have improved since then. In Elasticsearch 8.18 and 9.0, we rewrote the core algorithm for quantizing the vectors. So, while BBQ in 8.16 was good, the newest versions are even better. You can read about it here and here . In short, every vector is individually quantized through optimized scalar quantiles. As a result, users benefit from higher accuracy in vector search without compromising performance, making Elasticsearch’s vector retrieval even more powerful. Conclusion In this performance comparison between Elasticsearch BBQ and OpenSearch FAISS, Elasticsearch significantly outperforms OpenSearch for vector search, achieving up to 5x faster query speeds and 3.9x higher throughput on average across various levels of recall. Key findings include: Recall@10 : Elasticsearch BBQ is up to 5x faster (3.9x faster on average) and has 3.2x more throughput on average compared to OpenSearch FAISS. Recall@50 : Elasticsearch BBQ is up to 5x faster (4.2x faster on average) and has 3.9x more throughput on average compared to OpenSearch FAISS. Recall@100 : Elasticsearch BBQ is up to 5x faster (4.6x faster on average) and has 3.9x more throughput on average compared to OpenSearch FAISS. These results highlight the efficiency and performance advantages of Elasticsearch BBQ, particularly in high-dimensional vector search scenarios. The Better Binary Quantization (BBQ) technique, introduced in Elasticsearch 8.16, provides substantial memory reduction (~95%) while maintaining high ranking quality, making it a superior choice for large-scale vector search applications. At Elastic, we are relentlessly innovating to improve Apache Lucene and Elasticsearch to provide the best vector database for search and retrieval use cases, including RAG (Retrieval Augmented Generation). Our recent advancements have dramatically increased performance, making vector search faster and more space efficient than before, building upon the gains from Lucene 10. This blog is another illustration of that innovation. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. 
CH HM By: Chris Hegarty and Hemant Malik Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Jump to Binary quantization showdown Methodology OpenSearch Rally configuration Opensearch index configuration Elasticsearch Rally configuration Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-bbq-vs-opensearch-faiss","meta_description":"A performance comparison between Elasticsearch BBQ and OpenSearch FAISS."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Automatically updating your Elasticsearch index using Node.js and an Azure Function App Learn how to update your Elasticsearch index automatically using Node.js and an Azure Function App. Follow these steps to ensure your index stays current. Javascript Python How To JG By: Jessica Garson On June 4, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Maintaining an up-to-date Elasticsearch index is crucial, especially when dealing with frequently changing dynamic datasets. This blog post will guide you through automatically updating your Elasticsearch index using Node.js and an Azure Function App. First, we'll load the data using Node.js and ensure it remains current through regular updates. Then, we'll leverage the capabilities of Azure Function Apps to automate these updates, thereby ensuring your index is always fresh and reliable. For this blog post, we will be using the Near Earth Object Web Service (NeoWs ), a RESTful web service offering detailed information about near-earth asteroids. By integrating NeoWs with Node.js services integrated as Azure serverless functions, this example will provide you with a robust framework to handle the complexities of managing dynamic data effectively. This approach will help you minimize the risks of working with outdated information and maximize the accuracy and usefulness of your data. Prerequisites This example uses Elasticsearch version 8.13; if you are new to Elasticsearch, check out our Quick Start on Elasticsearch . Any 8.0 version should work for this blog post. 
Download the latest NPM and Node.js version . This tutorial uses Node v21.6.1 and npm 10.5.0. An API key for NASA's APIs. An active Azure account with access to create a Function App. Access to the Azure portal or Azure CLI Setting up locally Before you begin indexing and loading your data locally, setting up your environment is essential. First, create a directory and initialize it. Then, download the necessary packages and create a .env file to store your configuration settings. This preliminary setup ensures your local environment is prepared to handle the data efficiently. You will be using the Elasticsearch node client to connect to Elastic, Axios to connect to the NASA APIs and dotenv to parse your secrets. You will want to download the required packages running the following commands: After downloading the required packages, you can create a . env file at the root of the project directory. The . env file allows you to keep your credentials secure locally. Check out the example .env file to learn more. To learn more about connecting to Elasticsearch, be sure to take a look at the documentation on the subject . To create a .env file, you can use this command at the root of your project: In your .env , be sure to have the following entered in. Be sure to add your complete endpoint: You will also want to create a new JavaScript file as well: Creating your index and loading your data in Now that you have set up the proper file structure and downloaded the required packages, you are ready to create a script that creates an index and loads data into the index. If you get stuck along the way be sure to check out the full version of the file you are creating in this section. In the file loading_data_into_a_index.js, configure the dotenv package to use the keys and tokens stored in your . env file. You should also import the Elasticsearch client to connect to Elasticsearch and Axios and make HTTP requests. Since your keys and tokens are currently stored as environment variables, you will want to retrieve them and create a client to authenticate to Elasticsearch. You can develop a function to retrieve data from NASA's NEO (Near Earth Object) Web Service asynchronously. You will first configure the base URL for the NASA API request and create date objects for today and the previous week to establish the query period. After you format these dates in the YYYY-MM-DD format required for the API request, set up the dates as query parameters and execute the GET request to the NASA API. Additionally, the function includes error-handling mechanisms to aid debugging should any issues arise. Now, you can create a function to transform the raw data from the NASA API into a structured format. Since the data you get back is currently nested in a complex JSON response. A more straightforward array of objects makes handling data easier. You will want to create an index to store the data from the API. An index inside Elasticsearch is where you can store your data in documents. In this function, you will check to see if an index exists and create a new one if needed. You will also specify the proper mapping of fields for your index. This function also loads the data into the index as documents and maps the id field from the NASA data to the _id field in Elasticsearch. You will want to create a main function to fetch, structure, and index the data. 
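Pulling those steps together, a condensed sketch of such a script might look like the following. It is not the blog's exact code: the index name, selected fields, and helper structure are assumptions, and the full example in the repository also defines an explicit mapping before indexing.

```javascript
// Condensed, illustrative version of the loading script — not the blog's exact
// code. Index name, selected fields, and helper structure are assumptions.
require('dotenv').config();
const axios = require('axios');
const { Client } = require('@elastic/elasticsearch');

const client = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});

async function fetchNeoData() {
  const end = new Date();
  const start = new Date(end.getTime() - 7 * 24 * 60 * 60 * 1000); // previous week
  const fmt = (d) => d.toISOString().slice(0, 10); // YYYY-MM-DD
  const { data } = await axios.get('https://api.nasa.gov/neo/rest/v1/feed', {
    params: { start_date: fmt(start), end_date: fmt(end), api_key: process.env.NASA_API_KEY },
  });
  // The feed nests objects by date; flatten into a simple array.
  return Object.values(data.near_earth_objects).flat();
}

async function indexData(neos) {
  const operations = neos.flatMap((neo) => [
    { index: { _index: 'asteroid_data_set', _id: neo.id } },
    { name: neo.name, absolute_magnitude_h: neo.absolute_magnitude_h },
  ]);
  const response = await client.bulk({ refresh: true, operations });
  console.log(`Indexed ${neos.length} records, errors: ${response.errors}`);
}

fetchNeoData()
  .then((neos) => (neos.length ? indexData(neos) : console.log('No data to index')))
  .catch((err) => console.error('Failed to fetch data from the NASA API', err));
```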
This function will also print out the number of records being uploaded and log whether the data is indexed, whether there is no data to index, or whether it failed to get data back from the NASA API. After creating the run function, you will want to call the function and catch any errors that may come up. You can now run the file from your command line by running the following: To confirm that your index has been successfully loaded, you can check in the Elastic Dev Tools by executing the following API call: Keeping your index updated with an Azure Function App Now that you've successfully loaded your data into your index locally, this data can quickly become outdated. To ensure your information remains current, you can set up an Azure Function App to automatically fetch new data daily and upload it to your Elasticsearch index. The first step is to configure your Function app in Azure Portal. A helpful resource for getting started is the Azure quick start guide . After you've set up your function, you can ensure that you have environment variables set up for ELASTICSEARCH_ENDPOINT , ELASTICSEARCH_API_KEY , and NASA_API_KEY . In Function Apps, environment variables are called Application settings. Inside your function app, click on the \"Configuration\" option in the left panel under \"Settings.\" Under\" the \"Application settings\" tab, click on \"+ New application setting.\" You will want to make sure the required libraries are installed as well. If you go to your terminal on the Azure Portal, you can install the necessary packages by entering the following: The packages you are installing should look very similar to the previous install, except you will be using the moment to parse dates, and you no longer need to load an env file since you just set your secrets to be Application settings. You can click where it says create to create a new function inside your Function App select the template entitled “Timer trigger”. You will now have a file called function.json set for you. You will want to adjust it to look as follows to run this application every day at 10 am. You'll also want to upload your package.json file and ensure it appears as follows: The next step is to create a index.js file. This script is designed to automatically update the data daily. It accomplishes this by systematically fetching and parsing new data each day and then seamlessly updating the dataset accordingly. Elasticsearch can use the same method to ingest time series or immutable data, such as webhook responses. This method ensures the information remains current and accurate, reflecting the latest available data.You can can check out the full code as well. The main differences between the script you run locally and this one are as follows: You will no longer need to load a .env file, since you have already set your environment variables There is also different logging designed more towards creating a more sustainable script You keep your index updated based on the most recent close approach date There is an entry point for an Azure Function App You will first want to set up your libraries and authenticate to Elasticsearch as follows: Afterward, you will want to obtain the last date update date from Elasticsearch and configure a backup method to get data from the past day if anything goes wrong. The following function connects to NASA's NEO (Near Earth Object) Web Service to get the data to keep your index updated. There is also some additional error handling that can capture any API errors that might come up. 
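To give a sense of how these pieces hang together in the Function App, here is a stripped-down sketch of the timer-triggered entry point using the Node.js v3 programming model. The helpers are hypothetical stand-ins for the functions described in this section, and the schedule bound in function.json would be an NCRONTAB expression such as "0 0 10 * * *" to run every day at 10 am.

```javascript
// index.js — stripped-down sketch of the timer-triggered function (Node.js v3
// programming model). The helpers below are hypothetical stand-ins for the
// logic described in this section.
module.exports = async function (context, myTimer) {
  try {
    const lastDate = await getLastCloseApproachDate(); // hypothetical: query Elasticsearch for the most recent date
    const neos = await fetchNeoDataSince(lastDate);    // hypothetical: call the NASA NeoWs feed
    if (!neos.length) {
      context.log('No new near-earth objects to index.');
      return;
    }
    await bulkIndex(neos);                              // hypothetical: bulk upload to Elasticsearch
    context.log(`Indexed ${neos.length} records.`);
  } catch (err) {
    context.log.error('Failed to update the index', err);
    throw err; // let the Functions runtime record the failure
  }
};
```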
Now, you will want to create a function to organize your data by iterating over the objects of each date. Now, you will want to load your data into Elasticsearch using the bulk indexing operation. This function should look similar to the one in the previous section. Finally, you will want to create an entry point for the function that will run according to the timer you set. This function is similar to a main function, as it calls the functions created previously in the file. There is also some additional logging, such as printing the number of records and informing you if the data was indexed correctly. Conclusion Using Node.js and Azure's Function App , you should be able to ensure that your Elasticsearch index is updated regularly. By utilizing Node.js's capabilities in conjunction with Azure's Function App, you can efficiently maintain your index's regular updates. This powerful combination offers a streamlined, automated process, reducing the manual effort involved in keeping your index regularly updated. Full code for this example can be found on Search Labs GitHub . Let us know if you built anything based on this blog or if you have questions on our forums and the community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Prerequisites Setting up locally Creating your index and loading your data in Keeping your index updated with an Azure Function App Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Automatically updating your Elasticsearch index using Node.js and an Azure Function App - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-index-node-js-automatic-updates","meta_description":"Learn how to update your Elasticsearch index automatically using Node.js and an Azure Function App. Follow these steps to ensure your index stays current."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog GenAI for Customer Support — Part 2: Building a Knowledge Library This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! Inside Elastic CM By: Cory Mangini On July 22, 2024 Part of Series GenAI for customer support Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog series reveals how our Field Engineering team used the Elastic stack with generative AI to develop a lovable and effective customer support chatbot. If you missed other installments in the series, be sure to check out part one , part three , part four , the launch blog , and part five . Retrieval-Augmented Generation (RAG) Over a Fine Tuned Model As an engineering team, we knew that Elastic customers would need to trust a generative AI-based Support Assistant to provide accurate and relevant answers. Our initial proof of concept showed that large language model (LLM) foundational training was insufficient on technologies as technically deep and broad as Elastic. We explored fine-tuning our own model for the Support Assistant and instead landed on an RAG-based approach for several reasons. Easier with unstructured data: Fine-tuning required question-answer pairing that did not match our data set and would be challenging to do at the scale of our data. Real-time updates: Immediately incorporates new information by accessing up-to-date documents, ensuring current and relevant responses. Role-Based Access Control: A single user experience across roles restricts specific documents or sources based on the allowed level of access. Less maintenance: Search on the Support Hub and the Support Assistant share much of the same underlying infrastructure. We improved search results and a chatbot from the same work effort. Understanding the Support Assistant as a Search Problem We then formed a hypothesis that drove the technical development and testing of the Support Assistant. Providing more concise and relevant search results as context for the LLM will lead to stronger positive user sentiment by minimizing the chance of the model using unrelated information to answer the user's question. In order to test our team hypothesis, we had to reframe our understanding of chatbots in the context of search. Think of a support chatbot as a librarian. The librarian has access to an extensive pool of books (via search) and innately knows (the LLM) a bit about a broad range of topics. When asked a question, the librarian might be able to answer from their own knowledge but may need to find the appropriate book(s) to address questions about deep domain knowledge. Search extends the ”librarian” ability to find passages within the book in ways that have never existed before. The Dewey Decimal Classification enabled a searchable index of books. Personal computers evolved into a better catalog with some limited text search. 
RAG via Elasticsearch + Lucene enables the ability to not only find key passages within books across an entire library, but also to synthesize an answer for the user, often in less than a minute. The system is infinitely scalable as by adding more books to the library, the chances are stronger of having the required knowledge to answer a given question. The phrasing of the user input, prompts and settings like temperature (degree of randomness) still matter but we found that we can use search as a way to understand user intent and better augment the context passed on to the large language model for higher user satisfaction. Elastic Support’s Knowledge Library The body of knowledge that we draw from for both search and the Elastic Support Assistant depends on three key activities: the technical support articles our Support Engineers author, the product documentation and blogs that we ingest, and the enrichment service, which increases the search relevancy for each document in our hybrid search approach. It is also important to note that the answers to user questions often come from specific passages across multiple documents. This is a significant driver for why we chose to offer the Support Assistant. The effort for a user to find an answer in a specific paragraph across multiple documents is substantial. By extracting this information and sending it to the large language model, we both save the user time and return an answer in natural language that is easy to understand Technical Support Articles Elastic Support follows a knowledge-centered service approach where Support Engineers document their solutions and insights to cases so that our knowledge base (both internal and external) grows daily. This is run entirely on Elasticsearch on the back end and the EUI Markdown Editor control on the front end and is one of the key information sources for the Elastic Support Assistant. The majority of our work for the Support Assistant was to enable semantic search so we could take advantage of ELSER . Our prior architecture had two separate storage methods for knowledge articles. One was a Swiftype instance used for customer facing articles and the other was through Elastic Appsearch for internal Support team articles. This was tech debt on the part of our engineering team as Elasticsearch had already brought parity to several of the features we needed. Our new architecture takes advantage of document level security to enable role-based access from a single index source. Depending on the Elastic Support Assistant user, we could retrieve documents appropriate for them to use as part of the context sent to OpenAI. This can scale in the future to new roles as required. At times, we also find a need to annotate external articles with information for the Support team. To accommodate this, we developed an EUI plugin called private context. This finds multiline private context tags within the article that begin a block of text, parses them using regex to find private context blocks and then processes them as special things called AST nodes, of type privateContext . The result of these changes resulted in an index that we could use with ELSER for semantic search and nuanced access to information based on the role of the user. We performed multiple index migrations, resulting in a single document structure for both use cases. Each document contains four broad categories of fields. By storing the metadata and article content in the same JSON document, we can efficiently leverage different fields as needed. 
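As an illustration of that single-document structure, a knowledge article might be stored along the following lines; the field names here are representative rather than the team's exact production schema.

```javascript
// Representative shape of a knowledge-library document; field names are
// illustrative, not the exact production schema.
const article = {
  // 1. Metadata used for filtering and role-based access
  id: 'kb-001234',
  source: 'technical-support-article',
  confidentiality: 'public', // document-level security decides who can read it
  product_versions: ['8.13', '8.14'],

  // 2. Fields used for semantic search (ELSER embeddings are derived from these)
  title: 'How to install and run the support diagnostics troubleshooting utility',
  summary: 'Step-by-step instructions for collecting a support diagnostics bundle…',

  // 3. Full body, searched with BM25 in the hybrid approach
  content: 'The support diagnostics utility gathers cluster and node level data…',

  // 4. Enrichment output (AI-generated questions, summaries, expansions)
  questions: ['How do I run the support diagnostics tool?'],
};
```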
For our hybrid search approach in the Support Assistant, we use the title and summary fields for semantic search with BM25 on the much larger content field. This enables the Support Assistant to have both speed and high relevance to the text we will pass as context to the OpenAI GPT. Ingesting Product Documentation, Blogs & Search Labs Content Even though our technical support knowledge base has over 2,800 articles, we knew that there would be questions that these would not answer for users of the Elastic Support Assistant. For example: What new features would be available if I upgraded from Elastic Cloud 8.11 to 8.14? wouldn’t be present in technical support articles since it’s not a break-fix question or in the OpenAI model since 8.14 is past the model training date cutoff. We elected to address this by including more official Elastic sources, such as product documentation across all versions, Elastic blogs, Search/Security/Observability Labs and Elastic onboarding guides as the source for our semantic search implementation, similar to this example . By using semantic search to retrieve these docs when they were relevant, we enabled the Support Assistant to answer a much broader range of questions. The ingest process includes several hundred thousand documents and deals with complex site maps across Elastic properties. We elected to use a scraping and automation library called Crawlee in order to handle the scale and frequency needed to keep our knowledge library up to date. Each of the four crawler jobs executes on Google Cloud Run . We chose this because jobs can have a timeout of 24 hours and they can be scheduled without the use of Cloud Tasks or PubSub. Our needs resulted in a total of four jobs running in parallel, each with a base url that would capture a specific category of documents. When crawling websites we recommend starting with base urls that do not have overlapping content so as to avoid the ingestion of duplicates. This must be balanced with crawling at too high of a level and ingesting documents that aren't helpful to your knowledge store. For example, we crawl https://elastic.com/blog and https://www.elastic.co/search-labs/blog rather than elastic.co/ since our objective was technical documents. Even with the correct base url, we needed to account for different versions of the Elastic product docs (we have 114 unique versions across major/minors in our knowledge library). First, we built the table of contents for a product page in order to load and cache the different versions of the product. Our tech stack is a combination of Typescript with Node.js and Elastic's EUI for the front end components. We then load the table of contents for a product page and cache the versions of the product. If the product versions are already cached, then the function will do nothing. If the product versions are not cached, then the function will also enqueue all of the versions of the product page so that it can crawl all versions of the docs for the product. Request Handlers Since the structure of the documents we crawl can vary widely, we created a request handler for each document type. The request handler tells the crawler which CSS to parse as the body of the document. This creates consistency in the documents we store in Elasticsearch and captures the text that would be relevant. This is especially important for our RAG methodology as any filler text would also be searchable and could be returned incorrectly as a result for the context we send to the LLM. 
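In broad strokes, a request handler built with Crawlee looks like the sketch below; the CSS selectors and the start URL are placeholders rather than the exact ones used for Elastic's properties.

```javascript
import { CheerioCrawler, Dataset } from 'crawlee';

// Illustrative sketch of a Crawlee request handler — selectors and start URL
// are placeholders, not Elastic's actual configuration.
const crawler = new CheerioCrawler({
  async requestHandler({ request, $, enqueueLinks, log }) {
    const title = $('h1').first().text().trim();
    const content = $('div.blog-content').text().trim(); // selector varies per document type
    if (content) {
      await Dataset.pushData({ url: request.url, title, content });
    } else {
      log.warning(`No content extracted from ${request.url}`);
    }
    await enqueueLinks(); // follow links discovered on the page
  },
});

await crawler.run(['https://www.elastic.co/blog']);
```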
Blogs Request Handler This example is our most straightforward request handler. We specify that the crawler should look for a div element that matches the provided parameter. Any text within that div will be ingested as the content for the resulting Elasticsearch document. Product Documentation Request Handler In this product docs example, multiple css selectors contain text we want to ingest, giving each selector a list of possibilities. Text in one or more of these matching parameters will be included in the resulting document. The crawler also allows us to configure and send an authorization header, which prevents it from being denied access to scrape pages of all Elastic product documentation versions. Since we needed to anticipate that users of the Support Assistant might ask about any version of Elastic, it was crucial to capture enough documentation to account for nuances in each release. The product docs do have some duplication of content as a given page may not change across multiple product versions. We handled this by fine-tuning our search queries to default to the most current product docs unless otherwise specified by the user. The fourth blog will cover this in detail. Enriching Document Sources Our entire knowledge library at Elastic consists of over 300,000 documents. The documents varied widely in the type of metadata they had, if any at all. This created a need for us to enrich these documents so search would accommodate a larger range of user questions against them. At this scale, the team needed the process of enriching documents to be automated, simple and able to both backfill existing documents and to run on demand as new documents are created. We chose to use Elastic as a vector database and enable ELSER to power our semantic search – and generative ai to fill in the metadata gaps. ELSER Elastic ELSER (Elastic Learned Sparse Embedding Retrieval) enriches Elastic documents by transforming them into enriched embeddings that enhance search relevance and accuracy. This advanced embedding mechanism leverages machine learning to understand the contextual relationships within the data, going beyond traditional keyword-based search methods. This transformation allows for faster retrieval of pertinent information, even from large and complex datasets such as ours. What made ELSER a clear choice for our team was the ease of setup. We downloaded and deployed the model, created an ingest pipeline and reindexed our data. The result were enriched documents. How to install and run the support diagnostics troubleshooting utility is a popular technical support article. ELSER computed the vector database embeddings for both the title and summary since we use those with semantic search as part of our hybrid search approach. The result was stored in an Elastic doc as the ml field. Vector Embeddings for How to Install and Run… The embeddings in the ml field are stored as a keyword and vector pair. When a search query is issued, it is also converted into an embedding. Documents that have embeddings close to the query embedding are considered relevant and are retrieved and ranked accordingly. The example below is what the ELSER embeddings look like for the title field How to install and run the support diagnostics troubleshooting utility . Although, only the title is shown below, the field will also have all the vector embeddings for the summary. Summaries & Questions Semantic search could only be as effective as the quality of the document summaries. 
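In code form, the ELSER setup described above reduces to an ingest pipeline that expands selected fields into sparse vector embeddings, plus a mapping that stores them. Pipeline, model, and field names below are assumptions to adjust to your own deployment and Elasticsearch version.

```javascript
import { Client } from '@elastic/elasticsearch';

const client = new Client({ node: process.env.ES_URL, auth: { apiKey: process.env.ES_API_KEY } });

// Ingest pipeline: expand title and summary with ELSER at index time.
await client.ingest.putPipeline({
  id: 'elser-title-summary',
  processors: [
    {
      inference: {
        model_id: '.elser_model_2', // assumed ELSER v2 deployment
        input_output: [
          { input_field: 'title', output_field: 'ml.title_expanded' },
          { input_field: 'summary', output_field: 'ml.summary_expanded' },
        ],
      },
    },
  ],
});

// Index mapping with sparse_vector fields to hold the ELSER expansions.
await client.indices.create({
  index: 'support-articles',
  settings: { default_pipeline: 'elser-title-summary' },
  mappings: {
    properties: {
      title: { type: 'text' },
      summary: { type: 'text' },
      content: { type: 'text' },
      ml: {
        properties: {
          title_expanded: { type: 'sparse_vector' },
          summary_expanded: { type: 'sparse_vector' },
        },
      },
    },
  },
});
```

At query time, the hybrid approach then combines a semantic query against those expanded fields with a BM25 match on the larger content field.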
Our technical support articles have a summary written by support engineers but other docs that we ingested did not. Given the scale of our ingested knowledge, we needed an automated process to generate these. The simplest approach was to take the first 280 characters of each document and use that as the summary. We tested this and found that it led to poor search relevancy. One of our team’s engineers had the idea to instead use AI to do this. We created a new service which leveraged OpenAI GPT3.5 Turbo to backfill all of our documents which lacked a summary upon ingestion. In the future, we intend to test the output of other models to find what improvements we might see in the final summaries. As we have a private instance for GPT3.5 Turbo, we chose to use it in order to keep costs lows at the scale required. The service itself is straightforward and a result of finding and fine tuning an effective prompt. The prompt provides the large language model with a set of overall directions and then a specific subset of directions for each task. While more complex, this enabled us to create a Cloud Run job that loops through each doc in our knowledge library. The loop does the following tasks before moving onto the next document. Sends an API call to the LLM with the prompt and the text from the document's content field. Waits for a completed response (or handles any errors gracefully). Updates the summary and questions fields in the document. Runs the next document. Cloud Run allows us to control the number of concurrent workers so that we don't use all of the allocated threads to our LLM instance. Doing so would result in a timeout for any users of the Support Assistant, so we elected to backfill the existing knowledge library over a period of weeks -- starting with the most current product docs. Create the Overall Summary This section of the prompt outputs a summary that is as concise as possible while still maintaining accuracy. We achieve this through asking the LLM to take multiple passes at the text it generates and check for accuracy against the source document. Specific guidelines are indicated so that each document's outputs will be consistent. Try this prompt for yourself with an article to see the type of results it can generate. Then change one or more guidelines and run the prompt in a new chat to observe the difference in output. Create the Second Summary Type We create a second summary which enables us to search for specific passages of the overall text that will represent the article. In this use case, we try to maintain a closer output to the key sentences already within the document. Create a Set of Relevant Questions In addition to the summaries, we asked the GPT to generate a set of questions that would be relevant to the document. This will be used in several ways, including semantic-search based suggestions for the user. We are also testing the relevancy of including the question set in our hybrid search approach for the Elastic Support Assistant so that we search the title, summary, body content and question set. Support Assistant Demo Despite a large number of tasks and queries that run in the back end, we elected to keep the chat interface itself simple to use. A successful Support Assistant will work without friction and provide expertise that the user can trust. Our alpha build is shown below. Key Learnings The process of building our current knowledge library has not been linear. 
As a team we test new hypotheses daily and observe the behaviors of our Support Assistant users to understand their needs. We push code often to production and measure the impact so that we have small failures rather than feature and project level ones. Smaller, more precise context makes the LLM responses significantly more deterministic. We initially passed larger text passages as context to the user question. This decreased the accuracy of the results as the large language model would often pass over key sentences in favor of ones that didn’t answer the question. This transitioned search to become a problem of both finding the right documents and how these aligned with user questions. An RBAC strategy is essential for managing what data a given persona can access. Document level security reduced our infrastructure duplication, drove down deployment costs and simplified the queries we needed to run. As a team, we realized early on that our tech debt would prevent us from achieving a lovable experience for the Support Assistant. We collaborated closely with our product engineers and came up with a blueprint for using the latest Elastic features. We will write an in-depth blog about our transition from Swiftype and Appsearch to elaborate on this learning. Stay tuned! One search query does not cover the potential range of user questions. More on this in Part 4 (search and relevancy tuning for a RAG chatbot). We measured the user sentiment of responses and learned to interpret user intent much more effectively. In effect, what is the search question behind the user question? Understanding what our users search for plays a key role in how we enrich our data. Even at the scale of hundreds of thousands of documents, we still find gaps in our documented knowledge. By analyzing our user trends we are able to determine when to add new types of sources and better enrich our existing data to allow us to package together a context from multiple sources that the LLM can use for further elaboration. What's next? At the time of writing, we have vector embeddings for the more than 300,000 documents in our indices and over 128,000 ai-generated summaries with an average of 8 questions per document. Given that we only have ~8,000 technical support articles with human-written summaries, this was a 10x improvement for our semantic search results. Field Engineering has a roadmap of new ways to expand our knowledge library and stretch what's technically possible with our explicit search interface and the Elastic Support Assistant. For example, we plan to create an ingest and search strategy for technical diagrams and ingest Github issues for Elastic employees. Creating the knowledge sources was just one step of our journey with the Elastic Support Assistant. Read about our initial GenAI experiments in the first blog here . In the third blog , we dive into the design and implementation of the user experience. Following that, our fourth blog discusses our strategies for tuning search relevancy to provide the best context to LLMs. Stay tuned for more insights and inspiration for your own generative AI projects! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. 
DT By: Drew Tate ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. CL By: Costin Leau Inside Elastic February 12, 2025 Elasticsearch: 15 years of indexing it all, finding what matters Elasticsearch just turned 15-years-old! Take a look back at the last 15 years of indexing and searching, and turn to the next 15 years of relevance. SB PK By: Shay Banon and Philipp Krenn Inside Elastic January 13, 2025 Ice, ice, maybe: Measuring searchable snapshots performance Learn how Elastic’s searchable snapshots enable the frozen tier to perform on par with the hot tier, demonstrating latency consistency and reducing costs. US RO GK +1 By: Ugo Sangiorgi , Radovan Ondas , George Kobar and 1more Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Jump to Retrieval-Augmented Generation (RAG) Over a Fine Tuned Model Understanding the Support Assistant as a Search Problem Elastic Support’s Knowledge Library Technical Support Articles Ingesting Product Documentation, Blogs & Search Labs Content Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"GenAI for Customer Support — Part 2: Building a Knowledge Library - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/genai-customer-support-building-a-knowledge-library","meta_description":"This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time!"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog A quick introduction to vector search This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. Vector Database VC By: Valentin Crettaz On February 6, 2025 Part of Series Vector search introduction and implementation Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. This first part focuses on providing a general introduction to the basics of embedding vectors and how vector search works under the hood. 
Armed with all the knowledge learned in the first article, the second part will guide you through the meanders of how to set up vector search in Elasticsearch. In the third part , we’ll leverage what we’ve learned in the first two parts, build upon that knowledge, and delve into how to craft powerful hybrid search queries in Elasticsearch. Before we dive into the real matter of this article, let’s go back in time and review some of the history of vectors, which is a keystone concept in semantic search. Vectors are not new Pretty sure everyone would agree that since the advent of ChatGPT in November 2022, not a single day goes by without hearing or reading about “vector search.” It’s everywhere and so prevalent that we often get the impression this is a new cutting-edge technology that just came out, yet the truth is that this technology has been around for more than six decades! Research on the subject began in the mid-1960s, and the first research papers were published in 1978 by Gerard Salton, an information retrieval pundit, and his colleagues at Cornell University. Salton’s work on dense and sparse vector models constitutes the root of modern vector search technology. In the last 20 years, many different vector DBMS based on his research have been created and brought to market. These include Elasticsearch powered by the Apache Lucene project, which started working on vector search in 2019. Vectors are now everywhere and so pervasive that it is important to first get a good grasp of their underlying theory and inner workings before playing with them. Before we dive into that, let’s quickly review the differences between lexical search and vector search so we can better understand how they differ and how they can complement each other. Vector search vs. lexical search An easy way to introduce vector search is by comparing it to the more conventional lexical search that you’re probably used to. Vector search, also commonly known as semantic search, and lexical search work very differently. Lexical search is the kind of search that we’ve all been using for years in Elasticsearch. To summarize it very briefly, it doesn’t try to understand the real meaning of what is indexed and queried, instead, it makes a big effort to lexically match the literals of the words or variants of them (think stemming, synonyms, etc.) that the user types in a query with all the literals that have been previously indexed into the database using similarity algorithms, such as TF-IDF Figure 1: A simple example of a lexical search As we can see, the three documents to the top left are tokenized and analyzed. Then, the resulting terms are indexed in an inverted index, which simply maps the analyzed terms to the document IDs containing them. Note that all of the terms are only present once and none are shared by any document. Searching for “nice german teacher” will match all three documents with varying scores, even though none of them really catches the true meaning of the query. As can be seen in Figure 2, below, it gets even trickier when dealing with polysemy or homographs, i.e., words that are spelled the same but have different meanings (right, palm, bat, mean, etc.) Let’s take the word “right” which can mean three different things, and see what happens. Figure 2: Searching for homographs Searching for “I’m not right” returns a document that has the exact opposite meaning as the first returned result. 
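To make the inverted-index idea concrete before continuing with the homograph example, here is a toy version of that term-to-document-ID mapping:

```javascript
// Toy inverted index: analyzed terms mapped to the IDs of documents containing them.
function buildInvertedIndex(docs) {
  const index = new Map();
  docs.forEach((text, docId) => {
    for (const term of text.toLowerCase().split(/\W+/).filter(Boolean)) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(docId);
    }
  });
  return index;
}

const index = buildInvertedIndex(['Turn right', "I'm not right", 'Take a right turn']);
console.log(index.get('right')); // Set(3) { 0, 1, 2 } — every document contains the literal term
```

All three documents contain the literal term "right", so all three match, regardless of which meaning of the word the query intended.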
If you search for the exact same terms but order them differently to produce a different meaning, e.g., “turn right” and “right turn,” it yields the exact same result (i.e., the third document “Take a right turn”). Granted, our queries are overly simplified and don’t make use of the more advanced queries such as phrase matching, but this helps illustrate that lexical search doesn’t understand the true meaning behind what’s indexed and what’s searched. If that’s not clear, don’t fret about it, we’ll revisit this example in the third article to see how vector search can help in this case. To do some justice to lexical search, when you have control over how you index your structured data (think mappings, text analysis, ingest pipelines, etc.) and how you craft your queries (think cleverly crafted DSL queries, query term analysis, etc.), you can do wonders with lexical search engines, there’s no question about it! The track records of Elasticsearch regarding its lexical search capabilities are just amazing. What it has achieved and how much it has popularized and improved the field of lexical search over the past few years is truly remarkable. However, when you are tasked to provide support for querying unstructured data (think images, videos, audios, raw text, etc.) to users who need to ask free-text questions, lexical search falls short. Moreover, sometimes the query is not even text, it could be an image, as we’ll see shortly. The main reason why lexical search is inadequate in such situations is that unstructured data can neither be indexed nor queried the same way as structured data. When dealing with unstructured data, semantics comes into play. What does semantics mean? Very simply, the meaning! Let’s take the simple example of an image search engine (e.g., Google Image Search or Lens). You drag and drop an image, and the Google semantic search engine will find and return the most similar images to the one you queried. In Figure 3, below, we can see on the left side the picture of a German shepherd and to the right all the similar pictures that have been retrieved, with the first result being the same picture as the provided one (i.e., the most similar one). Figure 3: Searching for a picture. Source: Google Image Search, https://www.google.com/imghp Even if this sounds simple and logical for us humans, for computers it’s a whole different story. That’s what vector search enables and helps to achieve. The power unlocked by vector search is massive, as the world has recently witnessed. Let’s now lift the hood and discover what hides underneath. Embedding vectors As we’ve seen earlier, with lexical search engines, structured data such as text can easily be tokenized into terms that can be matched at search time, regardless of the true meaning of the terms. Unstructured data, however, can take different forms, such as big binary objects (images, videos, audios, etc.), and are not at all suited for the same tokenizing process. Moreover, the whole purpose of semantic search is to index data in such a way that it can be searched based on the meaning it represents. How do we achieve that? The answer lies in two words: Machine Learning ! Or more precisely Deep Learning! Deep Learning is a specific area of machine learning that relies on models based on artificial neural networks made of multiple layers of processing that can progressively extract the true meaning of the data. The way those neural network models work is heavily inspired by the human brain. 
Figure 4, below, shows what a neural network looks like, with its input and output layers as well as multiple hidden layers: Figure 4: Neural network layers. Source: IBM, https://www.ibm.com/topics/neural-networks The true feat of neural networks is that they are capable of turning a single piece of unstructured data into a sequence of floating point values, which are known as embedding vectors or simply embeddings . As human beings, we can pretty well understand what vectors are as long as we visualize them in a two or three-dimensional space. Each component of the vector represents a coordinate in a 2D x-y plane or a 3D x-y-z space. However, the embedding vectors on which neural network models work can have several hundreds or even thousands of dimensions and simply represent a point in a multi-dimensional space. Each vector dimension represents a feature , or a characteristic, of the unstructured data. Let’s illustrate this with a deep learning model that turns images into embedding vectors of 2048 dimensions. That model would turn the German shepherd picture we used in Figure 3 into the embedding vector shown in the table below. Note that we only show the first and last three elements, but there would be 2,042 more columns/dimensions in the table. is_red is_dog blue_sky … no_gras german_shepherd is_tree German shepherd embeddings 0.0121 0.9572 0.8735 … 0.1198 0.9712 0.0512 Each column is a dimension of the model and represents a feature, or characteristic, that the underlying neural network seeks to modelize. Each input given to the model will be characterized depending on how similar that input is to each of the 2048 dimensions. Hence, the value of each element in the embedding vector denotes the similarity of that input to a specific dimension. In this example, we can see that the model detected a high similarity between dogs and German shepherds and also the presence of some blue sky. In contrast to lexical search, where a term can either be matched or not, with vector search we can get a much better sense of how similar a piece of unstructured data is to each of the dimensions supported by the model. As such, embedding vectors serve as a fantastic semantic representation of unstructured data. The secret sauce Now that we know how unstructured data is sliced and diced by deep learning neural networks into embedding vectors that capture the similarity of the data along a high number of dimensions, we need to understand how the matching of those vectors works. It turns out that the answer is pretty simple. Embedding vectors that are close to one another represent semantically similar pieces of data. So, when we query a vector database, the search input (image, text, etc.) is first turned into an embedding vector using the same model that has been used for indexing all the unstructured data, and the ultimate goal is to find the nearest neighboring vectors to that query vector. Hence, all we need to do is figure out how to measure the “distance” or “similarity” between the query vector and all the existing vectors indexed in the database, that’s pretty much it. Distance and similarity Luckily for us, measuring the distance between two vectors is an easy problem to solve thanks to vector arithmetics. So, let’s look at the most popular distance and similarity functions that are supported by modern vector search databases, such as Elasticsearch. Warning, math ahead! 
L1 distance The L1 distance, also called the Manhattan distance, of two vectors x and y is measured by summing up the pairwise absolute difference of all their elements. Obviously, the smaller the distance d, the closer the two vectors are. The formula is pretty simple, as can be seen below: Visually, the L1 distance can be illustrated as shown in Figure 5, below: Figure 5: Visualizing the L1 distance between two vectors Let’s take two vectors x and y, say x = (1, 2) and y = (4, 3); then the L1 distance of both vectors would be | 1 - 4 | + | 2 - 3 | = 4. L2 distance The L2 distance, also called the Euclidean distance, of two vectors x and y is measured by first summing up the square of the pairwise difference of all their elements and then taking the square root of the result. It’s basically the shortest path between two points (the hypotenuse). Similarly to L1, the smaller the distance d, the closer the two vectors are: The L2 distance is shown in Figure 6 below: Figure 6: Visualizing the L2 distance between two vectors Let’s reuse the same two sample vectors x and y as we used for the L1 distance, and we can now compute the L2 distance as (1 - 4)^2 + (2 - 3)^2 = 10. Taking the square root of 10 would yield 3.16. Linf distance The Linf (for L infinity) distance, also called the Chebyshev or chessboard distance, of two vectors x and y is simply defined as the longest distance between any two of their elements, i.e., the longest distance measured along one of the axes/dimensions. The formula is very simple and shown below: A representation of the Linf distance is shown in Figure 7 below: Figure 7: Visualizing the Linf distance between two vectors Again, taking the same two sample vectors x and y, we can compute the L infinity distance as max(| 1 - 4 |, | 2 - 3 |) = max(3, 1) = 3. Cosine similarity In contrast to L1, L2, and Linf, cosine similarity does not measure the distance between two vectors x and y, but rather their relative angle, i.e., whether they are both pointing in roughly the same direction. The higher the similarity s, the “closer” the two vectors are. The formula is again very simple and shown below: A way to represent the cosine similarity between two vectors is shown in Figure 8 below: Figure 8: Visualizing the cosine similarity between two vectors Furthermore, as cosine values are always in the [-1, 1] interval, -1 means opposite similarity (i.e., a 180° angle between both vectors), 0 means unrelated similarity (i.e., a 90° angle), and 1 means identical (i.e., a 0° angle), as shown in Figure 9 below: Figure 9: The cosine similarity spectrum Once again, let’s reuse the same sample vectors x and y and compute the cosine similarity using the above formula. First, we compute the dot product of both vectors as (1 · 4) + (2 · 3) = 10. Then, we multiply the lengths (also called magnitudes) of both vectors: (1^2 + 2^2)^{1/2} · (4^2 + 3^2)^{1/2} = 11.18034. Finally, we divide the dot product by the product of the lengths: 10 / 11.18034 = 0.894427 (i.e., a 26° angle), which is quite close to 1, so both vectors can be considered pretty similar. 
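To double-check the arithmetic above, here is a minimal NumPy sketch that reproduces all four measures for the sample vectors x = (1, 2) and y = (4, 3):

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([4.0, 3.0])

l1 = np.sum(np.abs(x - y))            # Manhattan distance  -> 4.0
l2 = np.sqrt(np.sum((x - y) ** 2))    # Euclidean distance  -> ~3.16
linf = np.max(np.abs(x - y))          # Chebyshev distance  -> 3.0
cosine = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))  # -> ~0.894

print(l1, l2, linf, cosine)
```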
Dot product similarity One drawback of cosine similarity is that it only takes into account the angle between two vectors but not their magnitude (i.e., length), which means that if two vectors point roughly in the same direction but one is much longer than the other, both will still be considered similar. Dot product similarity, also called scalar or inner product, improves on that by taking into account both the angle and the magnitude of the vectors, which provides for a much more accurate similarity metric. Two equivalent formulas are used to compute dot product similarity. The first is the same as we’ve seen in the numerator of cosine similarity earlier: The second formula simply multiplies the lengths of both vectors by the cosine of the angle between them: Dot product similarity is visualized in Figure 10, below: Figure 10: Visualizing the dot product similarity between two vectors One last time, we take the sample x and y vectors and compute their dot product similarity using the first formula, as we did for the cosine similarity earlier, as (1 · 4) + (2 · 3) = 10. Using the second formula, we multiply the lengths of both vectors, (1^2 + 2^2)^{1/2} · (4^2 + 3^2)^{1/2} = 11.18034, and multiply that by the cosine of the 26° angle between both vectors, and we get 11.18034 · cos(26°) = 10. One thing worth noting is that if all the vectors are normalized first (i.e., their length is 1), then the dot product similarity becomes exactly the same as the cosine similarity (because |x| |y| = 1), i.e., the cosine of the angle between both vectors. As we’ll see later, normalizing vectors is a good practice to adopt in order to make the magnitude of the vector irrelevant so that the similarity simply focuses on the angle. It also speeds up the distance computation at indexing and query time, which can be a big issue when operating on billions of vectors. Quick recap Wow, we’ve been through a LOT of information so far, so let’s halt for a minute and make a quick recap of where we stand. We’ve learned that… …semantic search is based on deep learning neural network models that excel at transforming unstructured data into multi-dimensional embedding vectors. …each dimension of the model represents a feature or characteristic of the unstructured data. …an embedding vector is a sequence of similarity values (one for each dimension) that represent how similar a given piece of unstructured data is to each dimension. …the “closer” two vectors are (i.e., the nearest neighbors), the more they represent semantically similar concepts. …distance functions (L1, L2, Linf) allow us to measure how close two vectors are. …similarity functions (cosine and dot product) allow us to measure how closely two vectors point in the same direction. Now, the last remaining piece that we need to dive into is the vector search engine itself. When a query comes in, the query is first vectorized, and then the vector search engine finds the nearest neighboring vectors to that query vector. The brute-force approach of measuring the distance or similarity between the query vector and all vectors in the database can work for small data sets but quickly falls short as the number of vectors increases. Put differently, how can we index millions, billions, or even trillions of vectors and find the nearest neighbors of the query vector in a reasonable amount of time? 
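As a small illustration of the normalization note above, the following sketch shows that once both vectors are scaled to unit length, the dot product and the cosine similarity coincide:

```python
import numpy as np

x = np.array([1.0, 2.0])
y = np.array([4.0, 3.0])

cosine = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))  # ~0.894

# Normalize both vectors to unit length; their dot product now equals the cosine similarity.
x_unit = x / np.linalg.norm(x)
y_unit = y / np.linalg.norm(y)
print(cosine, np.dot(x_unit, y_unit))
```

Computing such a similarity against every stored vector is precisely the brute-force approach that stops scaling as the corpus grows.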
That’s where we need to get smart and figure out optimal ways of indexing vectors so we can zero in on the nearest neighbors as fast as possible without degrading precision too much. Vector search algorithms and techniques Over the years, many different research teams have invested a lot of effort into developing very clever vector search algorithms. Here, we’re going to briefly introduce the main ones. Depending on the use case, some are better suited than others. Linear search We briefly touched upon linear search, or flat indexing, earlier when we mentioned the brute-force approach of comparing the query vector with all vectors present in the database. While it might work well on small datasets, performance decreases rapidly as the number of vectors and dimensions increase (O(n) complexity). Luckily, there are more efficient approaches called approximate nearest neighbor (ANN) where the distances between embedding vectors are pre-computed and similar vectors are stored and organized in a way that keeps them close together, for instance using clusters, trees, hashes, or graphs. Such approaches are called “approximate” because they usually do not guarantee 100% accuracy. The ultimate goal is to either reduce the search scope as much and as quickly as possible in order to focus only on areas that are most likely to contain similar vectors or to reduce the vectors’ dimensionality . K-Dimensional trees A K-Dimensional tree, or KD tree, is a generalization of a binary search tree that stores points in a k-dimensional space and works by continuously bisecting the search space into smaller left and right trees where the vectors are indexed. At search time, the algorithm simply has to visit a few tree branches around the query vector (the red point in Figure 11) in order to find the nearest neighbor (the green point in Figure 11). If more than k neighbors are requested, then the yellow area is extended until the algorithm finds more neighbors. Figure 11: KD tree algorithm. Source: https://salzi.blog/2014/06/28/kd-tree-and-nearest-neighbor-nn-search-2d-case/ The biggest advantage of the KD tree algorithm is that it allows us to quickly focus only on some localized tree branches, thus eliminating most of the vectors from consideration. However, the efficiency of this algorithm decreases as the number of dimensions increases because many more branches need to be visited than in lower-dimensional spaces. Inverted file index The inverted file index (IVF) approach is also a space-partitioning algorithm that assigns vectors close to each other to their shared centroid. In the 2D space, this is best visualized with a Voronoi diagram as shown in Figure 12: Figure 12: Voronoi representation of an inverted file index in the 2D space. Source: https://docs.zilliz.com/docs/vector-index-basics-and-the-inverted-file-index We can see that the above 2D space is partitioned into 20 clusters, each having its centroid denoted as black dots. All embedding vectors in the space are assigned to the cluster whose centroid is closest to them. At search time, the algorithm first figures out the cluster to focus on by finding the centroid that is closest to the query vector, and then it can simply zero in on that area, and the surrounding ones as well if needed, in order to find the nearest neighbors. This algorithm suffers from the same issue as KD trees when used in high-dimensional spaces. 
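As a hedged illustration of the KD tree idea above, here is a minimal nearest-neighbor lookup assuming scikit-learn is available; the data is just random 2D points, i.e., the low-dimensional case where the structure works well:

```python
import numpy as np
from sklearn.neighbors import KDTree

rng = np.random.default_rng(42)
vectors = rng.random((1_000, 2))      # 1,000 toy 2D "embeddings"

tree = KDTree(vectors, leaf_size=40)  # recursively bisects the space at build time

query = np.array([[0.5, 0.5]])
distances, indices = tree.query(query, k=3)  # visit only nearby branches to find 3 neighbors
print(indices[0], distances[0])
```

As noted above, the efficiency of both KD trees and inverted file indexes degrades as the number of dimensions grows.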
This is called the curse of dimensionality, and it occurs when the volume of the space increases so much that all the data seems sparse and the amount of data that would be required to get more accurate results grows exponentially. When the data is sparse, it becomes harder for these space-partitioning algorithms to organize the data into clusters. Luckily for us, there are other algorithms and techniques that alleviate this problem, as detailed below. Quantization Quantization is a compression -based approach that allows us to reduce the total size of the database by decreasing the precision of the embedding vectors. This can be achieved using scalar quantization (SQ) by converting the floating point vector values into integer values. This not only reduces the size of the database by a factor of 8 but also decreases memory consumption and speeds up the distance computation between vectors at search time. Another technique is called product quantization (PQ), which first divides the space into lower-dimensional subspaces, and then vectors that are close together are grouped in each subspace using a clustering algorithm (similar to k-means). Note that quantization is different from dimensionality reduction , where the number of dimensions is reduced, i.e., the vectors simply become shorter. Hierarchical Navigable Small Worlds (HNSW) If it looks complex just by reading the name, don’t worry, it’s not really! In short, Hierarchical Navigable Small Worlds is a multi-layer graph-based algorithm that is very popular and efficient. It is used by many different vector databases, including Apache Lucene. A conceptual representation of HNSW can be seen in Figure 13, below. Figure 13: Hierarchical Navigable Small Worlds. Source: https://towardsdatascience.com/similarity-search-part-4-hierarchical-navigable-small-world-hnsw-2aad4fe87d37 On the top layer, we can see a graph of very few vectors that have the longest links between them, i.e., a graph of connected vectors with the least similarity. The more we dive into lower layers, the more vectors we find and the denser the graph becomes, with more and more vectors closer to one another. At the lowest layer, we can find all the vectors, with the most similar ones being located closest to one another. At search time, the algorithm starts from the top layer at an arbitrary entry point and finds the vector that is closest to the query vector (shown by the gray point). Then, it moves one layer below and repeats the same process, starting from the same vector that it left in the above layer, and so on, one layer after another, until it reaches the lowest layer and finds the nearest neighbor to the query vector. Locality-sensitive hashing (LSH) In the same vein as all the other approaches presented so far, locality-sensitive hashing seeks to drastically reduce the search space in order to increase the retrieval speed. With this technique, embedding vectors are transformed into hash values, all by preserving the similarity information, so that the search space ultimately becomes a simple hash table that can be looked up instead of a graph or tree that needs to be traversed. The main advantage of hash-based methods is that vectors containing an arbitrary (big) number of dimensions can be mapped to fixed-size hashes, which enormously speeds up retrieval time without sacrificing too much precision. There are many different ways of hashing data in general, and embedding vectors in particular, but this article will not dive into the details of each of them. 
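Here is a rough sketch of the scalar quantization idea described above, mapping float32 components onto int8 values; real implementations are considerably more careful about how the value range and error are handled, so treat this purely as an illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
vectors = rng.normal(size=(10_000, 384)).astype(np.float32)

# Map the observed float range onto the 256 possible int8 values.
lo, hi = float(vectors.min()), float(vectors.max())
scale = 255.0 / (hi - lo)
quantized = np.round((vectors - lo) * scale - 128.0).astype(np.int8)

print(vectors.nbytes, quantized.nbytes)  # 4 bytes vs. 1 byte per dimension

# Approximate reconstruction, good enough for cheap distance estimates.
dequantized = (quantized.astype(np.float32) + 128.0) / scale + lo
print(float(np.abs(vectors - dequantized).max()))  # small quantization error
```

Quantization compresses the vectors themselves; the hashing-based approach introduced above takes a different route, replacing traversal of a tree or graph with a hash table lookup.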
Conventional hashing methods usually produce very different hashes for data that seem very similar. Since embedding vectors are composed of float values, let’s take two sample float values that are considered to be very close to one another in vector arithmetic (e.g., 0.73 and 0.74) and run them through a few common hashing functions. Looking at the results below, it’s pretty obvious that common hashing functions do not retain the similarity between the inputs. Hashing function 0.73 0.74 MD5 1342129d04cd2924dd06cead4cf0a3ca 0aec1b15371bd979cfa66b0a50ebecc5 SHA1 49d2c3e0e44bff838e1db571a121be5ea874e8d9 a534e76482ade9d9fe4bff3035a7f31f2f363d77 SHA256 99d03fc3771fe6848d675339fc49eeb1cb8d99a12e6358173336b99a2ec530ea 5ecbc825ba5c16856edfdaf0abc5c6c41d0d8a9c508e34188239521dc7645663 While conventional hashing methods try to minimize hashing collisions between similar data pieces, the main objective of locality-sensitive hashing is to do exactly the opposite, i.e., to maximize hashing collisions so that similar data falls within the same bucket with a high probability. By doing so, embedding vectors that are close together in a multi-dimensional space will be hashed to a fixed-size value falling in the same bucket. Since LSH allows those hashed vectors to retain their proximity, this technique comes in very handy for data clustering and nearest neighbor searches. All the heavy lifting happens at indexing time when the hashes need to be computed, while at search time we only need to hash the query vector in order to look up the bucket that contains the closest embedding vectors. Once the candidate bucket is found, a second round usually takes place to identify the nearest neighboring vectors to the query vector. Let’s conclude In order to introduce vector search, we had to cover quite some ground in this article. After comparing the differences between lexical search and vector search, we’ve learned how deep learning neural network models manage to capture the semantics of unstructured data and transcode their meaning into high-dimensional embedding vectors, a sequence of floating point numbers representing the similarity of the data along each of the dimensions of the model. It is also worth noting that vector search and lexical search are not competing but complementary information retrieval techniques (as we’ll see in the third part of this series when we’ll dive into hybrid search). After that, we introduced a fundamental building block of vector search, namely the distance (and similarity) functions that allow us to measure the proximity of two vectors and assess the similarity of the concepts they represent. Finally, we’ve reviewed different flavors of the most popular vector search algorithms and techniques, which can be based on trees, graphs, clusters, or hashes, whose goal is to quickly narrow in on a specific area of the multi-dimensional space in order to find the nearest neighbors without having to visit the entire space like a linear brute-force search would do. If you like what you’re reading, make sure to check out the other parts of this series: Part 2: How to Set Up Vector Search in Elasticsearch Part 3: Hybrid Search Using Elasticsearch Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. 
AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Vectors are not new Vector search vs. lexical search Embedding vectors The secret sauce Distance and similarity Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"A quick introduction to vector search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/introduction-to-vector-search","meta_description":"Learn about vector search (aka semantic search), including the basics of vectors, how vector search works, and how it differs from lexical search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. ML Research Search Relevance TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou On October 29, 2024 Part of Series Semantic reranking & the Elastic Rerank model Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this series of blogs we'll introduce Elastic's new semantic reranker. Semantic reranking often improves relevance, particularly in a zero-shot setting. It can also be used to trade-off indexing compute cost for querying compute cost by significantly improving lexical retrieval relevance. In this first blog we set the scene with some background on semantic reranking and how it can fit into your search and RAG pipelines. In the second installment, we introduce you to Elastic Rerank: Elastic's new semantic re-ranker model we've trained and released in technical preview. 
Retrieval Typically, text search is broken down into multiple stages, which gradually filter the result set into the final list that is presented to a user (or an LLM). The first stage is called retrieval and must be able to scale to efficiently compare the query text with a very large corpus of candidate matches. This limits the set of approaches that one can consider. For many years, the only paradigm available for retrieval was lexical. Here documents and queries are treated as bags of words and a statistical model is used to deduce relevance. The most popular option in this camp is BM25. For this choice, the query can be efficiently compared with a huge document corpus using inverted indices together with clever optimisations to prune non-competitive candidates. It remains a useful option since many queries, such as keyword searches and exact phrase matching, are well aligned with this model and it is easy to efficiently apply filtering predicates at the same time. The scoring is also tailored to the corpus characteristics which makes it a strong baseline when no tuning is applied. Finally, it is particularly efficient from an indexing perspective: no model inference needs to run, updating index data structures is very efficient and a lot of state can permanently reside on disk. In recent years, semantic retrieval has seen a surge in popularity. There are multiple flavors of this approach; for example, dense passage , learned sparse and late interaction retrieval. In summary, they use a transformer model to independently create representations of the query and each document and define a distance function on these representations to capture semantic similarity. For example, the query and document might be both embedded into a high dimensional vector space where queries and their relevant documents have low angular separation. These sorts of approaches have different strengths to BM25: they can find matches that require understanding synonyms, where the context is important to determine word meanings, where there are misspellings and so on. They also allow for wholly new relevance signals, such as embedding images and text in a common vector space. Queries can be efficiently compared against very large corpuses of documents by dropping the requirement that exact nearest neighbor sets are found. Data structures like HNSW can be used to find most of the best matches in logarithmic complexity in the corpus size. Intelligent compression schemes allow significant amounts of data to reside on disk. However, it is worth noting that model inference must be run on all documents before indexing and these data structures are relatively expensive to build in comparison to inverted indices. A lot of work has been done to improve the training of general purpose semantic retrieval models and indeed the best models significantly outperform BM25 in benchmarks that try and assess zero shot retrieval quality. Semantic reranking So far we have discussed methods that independently create representations of the query and document. This choice is necessary to scale retrieval. However, given the top N results returned by first stage retrieval we don't have the same constraint. The work that must be done to compare the query and these top N results is naturally much smaller, so we can consider new approaches for reordering them in order to improve relevance of the final result. This task is called reranking. 
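As a hedged sketch of the two first-stage retrieval flavors discussed above, the snippet below runs a BM25 match query and an approximate kNN query with the Elasticsearch Python client. The index and field names are hypothetical, and the query embedding is assumed to come from the same model used at indexing time:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

query_text = "how do I reset my password"
query_embedding = [0.12, -0.43, 0.91]  # placeholder; produced by your embedding model in practice

# Lexical first-stage retrieval: BM25 scoring over an inverted index.
lexical_hits = es.search(
    index="articles",
    query={"match": {"body": query_text}},
    size=50,
)

# Semantic first-stage retrieval: approximate kNN over a dense_vector field.
semantic_hits = es.search(
    index="articles",
    knn={
        "field": "body_vector",
        "query_vector": query_embedding,
        "k": 50,
        "num_candidates": 500,
    },
)
```

Whichever retriever is used, its top N results are the candidates that reranking, defined next, reorders.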
We define semantic reranking as using a model to assess the semantic similarity of a query and one (or more) document text(s). This is to distinguish it from other reranking methods such as learn to rank , which typically use a variety of features to model user preference. Note that what constitutes semantic similarity can vary from task to task: for example, finding similar documents requires assessing similarity of two texts, whereas answering a question requires understanding if the necessary information is contained in the document text. In principle, any semantic first stage retrieval method can be used for reranking. For example, ELSER could be used to rerank the top results of a BM25 search. Keep in mind though that there may be blindspots in BM25 retrieval, which tends to have lower recall than semantic retrieval, and no reranking method will be able to fix these. It is therefore important to evaluate setups like these on your own data. From a performance standpoint, one is trading indexing compute cost for querying compute cost and possibly latency. Compared to semantic retrieval, in addition to the cost of embedding the query, you must also embed each document you want to rerank. This can be a good cost trade-off if you have a very large corpus and/or one which is frequently updated and relatively few queries per second. Furthermore, for GPUs the extra cost is partly amortized by the fact that the document inferences can be processed as a batch, which allows for better utilization. However, there is little cost benefit compared to a model which gets to see the query and the document at the same time. This approach is called cross-encoding, as opposed to bi-encoding which is used for semantic retrieval, and can bring significant benefits. Cross-encoders For cross-encoders both the query and a document text are presented together to the model concatenated with a special separation token. The model itself returns a similarity score. Schematically, the text is modeled something like the following: In a bi-encoder the query and the document are first embedded individually and then compared using a simple similarity function. Schematically, the text is modeled something like the following: Reranking for a fixed query with a cross-encoder is framed as a regression problem. The model outputs a numerical scores for each query-document pair. Then the documents are sorted in descending score order. We will return to the process by which this model is trained in the second blog in this series. Conceptually, it is useful to realize that this allows the model to attend to different parts of the query and document text and learn rich features for assessing relevance. It has been observed that this process allows the model to learn more robust representations for generally assessing relevance. It also potentially allows the model to capture more nuanced semantics. For example, bi-encoder models struggle with things like negation and instead tend to pick up on matches for the majority concepts in the text, independent of whether the query wants to include or exclude them. Cross-encoder models have the capacity to learn how negation should affect relevance judgments. Finally, cross-encoder scores are often better calibrated across a diverse range of query types and topics. This makes choosing a score at which to drop documents significantly more reliable. Connection with RAG Improving the content supplied to an LLM improves the quality of RAG. 
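Here is a minimal sketch of cross-encoder reranking in practice, assuming the sentence-transformers library; the model name is a commonly used public cross-encoder standing in for whatever reranker you deploy (it is not the Elastic Rerank model introduced later in this series):

```python
from sentence_transformers import CrossEncoder

query = "how do I reset my password"
candidates = [
    "Resetting your password can be done from the account settings page.",
    "Our office is closed on public holidays.",
    "Passwords must contain at least twelve characters.",
]

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

# Each (query, document) pair is scored jointly, so the model can attend across both texts.
scores = model.predict([(query, doc) for doc in candidates])

# Documents are then sorted in descending score order.
for doc, score in sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{score:.3f}  {doc}")
```

Reranked results like these are exactly what ends up in an LLM's context window in a RAG setup.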
Indeed search quality is often the bottleneck for RAG performance. For example, if the information needed to respond correctly to a question is contained exclusively in a specific document, that document must be provided in the LLM context window. Furthermore, whilst the current generation of long context models are excellent at extracting information from long contexts, the cost to process the extra input tokens is significant and money spent on search typically yields large overall cost efficiencies . RAG use cases generally also have looser latency constraints, so some extra time spent in reranking is less of an issue. Indeed, the latency can also be offset by reducing the generation time if fewer passages need to be supplied in the prompt to achieve the same recall. This makes semantic reranking especially well suited to be applied in RAG scenarios. Wrapping Up In this post we introduced the concept of semantic reranking and discussed how model architecture can be tailored to this use case to improve relevance, particularly in a zero shot setting. We discussed the performance trade-offs associated with semantic reranking as opposed to semantic retrieval. A crucial choice when discussing performance in this context is how many documents to rerank, which critically affects the trade-off between performance and relevance of reranking methods. We will pick up this topic again when we discuss how to evaluate reranking models and survey some state of the art open and closed reranking models. In the second installment of this series, we introduce you to Elastic Rerank: Elastic's new semantic re-ranker model we've trained and released in technical preview. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Retrieval Semantic reranking Cross-encoders Connection with RAG Wrapping Up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"What is semantic reranking and how to use it? - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-1","meta_description":"Learn about semantic reranking and how it can fit into your search & RAG pipelines. This blog also covers cross-encoders and semantic retrieval."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Chatting with your PDFs using Playground This blog showcases a practical example of chatting with PDFs in Playground. You'll learn how to upload PDF files into Kibana and interact with them using Elastic Playground. Integrations Ingestion How To TM By: Tomás Murúa On January 8, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Elasticsearch 8.16 has a new functionality that allows you to upload PDF files directly into Kibana and analyze them using Playground. In this article, we'll see how to use this functionality by uploading a resume in PDF format and then using Playground to interact with it. Playground is a low-code platform hosted in Kibana that allows you to create a RAG application and chat with your content. You can read more about it in this article and even test it using this link . Steps: Configure the Elasticsearch Inference Service Endpoint Upload PDFs to Kibana Interact with the data in Playground Configure the Elasticsearch Inference Service Endpoint To run semantic searches, we must first configure an inference endpoint. In this example, we'll use the Elasticsearch Inference Endpoint . This endpoint offers: rerank sparse embedding text embedding For this example, let's select sparse embedding : Once configured, confirm that the model was correctly loaded into Kibana by checking Search > Relevance > Inference Endpoint in the Kibana UI. Upload PDFs to Kibana We'll upload the resume of a junior developer to learn how to use the Kibana upload files functionality. Go to the Kibana UI and follow these steps: Next, for Import Data , we have two options: Simple: This is the default option and it allows us to quickly upload our PDF into the index and automatically creates a data view with the indexed info. Advanced: This option allows us to customize mappings or add ingest pipelines. Within these settings you can: Add a semantic text type of field. Index Settings : If you want to configure things like shards or analyzers. Index Mappings : If you want to change a field type or how you define your data. Ingest Pipeline : If you want to make changes to your data before indexing it. 
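For readers who prefer the API to the UI, the following hedged sketch creates a sparse-embedding inference endpoint equivalent to the one configured above, using a low-level request from the Python client. The endpoint ID matches the my-elser-model endpoint used later in the walkthrough, while the allocation settings are illustrative only:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Equivalent to: PUT _inference/sparse_embedding/my-elser-model
es.perform_request(
    "PUT",
    "/_inference/sparse_embedding/my-elser-model",
    headers={"content-type": "application/json", "accept": "application/json"},
    body={
        "service": "elser",
        "service_settings": {"num_allocations": 1, "num_threads": 1},
    },
)
```

With the endpoint in place, the rest of the setup happens in the Kibana upload UI.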
Go to \"Advanced\" and select \"Add additional field\": Select the field attachment.content ; in “copy to field” type \"content\" and make sure that the inference endpoint is my-elser-model : The field Copy to is used to copy the content from attachment.content to a new semantic_text field of (content), which automatically generates vector embeddings using the underlying Inference endpoint (Elastic’s ELSER in this case). This makes both the semantic and text fields available so you can run full-text , semantic , or hybrid searches. Once everything is configured, click on \"Import\": Now that the index is created, we can explore it using Playground. Interact with the data in Playground Connect to Playground After configuring the index and uploading the resumes, we now need to connect the index to Playground. Click Connect to an LLM and select one of the options. Configure the chatbot Once Playground has been configured and we have indexed Alex Johnson's resume, we can interact with the data. Using semantic search and LLMs we can ask questions using natural language and get answers even if the documents don't have the keywords we used in the query, like in the example below: Using the instructions menu, we can control the chatbot behavior and define features like the response format. It can also include citations, to make sure the answer is properly grounded. If we go to the \"Query\" tab, we can see the query generated by Playground and we add both a text and a semantic_text fields, Playground will automatically generate a hybrid query to normalize the score between different types of different types of queries. Playground not only answers questions but also helps us understand the internal components of a RAG system, like querying, retrieval phase, context and prompt instructions. Give it a try and chat with your PDFs! With the Elasticsearch 8.16 update, we can easily upload PDF/Word/Powerpoint files using the Kibana UI. It can automatically create an index in the simple mode, and you can use the advanced mode to customize your index and tailor it to your needs. Once your files are uploaded, you can access Playground and quickly and easily chat with them since Playground will handle the LLM interactions and provide the best query based on the type of fields you want to search. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Configure the Elasticsearch Inference Service Endpoint Upload PDFs to Kibana Interact with the data in Playground Connect to Playground Configure the chatbot Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Chatting with your PDFs using Playground - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/chat-with-pdf-elastic-playground","meta_description":"Learn how to chat with your PDFs using Elastic Playground. We'll upload PDF files into Kibana and then use Playground to chat with them."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Advanced RAG techniques part 1: Data processing Discussing and implementing techniques which may increase RAG performance. Part 1 of 2, focusing on the data processing and ingestion component of an advanced RAG pipeline. Vector Database Generative AI HC By: Han Xiang Choong On August 14, 2024 Part of Series Advanced RAG techniques Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This is Part 1 of our exploration into Advanced RAG Techniques. Click here for Part 2! The recent paper Searching for Best Practices in Retrieval-Augmented Generation empirically assesses the efficacy of various RAG enhancing techniques, with the goal of converging on a set of best-practices for RAG. The RAG pipeline recommended by Wang and colleagues. We'll implement a few of these proposed best-practices, namely the ones which aim to improve the quality of search (Sentence Chunking, HyDE, Reverse Packing) . For brevity, we will omit those techniques focused on improving efficiency (Query Classification and Summarization) . We will also implement a few techniques that were not covered, but which I personally find useful and interesting (Metadata Inclusion, Composite Multi-Field Embeddings, Query Enrichment) . Finally, we'll run a short test to see if the quality of our search results and generated answers has improved versus the baseline. Let's get to it! RAG overview RAG aims to enhance LLMs by retrieving information from external knowledge bases to enrich generated answers. 
By providing domain-specific information, LLMs can be quickly adapted for use cases outside the scope of their training data; significantly cheaper than fine-tuning, and easier to keep up-to-date. Measures to improve the quality of RAG typically focus on two tracks: Enhancing the quality and clarity of the knowledge base. Improving the coverage and specificity of search queries. These two measures will achieve the goal of improving the odds that the LLM has access to relevant facts and information, and is thus less likely to hallucinate or draw upon its own knowledge - which may be outdated or irrelevant. The diversity of methods is difficult to clarify in just a few sentences. Let's go straight to implementation to make things clearer. Figure 1: The RAG pipeline used by the author. Table of contents Overview Table of contents Set-up Ingesting, processing, and embedding documents Data ingestion Sentence-level, token-wise chunking Metadata inclusion and generation Keyphrases extracted by TextRank Potential questions generated by GPT-4o Entities extracted by Spacy Composite multi-field embeddings Indexing to Elastic Cat break Appendix Definitions Set-up All code may be found in the Searchlabs repo . First things first. You will need the following: An Elastic Cloud Deployment An LLM API - We are using a GPT-4o deployment on Azure OpenAI in this notebook Python Version 3.12.4 or later We will be running all the code from the main.ipynb notebook. Go ahead and git clone the repo, navigate to supporting-blog-content/advanced-rag-techniques, then run the following commands: Once that's done, create a .env file and fill out the following fields (Referenced in .env.example ). Credits to my co-author, Claude-3.5, for the helpful comments. Next, we'll choose the document to ingest, and place it in the documents folder. For this article, we'll be using the Elastic N.V. Annual Report 2023 . It's a pretty challenging and dense document, perfect for stress testing our RAG techniques. Elastic Annual Report 2023 Now we're all set, let's go to ingestion. Open main.ipynb and execute the first two cells to import all packages and intialize all services. Back to top Ingesting, processing, and embedding documents Data ingestion Personal note: I am stunned by LlamaIndex's convenience. In the olden days before LLMs and LlamaIndex, ingesting documents of various formats was a painful process of collecting esoteric packages from all over. Now it's reduced to a single function call. Wild. The SimpleDirectoryReader will load every document in the directory_path. For .pdf files, it returns a list of document objects, which I convert to Python dictionaries because I find them easier to work with. Each dictionary contains the key content in the text field. It also contains useful metadata such as page number, filename, file size, and type. Back to top Sentence-level, token-wise chunking The first thing to do is reduce our documents to chunks of a standard length (to ensure consistency and manageability). Embedding models have unique token limits (maximum input size they can process). Tokens are the basic units of text that models process. To prevent information loss (truncation or omission of content), we should provide text that does not exceed those limits (by splitting longer texts into smaller segments). Chunking has a significant impact on performance. Ideally, each chunk would represent a self-contained piece of information, capturing contextual information about a single topic. 
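Picking up the ingestion step described above before getting into chunking, here is a minimal sketch using LlamaIndex's SimpleDirectoryReader; the folder name follows the article's documents directory, and the dictionary conversion is an illustrative simplification:

```python
from llama_index.core import SimpleDirectoryReader

# Load every file in the documents folder; PDFs come back as a list of Document objects.
documents = SimpleDirectoryReader("./documents").load_data()

# Convert to plain dictionaries, keeping the text and useful metadata (page, file name, ...).
docs = [{"content": doc.text, "metadata": doc.metadata} for doc in documents]
print(len(docs), docs[0]["metadata"])
```

With the raw documents loaded, the next question is how to split them into chunks.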
Chunking methods include word-level chunking, where documents are split by word count, and semantic chunking which uses an LLM to identify logical breakpoints. Word-level chunking is cheap, fast, and easy, but runs a risk of splitting sentences and thus breaking context. Semantic chunking gets slow and expensive, especially if you're dealing with documents like the 116-page Elastic Annual Report. Let's choose a middleground approach. Sentence level chunking is still simple, but can preserve context more effectively than word-level chunking while being significantly cheaper and faster. Additionally, we'll implement a sliding window to capture some of the surrounding context, and alleviate the impact of splitting paragraphs. The Chunker class takes in the embedding model's tokenizer to encode and decode text. We'll now build chunks of 512 tokens each, with an overlap of 20 tokens. To do this, we'll split the text into sentences, tokenize those sentences, and then add the tokenized sentences to our current chunk until we cannot add more without breaching our token limit. Finally, decode the sentences back to the original text for embedding, storing it in a field called original_text . Chunks are stored in a field called chunk . To reduce noise (aka useless documents), we will discard any documents smaller than 50 tokens in length. Let's run it over our documents: And get back chunks of text that look like this: Back to top Metadata inclusion and generation We've chunked our documents. Now it's time to enrich the data. I want to generate or extract additional metadata. This additional metadata can be used to influence and enhance search performance. We'll define a DocumentEnricher class, whose role is to take in a list of documents (Python dictionaries), and a list of processor functions. These functions will run over the documents' original_text column, and store their outputs in new fields. First, we extract keyphrases using TextRank . TextRank is a graph-based algorithm that extracts key phrases and sentences from text by ranking their importance based on the relationships between words. Next, we'll generate potential_questions using GPT-4o . Finally, we'll extract entities using Spacy . Since the code for each of these is quite lengthy and involved, I will refrain from reproducing it here. If you are interested, the files are marked in the code samples below. Let's run the data enrichment: And take a look at the results: Keyphrases extracted by TextRank These keyphrases are a stand-in for the chunk's core topics. If a query has to do with cybersecurity, this chunk's score will be boosted. Potential questions generated by GPT-4o These potential questions may directly match with user queries, offering a boost in score. We prompt GPT-4o to generate questions which can be answered using the information found in the current chunk. Entities extracted by Spacy These entities serve a similar purpose to the keyphrases, but capture organizations' and individuals' names, which keyphrase extraction may miss. Back to top Composite multi-field embeddings Now that we have enriched our documents with additional metadata, we can leverage this information to create more robust and context-aware embeddings. Let's review our current point in the process. We've got four fields of interest in each document. Each field represents a different perspective on the document's context, potentially highlighting a key area for the LLM to focus on. 
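As a rough sketch (not the repo's actual Chunker class) of the sentence-level, token-wise chunking described earlier in this section, the function below packs tokenized sentences into chunks of at most 512 tokens with a 20-token sliding overlap and drops chunks shorter than 50 tokens; it assumes an NLTK sentence splitter and a Hugging Face tokenizer passed in by the caller:

```python
import nltk

nltk.download("punkt", quiet=True)  # newer NLTK versions may need "punkt_tab" instead

def chunk_text(text, tokenizer, max_tokens=512, overlap_tokens=20, min_tokens=50):
    chunks, current = [], []
    for sentence in nltk.sent_tokenize(text):
        tokens = tokenizer.encode(sentence, add_special_tokens=False)
        # Close the current chunk if this sentence would push it past the limit,
        # keeping the last few tokens as a sliding-window overlap.
        if current and len(current) + len(tokens) > max_tokens:
            chunks.append(current)
            current = current[-overlap_tokens:]
        current.extend(tokens)
    if current:
        chunks.append(current)
    # Decode back to text and discard chunks too small to carry useful context.
    return [tokenizer.decode(chunk) for chunk in chunks if len(chunk) >= min_tokens]
```

Chunks produced this way are what get enriched with keyphrases, questions, and entities, yielding the four fields discussed above.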
Metadata Enrichment Pipeline The plan is to embed each of these fields, and then create a weighted sum of the embeddings, known as a Composite Embedding. With luck, this Composite Embedding will allow the system to become more context aware, in addition to introducing another tunable hyperparameter from controlling the search behavior. First, let's embed each field and update each document in place, using our locally defined embedding model imported at the beginning of the main.ipynb notebook. Each embedding function returns the embedding's field, which is just the original input field with an _embedding postfix. Let's now define the weightings of our composite embedding: The weightings allow you to assign priorities to each component, based on your usecase and the quality of your data. Intuitively, the size of these weightings is dependent on the semantic value of each component. Since the chunk text itself is by far the richest, I assign a weighting of 70%. Since the entities are the smallest, being just a list of org or person names, I assign it a weighting of 5%. The precise setting for these values has to be determined empirically, on a use-case by use-case basis. Finally, let's write a function to apply the weightings, and create our composite embedding. We'll delete all the component embeddings as well to save space. With this, we've completed our document processing. We now have a list of document objects which look like this: Indexing to Elastic Let's bulk upload our documents to Elastic Search. For this purpose, I long-ago defined a set of Elastic Helper functions in elastic_helpers.py . It is a very lengthy piece of code so let's sticking to looking at the function calls. es_bulk_indexer.bulk_upload_documents works with any list of dictionary objects, taking advantage of Elasticsearch's convenient dynamic mappings. Head on over to Kibana and verify that all documents have been indexed. There should be 224 of them. Not bad for such a large document! Indexed Annual Report Documents in Kibana Back to top Cat break Let's take a break, article's a little heavy, I know. Check out my cat: look at how furious she is Adorable. The hat went missing and I half suspect she stole and hid it somewhere :( Congrats on making it this far :) Join me in Part 2 for testing and evaluation of our RAG pipeline! Appendix Definitions 1. Sentence Chunking A preprocessing technique used in RAG systems to divide text into smaller, meaningful units. Process: Input: Large block of text (e.g., document, paragraph) Output: Smaller text segments (typically sentences or small groups of sentences) Purpose: Creates granular, context-specific text segments Allows for more precise indexing and retrieval Improves the relevance of retrieved information in RAG systems Characteristics: Segments are semantically meaningful Can be independently indexed and retrieved Often preserves some context to ensure standalone comprehensibility Benefits: Enhances retrieval precision Enables more focused augmentation in RAG pipelines 2. HyDE (Hypothetical Document Embedding) A technique that uses an LLM to generate a hypothetical document for query expansion in RAG systems. 
Process: Input query to an LLM LLM generates a hypothetical document answering the query Embed the generated document Use the embedding for vector search Key difference: Traditional RAG: Matches query to documents HyDE: Matches documents to documents Purpose: Improve retrieval performance, especially for complex or ambiguous queries Capture richer semantic context than a short query Benefits: Leverages LLM's knowledge to expand queries Can potentially improve relevance of retrieved documents Challenges: Requires additional LLM inference, increasing latency and cost Performance depends on quality of generated hypothetical document 3. Reverse Packing A technique used in RAG systems to reorder search results before passing them to the LLM. Process: Search engine (e.g., Elasticsearch) returns documents in descending order of relevance. The order is reversed, placing the most relevant document last. Purpose: Exploits the recency bias of LLMs, which tend to focus more on the latest information in their context. Ensures the most relevant information is \"freshest\" in the LLM's context window. Example: Original order: [Most Relevant, Second Most, Third Most, ...] Reversed order: [..., Third Most, Second Most, Most Relevant] 4. Query Classification A technique to optimize RAG system efficiency by determining whether a query requires RAG or can be answered directly by the LLM. Process: Develop a custom dataset specific to the LLM in use Train a specialized classification model Use the model to categorize incoming queries Purpose: Improve system efficiency by avoiding unnecessary RAG processing Direct queries to the most appropriate response mechanism Requirements: LLM-specific dataset and model Ongoing refinement to maintain accuracy Benefits: Reduces computational overhead for simple queries Potentially improves response time for non-RAG queries 5. Summarization A technique to condense retrieved documents in RAG systems. Process: Retrieve relevant documents Generate concise summaries of each document Use summaries instead of full documents in the RAG pipeline Purpose: Improve RAG performance by focusing on essential information Reduce noise and interference from less relevant content Benefits: Potentially improves relevance of LLM responses Allows for inclusion of more documents within context limits Challenges: Risk of losing important details in summarization Additional computational overhead for summary generation 6. Metadata Inclusion A technique to enrich documents with additional contextual information. Types of metadata: Keyphrases Titles Dates Authorship details Blurbs Purpose: Increase contextual information available to the RAG system Provide LLMs with clearer understanding of document content and relevance Benefits: Potentially improves retrieval accuracy Enhances LLM's ability to assess document usefulness Implementation: Can be done during document preprocessing May require additional data extraction or generation steps 7. Composite Multi-Field Embeddings An advanced embedding technique for RAG systems that creates separate embeddings for different document components. 
Process: Identify relevant fields (e.g., title, keyphrases, blurb, main content) Generate separate embeddings for each field Combine or store these embeddings for use in retrieval Difference from standard approach: Traditional: Single embedding for entire document Composite: Multiple embeddings for different document aspects Purpose: Create more nuanced and context-aware document representations Capture information from a wider variety of sources within a document Benefits: Potentially improves performance on ambiguous or multi-faceted queries Allows for more flexible weighting of different document aspects in retrieval Challenges: Increased complexity in embedding storage and retrieval processes May require more sophisticated matching algorithms 8. Query Enrichment A technique to expand the original query with related terms to improve search coverage. Process: Analyze the original query Generate synonyms and semantically related phrases Augment the query with these additional terms Purpose: Increase the range of potential matches in the document corpus Improve retrieval performance for queries with specific or technical language Benefits: Potentially retrieves relevant documents that don't exactly match the original query terms Can help overcome vocabulary mismatch between queries and documents Challenges: Risk of query drift if not carefully implemented May increase computational overhead in the retrieval process Back to top Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to RAG overview Table of contents Set-up Ingesting, processing, and embedding documents Data ingestion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Advanced RAG techniques part 1: Data processing - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/advanced-rag-techniques-part-1","meta_description":"This blog explores and implements advanced RAG techniques which may increase performance, focusing on data processing & ingestion of an advanced RAG pipeline."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Advanced RAG techniques part 2: Querying and testing Discussing and implementing techniques which may increase RAG performance. Part 2 of 2, focusing on querying and testing an advanced RAG pipeline. Vector Database Generative AI HC By: Han Xiang Choong On August 15, 2024 Part of Series Advanced RAG techniques Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. All code may be found in the Searchlabs repo, in the advanced-rag-techniques branch . Welcome to Part 2 of our article on Advanced RAG Techniques! In part 1 of this series , we set-up, discussed, and implemented the data processing components of the advanced RAG pipeline: The RAG pipeline used by the author. In this part, we're going to proceed with querying and testing out our implementation. Let's get right to it! Table of contents Searching and retrieving, generating answers Enriching queries with synonyms HyDE (Hypothetical Document Embedding) Hybrid search Experiments Summary of results Test 1: Who audits Elastic? AdvancedRAG SimpleRAG Test 2: total revenue 2023 AdvancedRAG SimpleRAG Test 3: What product does growth primarily depend on? How much? AdvancedRAG SimpleRAG Test 4: Describe employee benefit plan AdvancedRAG SimpleRAG Test 5: Which companies did Elastic acquire? AdvancedRAG SimpleRAG Conclusion Appendix Prompts RAG question answering prompt Elastic query generator prompt Potential questions generator prompt HyDE generator prompt Sample hybrid search query Searching and retrieving, generating answers Let's ask our first query, ideally some piece of information found primarily in the annual report. How about: Now, let's apply a few of our techniques to enhance the query. Enriching queries with synonyms Firstly, let's enhance the diversity of the query wording, and turn it into a form that can be easily processed into an Elasticsearch query. We'll enlist the aid of GPT-4o to convert the query into a list of OR clauses. Let's write this prompt: When applied to our query, GPT-4o generates synonyms of the base query and related vocabulary. In the ESQueryMaker class, I've defined a function to split the query: Its role is to take this string of OR clauses and split them into a list of terms, allowing us do a multi-match on our key document fields: Finally ending up with this query: This covers many more bases than the original query, hopefully reducing the risk of missing a search result because we forgot a synonym. But we can do more. Back to top HyDE (Hypothetical Document Embedding) Let's enlist GPT-4o again, this time to implement HyDE . 
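Before we do, here is a minimal sketch of the synonym-expansion step just described, turning the LLM's OR-clause string into a multi_match query. The helper and the field list are assumptions for illustration, not the exact code from the repository:

```python
# Sketch of converting GPT-4o's "term1 OR term2 OR ..." output into an
# Elasticsearch multi_match query over the enriched document fields.
# The function names and field list are illustrative assumptions.
def split_or_clauses(or_string: str) -> list[str]:
    """Split 'auditor OR audit firm OR accounting firm' into individual terms."""
    return [term.strip() for term in or_string.split("OR") if term.strip()]


def build_lexical_query(or_string: str) -> dict:
    terms = split_or_clauses(or_string)
    return {
        "multi_match": {
            "query": " ".join(terms),
            "fields": ["original_text", "keyphrases", "potential_questions", "entities"],
            "type": "most_fields",  # reward documents that match across several fields
        }
    }


print(build_lexical_query("auditor OR audit firm OR accounting firm"))
```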
The basic premise of HyDE is to generate a hypothetical document: the kind of document that would likely contain the answer to the original query. The factuality or accuracy of the document is not a concern. With that in mind, let's write the following prompt: Since vector search typically operates on cosine vector similarity, the premise of HyDE is that we can achieve better results by matching documents to documents instead of queries to documents. What we care about is structure, flow, and terminology. Not so much factuality. GPT-4o outputs a HyDE document like this: It looks pretty believable, like the ideal candidate for the kinds of documents we'd like to index. We're going to embed this and use it for hybrid search. Back to top Hybrid search This is the core of our search logic. Our lexical search component will be the generated OR clause strings. Our dense vector component will be the embedded HyDE document (aka the search vector). We use KNN to efficiently identify several candidate documents closest to our search vector. We then call our lexical search component, which scores with TF-IDF and BM25 by default. Finally, the lexical and dense vector scores will be combined using the 30/70 ratio recommended by Wang et al. With that, we can piece together a RAG function. Our RAG, from query to answer, will follow this flow: Convert Query to OR Clauses. Generate HyDE document and embed it. Pass both as inputs to Hybrid Search. Retrieve top-n results, reverse them so that the most relevant result is the \"most recent\" in the LLM's contextual memory (Reverse Packing). Reverse Packing Example: Query: \"Elasticsearch query optimization techniques\" Retrieved documents (ordered by relevance): 1. \"Use bool queries to combine multiple search criteria efficiently.\" 2. \"Implement caching strategies to improve query response times.\" 3. \"Optimize index mappings for faster search performance.\" Reversed order for LLM context: \"Optimize index mappings for faster search performance.\" \"Implement caching strategies to improve query response times.\" \"Use bool queries to combine multiple search criteria efficiently.\" By reversing the order, the most relevant information (1) appears last in the context, potentially receiving more attention from the LLM during answer generation. Pass the context to the LLM for generation. Let's run our query and get back our answer: Nice. That's correct. Back to top Experiments There's an important question to answer now. What did we get out of investing so much effort and additional complexity into these implementations? Let's do a little comparison: the RAG pipeline we've implemented versus baseline hybrid search, without any of the enhancements we've made. We'll run a small series of tests and see if we notice any substantial differences. We'll refer to the RAG we have just implemented as AdvancedRAG, and the basic pipeline as SimpleRAG. Simple RAG Pipeline without bells and whistles Summary of results This table summarizes the results of five tests of both RAG pipelines. I judged the relative superiority of each method based on answer detail and quality, but this is a totally subjective judgement. The actual answers are reproduced below this table for your consideration. With that said, let's take a look at how they did! SimpleRAG was unable to answer questions 1 & 5. AdvancedRAG also went into far greater detail on questions 2, 3, and 4. Based on the increased detail, I judged the quality of AdvancedRAG's answers better. 
Test Question AdvancedRAG Performance SimpleRAG Performance AdvancedRAG Latency SimpleRAG Latency Winner 1 Who audits Elastic? Correctly identified PwC as the auditor. Failed to identify the auditor. 11.6s 4.4s AdvancedRAG 2 What was the total revenue in 2023? Provided the correct revenue figure. Included additional context with revenue from previous years. Provided the correct revenue figure. 13.3s 2.8s AdvancedRAG 3 What product does growth primarily depend on? How much? Correctly identified Elastic Cloud as the key driver. Included overall revenue context & greater detail. Correctly identified Elastic Cloud as the key driver. 14.1s 12.8s AdvancedRAG 4 Describe employee benefit plan Gave a comprehensive description of retirement plans, health programs, and other benefits. Included specific contribution amounts for different years. Provided a good overview of benefits, including compensation, retirement plans, work environment, and the Elastic Cares program. 26.6s 11.6s AdvancedRAG 5 Which companies did Elastic acquire? Correctly listed recent acquisitions mentioned in the report (CmdWatch, Build Security, Optimyze). Provided some acquisition dates and purchase prices. Failed to retrieve relevant information from the provided context. 11.9s 2.7s AdvancedRAG Test 1: Who audits Elastic? AdvancedRAG SimpleRAG Summary : SimpleRAG did not identify PWC as the auditor Okay that's actually quite surprising. That looks like a search failure on SimpleRAG's part. No documents related to auditing were retrieved. Let's dial down the difficulty a little with the next test. Test 2: total revenue 2023 AdvancedRAG SimpleRAG Summary : Both RAGs got the right answer: $1,068,989,000 total revenue in 2023 Both of them were right here. It does seem like AdvancedRAG may have acquired a broader range of documents? Certainly the answer is more detailed and incorporates information from previous years. That is to be expected given the enhancements we made, but it's far too early to call. Let's raise the difficulty. Test 3: What product does growth primarily depend on? How much? AdvancedRAG SimpleRAG Summary : Both RAGs correctly identified Elastic Cloud as the key growth driver. However, AdvancedRAG includes more detail, factoring in subscription revenues and customer growth, and explicitly mentions other Elastic offerings. Test 4: Describe employee benefit plan AdvancedRAG SimpleRAG Summary : AdvancedRAG goes into much greater depth and detail, mentioning the 401K plan for US-based employees, as well as defining contribution plans outside of the US. It also mentions Health and Well-Being plans but misses the Elastic Cares program, which SimpleRAG mentions. Test 5: Which companies did Elastic acquire? AdvancedRAG SimpleRAG Summary : SimpleRAG does not retrieve any relevant info about acquisitions, leading to a failed answer. AdvancedRAG correctly lists CmdWatch, Build Security, and Optimyze, which were the key acquisitions listed in the report. Back to top Conclusion Based on our tests, our advanced techniques appear to increase the range and depth of the information presented, potentially enhancing quality of RAG answers. Additionally, there may be improvements in reliability, as ambiguously worded questions such as Which companies did Elastic acquire? and Who audits Elastic were correctly answered by AdvancedRAG but not by SimpleRAG. 
However, it is worth keeping in perspective that in 3 out of 5 cases, the basic RAG pipeline, incorporating Hybrid Search but no other techniques, managed to produce answers that captured most of the key information. We should note that due to the incorporation of LLMs at the data preparation and query phases, the latency of AdvancedRAG is generally 2-5x that of SimpleRAG. This is a significant cost which may make AdvancedRAG suitable only for situations where answer quality is prioritized over latency. The significant latency costs can be alleviated by using a smaller and cheaper LLM like Claude Haiku or GPT-4o-mini at the data preparation stage, saving the advanced models for answer generation. This aligns with the findings of Wang et al. As their results show, any improvements made are relatively incremental. In short, simple baseline RAG gets you most of the way to a decent end-product, while being cheaper and faster to boot. For me, it's an interesting conclusion. For use cases where speed and efficiency are key, SimpleRAG is the sensible choice. For use cases where every last drop of performance needs squeezing out, the techniques incorporated into AdvancedRAG may offer a way forward. Results of the study by Wang et al reveal that the use of advanced techniques creates consistent but incremental improvements. Back to top Appendix Prompts RAG question answering prompt Prompt for getting the LLM to generate answers based on query and context. Elastic query generator prompt Prompt for enriching queries with synonyms and converting them into the OR format. Potential questions generator prompt Prompt for generating potential questions, enriching document metadata. HyDE generator prompt Prompt for generating hypothetical documents using HyDE. Sample hybrid search query Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Table of contents Searching and retrieving, generating answers Enriching queries with synonyms HyDE (Hypothetical Document Embedding) Hybrid search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Advanced RAG techniques part 2: Querying and testing - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/advanced-rag-techniques-part-2","meta_description":"This blog discusses and implements RAG techniques which may increase performance, focusing on querying and testing an advanced RAG pipeline."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search Learn about the Elastic Learned Sparse Encoder (ELSER), an AI model for high relevance semantic search across domains. ML Research AP GG By: Aris Papadopoulos and Gilad Gal On June 21, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Searching for meaning, not just words With 8.8, Elastic offers semantic search out of the box. Semantic search is designed to search with the intent or meaning of the text as opposed to a lexical match or keyword query. It is a qualitative leap compared to traditional lexical term-based search, offering breakthrough relevance. It captures relationships between words on the conceptual level, understanding the context and surfacing relevant results based on meanings, instead of simply query terms. Aiming to eliminate the barrier to AI-powered search, in 8.8 we are introducing a new semantic search model in technical preview, trained and optimized by Elastic. Use it to instantly leverage superior semantic relevance with vector and hybrid search, natively in Elastic. Introducing Elastic Learned Sparse Encoder, a new text expansion model for semantic search Elastic has been investing in vector search and AI for three years and released support for approximate nearest neighbor search in 8.0 (with HNSW in Lucene). Recognizing that the landscape of tools to implement semantic search is rapidly evolving, we have offered third-party model deployment and management, both programmatically and through the UI. With the combined capabilities, you can onboard your vector models (embeddings) and perform vector search through the familiar search APIs, which we enhanced with vector capabilities. The results from using vector search have been astonishing. But to achieve them, organizations need significant expertise and effort that go well beyond typical software productization. This includes annotating a sufficient number of queries within the domain in which search will be performed (typically in the order of tens of thousands), in-domain re-training the machine learning (so called “embedding”) model to achieve domain adaptation, and maintaining the models against drift. At the same time, you may not want to rely on third-party models due to privacy, support, competitiveness, or licensing concerns. 
As a result, AI-powered search is still outside the reach of the majority of users. With that in mind, in 8.8 we are introducing Elastic Learned Sparse Encoder — in technical preview. You can start using this new retrieval model with a click of a button from within the Elastic UI for a wide array of use cases, and you need exactly zero machine learning expertise or deployment effort. Superior semantic search out of the box Elastic’s Learned Sparse Encoder uses text-expansion to breathe meaning into simple search queries and supercharge relevance. It captures the semantic relationships between words in the English language and based on them, it expands search queries to include relevant terms that are not present in the query. This is more powerful than adding synonyms with lexical scoring (BM25) because it uses this deeper language-scale knowledge to optimize for relevance. And not only that, but context is also factored in, helping to eliminate ambiguity from words that may have different interpretations in different sentences. As a result, this model helps mitigate the vocabulary mismatch problem : Even if the query terms are not present in the documents, Elastic Learned Sparse Encoder will return relevant documents if they exist. Based on our comparison, this novel retrieval model outperforms lexical search in 11 out of 12 prominent relevance benchmarks , and the combination of both using hybrid search in all 12 relevance benchmarks. If you’ve already spent the effort to fine-tune lexical search in your domain, you can get an additional boost from hybrid scoring! Why choose Elastic’s Learned Sparse Encoder? Above all, you can use this new model out of the box, without domain adaptation — we’ll explain that more below; it is a sparse-vector model that performs well out-of-domain or zero-shot. Let’s break down how these terms directly translate to value for your search application. Our model is trained and architected in such a way that you do not need to fine tune it on your data. As an out-of-domain model, it outperforms dense vector models when no domain-specific retraining is applied. In other words, just click “deploy” on the UI and start using state-of-the-art semantic search with your data. Our model outperforms SPLADE (Sparse Lexical and Expansion Model), the previous out-of-domain, sparse-vector, text-expansion champion, as measured by the same benchmarks. In addition, you don’t have to worry about licensing, support, continuity of competitiveness, and extensibility beyond your Elastic license tier. For example, SPLADE is licensed for non-commercial use only. Our model is available on our Platinum subscription tier. As sparse-vector representation, it uses the Elasticsearch, Lucene-based inverted index. This means decades of optimizations are leveraged to provide optimal performance. As a result, Elastic offers one of the most powerful and effortless hybrid search solutions in the market. For the same reason, it is both more efficient and more interpretable. Fewer dimensions are activated than in dense representations, and they often directly map to words, in contrast with the opaqueness of dense representations. In a vocabulary mismatch scenario, this will clearly show you which words non-existing in the query triggered the results. Let’s speak to performance and Elasticsearch as a vector database Keeping vectors of tens of thousands of dimensions and performing vector similarity on them may sound like a scale and latency stretch. 
However, sparse vectors compress wonderfully well, and the Elasticsearch (and Lucene) inverted index is a strong technical approach to this use case. In addition, for Elastic, vector similarity is a less computationally intensive operation, due to some clever inverted index tricks that Elasticsearch hides up its sleeve. Overall, both the query performance and index size when using our sparse retrieval model are surprisingly good and require fewer resources compared to the typical dense vector index. That said, vector search, sparse or dense, has an inherently larger memory footprint and time complexity compared to lexical search universally, regardless of the platform. Elastic, as a vector database, is optimized and provides all gains possible on all levels (data structures and algorithmic). Although learned sparse retrieval might require more resources compared to lexical search, based on your application and data, the enhanced capabilities it offers could well be worth the investment. The future: The most powerful hybrid search in the market out of the box In this first tech preview release, we are limiting the length of the input to 512 tokens, which is approximately the first 300–400 words in each field going through an inference pipeline. This is sufficient for many use cases already, and we are working on methods for handling longer documents in a future version. For a successful early evaluation, we suggest using documents where most information is stored in the first 300–400 words. As we evaluated different models for relevance, it became clear that the best results are obtained from an ensemble of different ranking methods. You can combine vector search — with or without the new retrieval model — with Elastic’s lexical search through our streamlined search APIs. Linearly combining normalized scores from each method can provide excellent results. However, we want to push boundaries and offer the most powerful hybrid search out of the box, by eliminating any search science effort toward fine tuning based on the distribution of scores, data, queries, etc. To this aim, we are releasing Reciprocal Rank Fusion (RRF) in 8.8 for use initially with third-party models in Elastic and we are working toward integrating our sparse retrieval model and lexical search through RRF in the subsequent releases. This way, you will be able to leverage Elastic's innovative hybrid search architecture, combining semantic, lexical, and multimedia, through the Elastic search APIs that you are familiar with and trust through years of maturity. Finally, in working toward a GA production-ready version, we are exploring strategies for handling long documents and overall optimizations to further boost performance. Get started with Elastic’s AI-powered search today To try Elastic Learned Sparse Encoder, head to Machine Learning at the trained models view or Enterprise Search to start using semantic search with your data, in a simple click of a button. If you don't have access to Elastic yet, you can request access to the premium trial needed here . To learn more about our investments and trajectory in the vector search and AI space, watch this ElasticON Global spotlight talk by Matt Riley, general manager of Enterprise Search. For a deeper understanding of the new model’s architecture and training, read the blog by the creator machine learning scientists. To learn how you can use the model for semantic and hybrid search, head to our API and requirements documentation. 
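To make that concrete, here is a rough sketch of what a semantic query against an ELSER-enriched index can look like with the Python client. The index name and token field are placeholders, and the model ID shown is the one used by the technical preview; adjust all three to match your own deployment and ingest pipeline:

```python
# Rough sketch of querying an index whose documents were enriched with ELSER
# token weights through an inference pipeline. "my-index" and "ml.tokens" are
# placeholders; the model ID may differ depending on your deployment.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # or your Elastic Cloud endpoint

response = es.search(
    index="my-index",
    query={
        "text_expansion": {
            "ml.tokens": {
                "model_id": ".elser_model_1",
                "model_text": "How do I cancel an order that has already shipped?",
            }
        }
    },
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```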
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Searching for meaning, not just words Introducing Elastic Learned Sparse Encoder, a new text expansion model for semantic search Superior semantic search out of the box Why choose Elastic’s Learned Sparse Encoder? Let’s speak to performance and Elasticsearch as a vector database Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/introducing-elastic-learned-sparse-encoder-elser","meta_description":"Learn about the Elastic Learned Sparse Encoder (ELSER), an AI model for high relevance semantic search across domains."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing Learning To Rank (LTR) in Elasticsearch Discover how Learning To Rank (LTR) can help you to improve your search ranking and how to implement it in Elasticsearch. Search Relevance How To AF By: Aurélien Foucret On July 15, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. 
Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Starting with Elasticsearch 8.13, we provide an implementation of Learning To Rank (LTR) natively integrated into Elasticsearch. LTR uses a trained machine learning (ML) model to build a ranking function for your search engine. Typically, the model is used as a second stage re-ranker, to improve the relevance of search results returned by a simpler, first stage retrieval algorithm. This blog post will explain how this new feature can help in improving your document ranking in text search and how to implement it in Elasticsearch. Whether you are trying to optimize an eCommerce search, build the best context for a Retrieval Augmented Generation(RAG) application or craft a question answering based search on millions of academic papers, you have probably realized how challenging it can be to accurately optimize document ranking in a search engine. That's where Learning to Rank comes in. Understanding relevance features and how to build a scoring function Relevance features are the signals to determine how well a document matches a user's query or interest, all of which impact search relevance . These features can vary significantly depending on the context, but they generally fall into several categories. Let’s take a look at some common relevance features used across different domains: Text Relevance Scores (e.g., BM25 , TF-IDF): Scores derived from text matching algorithms that measure the similarity of document content to the search query. These scores can be obtained from Elasticsearch. Document Properties (e.g., price of a product, publication date): Features that can be extracted directly from the stored document. Popularity Metrics (e.g., click-through rate, views): Indicators of how popular or frequently accessed a document is. Popularity metrics can be obtained with Search analytics tools, of which Elasticsearch provides out-of-the-box. The scoring function combines these features to produce a final relevance score for each document. Documents with higher scores are ranked higher in search results. When using the Elasticsearch Query DSL, you are implicitly writing a scoring function that weights relevance features and ultimately defines your search relevance Scoring in the Elasticsearch Query DSL Consider the following example query: This query translates into the following scoring function: While this approach works well, it has a few limitations: Weights are estimated : The weights assigned to each feature are often based on heuristics or intuition. These guesses may not accurately reflect the true importance of each feature in determining relevance. Uniform Weights Across Documents : Manually assigned weights apply uniformly to all documents, ignoring potential interactions between features and how their importance might vary across different queries or document types. For instance, the relevance of recency might be more significant for news articles but less so for academic papers. As the number of features and documents increases, these limitations become more pronounced, making it increasingly challenging to determine accurate weights. Ultimately, the chosen weights become a compromise, potentially leading to suboptimal ranking in many scenarios. A compelling alternative is to replace the scoring function that uses manual weights by a ML-based model that computes the score using relevance features. Hello Learning To Rank (LTR)! 
LambdaMART is a popular and effective LTR technique that uses gradient boosting decision trees (GBDT ) to learn the optimal scoring function from a judgment list. The judgment list is a dataset that contains pairs of queries and documents, along with their corresponding relevance labels or grades. Relevance labels are typically either binary, (e.g. relevant/irrelevant) or graded (e.g between 0 for completely irrelevant and 4 for highly relevant). Judgment lists can be created manually by humans or be generated from user engagement data, such as clicks or conversions. The example below uses a graded relevance judgment. LambdaMART treats the ranking problem as a regression task using a decision tree where the inner nodes of the tree are conditions over the relevance features, and the leaves are the predicted scores. LambdaMART uses a gradient boosted tree approach, and in the training process it builds multiple decision trees where each tree corrects errors of its predecessors. This process aims to optimize a ranking metric like NDCG, based on examples from the judgment list. The final model is a weighted sum of individual trees. XGBoost is a well known library that provides an implementation of LambdaMART, making it a popular choice to implement ranking based on gradient boosting decision trees. Getting started with LTR in Elasticsearch Starting with version 8.13, Learning To Rank is integrated directly into Elasticsearch and associated tooling as a technical preview feature. Train and deploy an LTR model to Elasticsearch Eland is our Python client and toolkit for DataFrames and machine learning in Elasticsearch. Eland is compatible with most of the standard Python data science tools like Pandas, scikit-learn and XGBoost. We highly recommend using it to train and deploy your LTR XGBoost model, as it provides features to simplify this process: The first step of the training process is to define the relevant features of the LTR model. Using the Python code below, you can specify the relevant features using the Elasticsearch Query DSL. The second step of the process is to build your training dataset. At this step you will compute and add relevance features for each rows of your judgment list: To help you with this task, Eland provides the FeatureLogger class: When the training dataset is built, the model is trained very easily (as also shown in the notebook ): Deploy your model to Elasticsearch once the training process is complete: To learn more about how our tooling can help you to train and deploy the model, check out this end-to-end notebook . Use your LTR model as a rescorer in Elasticsearch Once you deploy your model in Elasticsearch, you can enhance your search results through a rescorer . The rescorer allows you to refine a first-pass ranking of search results using the more sophisticated scoring provided by your LTR model: In this example: First-pass query: The multi_match query retrieves documents that match the query the quick brown fox in the title and content fields. This query is designed to be fast and capture a large set of potentially relevant documents. Rescore phase: The learning_to_rank rescorer refines the top results from the first-pass query using the LTR model. model_id : Specifies the ID of the deployed LTR model ( ltr-model-xgboost in our example). params : Provides any parameters required by the LTR model to extract features relevant to the query. Here query_text allows you to specify the query issued by the user that some of our features extractors expect. 
window_size : Defines the number of top documents from the search results issued by the first-pass query to be rescored. In this example, the top 100 documents will be rescored. By integrating LTR as a two stage retrieval process, you can can optimize both performance and accuracy of your retrieval process by combining: Speed of Traditional Search: The first-pass query retrieves a large number of documents with a broad match very quickly, ensuring fast response times. Precision of Machine Learning Models: The LTR model is applied only to the top results, refining their ranking to ensure optimal relevance. This targeted application of the model enhances precision without compromising overall performance. Try LTR yourself!? Whether you are struggling to configure search relevance for an eCommerce platform, aiming to improve the context relevance of your RAG application, or you are simply curious about enhancing your existing search engine's performance, you should consider LTR seriously. To start your journey with implementing LTR, make sure to visit our notebook detailing how to train, deploy, and use an LTR model in Elasticsearch and to read our documentation . Let us know if you built anything based on this blog post or if you have questions on our Discuss forums and the community Slack channel . Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to Understanding relevance features and how to build a scoring function Scoring in the Elasticsearch Query DSL Hello Learning To Rank (LTR)! Getting started with LTR in Elasticsearch Train and deploy an LTR model to Elasticsearch Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing Learning To Rank (LTR) in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-learning-to-rank-introduction","meta_description":"Discover how Learning To Rank (LTR) can help you to improve your search ranking and how to implement it in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Personalized search with learning-to-rank (LTR) Learn how to train ranking models that improve search relevance for individual users and personalize search through learning-to-rank (LTR) in Elasticsearch. Search Relevance MJ By: Max Jakob On August 30, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Today, users have come to expect search results that are tailored to their individual interests. If all the songs we listen to are rock songs, we would expect an Aerosmith song at the top of the results when searching for Crazy , not the one by Gnarls Barkley. In this article, we take a look at ways to personalize search before diving into the specifics of how to do this with learning-to-rank (LTR), using music preferences as an example. Ranking factors First, let's recap which factors are important in search ranking in general. Given a user query, a relevance function can take into account one or multiple of the following factors: Text similarity can be measured with a variety of methods including BM25, dense vector similarity, sparse vector similarity or through cross-encoder models. We can calculate similarity scores of the query string against multiple fields in a document (title, description, tags, etc.) to determine how well the input query matches a document. Query properties can be inferred from the query itself, for example the language, named entities or the user intent. The domain will influence which of these properties can be most helpful to improve relevance.. Document properties pertain to the document itself, for example its popularity or the price of the product represented by the document. These properties often have a big impact on the relevance when applied with the right weights. User and context properties refer to data that is not associated with the query or the document but with the context of the search request, for example the location of the user, past search behavior or user preferences. These are the signals that will help us personalize our search. Personalized results When looking at the last category of factors, user and context properties, we can distinguish between three types of systems: \"General\" search does not take into account any user properties. Only query input and document properties determine the relevance of search results. Two users that enter the same query see the same results. When you start Elasticsearch you have such a system out-of-the-box. Personalized search adds user properties to the mix. The input query is still important but it is now supplemented by user and/or context properties. In this setting users can get different results for the same query and hopefully the results are more relevant for individuals. 
Recommendations goes a step further and focuses exclusively on document, user and context properties. There is no actively supplied user query to these systems. Many platforms recommend content on the home page that is tailored to the user’s account, for example based on the shopping history or previously watched movies. If we look at personalization as a spectrum, personalized search sits in the middle. Both user input and user preferences are part of the relevance equation. This also means that personalization in search should be applied carefully. If we put too much weight on past user behavior and too little on the present search intent, we risk frustrating users with their favorite documents when they were specifically searching for something else. Maybe you too had the experience of watching that one folk dance video that your friend posted and subsequently found more of these when searching for dance music. The lesson here is that it's important to ensure sufficient amounts of historic data for a user in order to confidently skew search results in a certain direction. Also keep in mind that personalization is mainly going to make a difference for ambiguous user input and exploratory queries. Unambiguous navigational queries should already be covered by your general search mechanisms. There are many methods for personalization. There are rule-based heuristics in which developers hand-craft the matching of user properties onto sets of specific documents, for example manually boosting onboarding documents for new users. There are also low tech methods of sampling results from general and personal result lists. Many of the more principled approaches use vector representations trained either on item similarity or with collaborative filtering techniques (e.g. “customers also bought”). You can find many posts around these methods online. In this post we will focus on using learning-to-rank. Personalized search with LTR Learning-to-rank (LTR) is the process of creating statistical models for relevance ranking. You can think of it as automatically tuning the weights of different relevance factors. Instead of manually coming up with a structured query and weights for all text similarity, query properties and document properties, we train a model that finds an optimal trade-off given some data. The data comes in the form of a judgment list. Here we are going to look at behavior-based personalization using LTR, meaning that we will utilize past user behavior to extract user properties that will be used in our LTR training process. It's important to note that, in order to set yourself up for success, you should be already well underway in your LTR journey before you start with personalization: You should already have LTR in place. If you want to introduce LTR into your search, it's best to start by optimizing your general (non-personalized) search first. There might be some low-hanging fruit there and this will give you the chance to build a solid technical base before adding complexity. Dealing with user-dependent data means you need more of it during training and evaluation becomes trickier. We recommend waiting with personalization until your overall LTR setup is in a solid state. You should already be collecting usage data. Without it you would not have enough data for sensical improvements to your relevance: the cold start problem. It's also important that you have high confidence in the correctness of your usage tracking data. 
Incorrectly sent tracking events and erroneous data pipelines can often go undetected because they don’t throw any errors, but the resulting data ends up misrepresenting the actual user behavior. Subsequently basing personalization projects on this data will probably not succeed. You should already be creating your judgment list from usage data. This process is also known as click modeling and it is both a science and an art. Here, instead of manually labeling relevant and irrelevant documents in search results, you use click signals (clicks on search results, add-to-cart, purchases, listening to a whole song, etc.) to estimate the relevance of a document that a user was served as part of past search results. You probably need multiple experiments to get this right. Plus, there are some biases that are being introduced here (most notably position bias ). You should feel confident that your judgment list well represents the relevance for your search. If all these things are a given, then let's go ahead and add personalization. First, we are going to dive into feature engineering. Feature engineering In feature engineering we ask ourselves which concrete user properties can be used in your specific search to make results more relevant? And how can we encode these properties as ranking features? You should be able to imagine exactly how adding, say, the location of the user could improve the result quality. For instance code search is typically a use case that is independent of the user location. Music tastes on the other hand are influenced by local trends. If we know where the searcher is and we know to which geo location we can attribute a document, this can work out. It pays to be thoughtful about which user features and which document feature might work together. If you cannot imagine how this would work in theory, it might not be worth adding a new feature to your model. At any rate, you should always test the effectiveness of new features both offline after training and later in an online A/B test. Some properties can be directly collected from the tracking data, such as the location of the user or the upload location of a document. When it comes to representing user preferences, we have to do some more calculations (as we will see below). Furthermore we have to think about how to encode our properties as features because all features must be numeric. For example, we have to decide whether to represent categorical features as labels represented by integers or as one-hot encoding of multiple binary labels. To illustrate how user features might influence relevance ranking, consider the fictive example boosting tree below that could be part of an XGBoost model for a music search engine. The training process learned the importance of the location feature \"from France\" (on the left-hand side) and weighed them against the other features such as text similarity and document features. Note that these trees are typically much deeper and there are many more of them. We chose a one-hot encoding for the location feature both on the search and on the documents. Be aware that the more features are added, the more nodes in these trees are required to make use of them. Consequently more time and resources will be needed during training in order to reach convergence. Start small, measure improvements and expand step-by-step. Example of personalized search with LTR: music preferences How can we implement this in Elasticsearch? 
Let's again assume we have a search engine for a music website where users can look for songs and listen to them. Each song is categorized into a high-level genre. An example document could look like this: Further assume that we have an established way to extract a judgment list from usage data. Here we use relevance grades from 0 to 3 as an example, which could be computed from no interaction, clicking on a result, listening to the song and giving a thumbs-up rating for the song. Doing this introduces some biases in our data, including position bias (more on this in a future post). The judgment list could look like this: We track the songs that users listen to on our site, so we can build a dataset of music genre preferences for each user. For example, we could look back some time into the past and aggregate all genres that a user has listened to. Here we could experiment with different representations of genre preferences, including latent features, but for simplicity we'll stick to relative frequencies of listens. In this example we want to personalize for individual users but note that we could also base our calculations on segments of users (and use segment IDs). When calculating this, it would be wise to take the amount of activity of users into account. This goes back to the folk dance example above. If a user only interacted with one song, the genre preference would be completely skewed to its genre. To prevent the subsequent personalization putting too much weight on this, we could add the number of interactions as a feature so the model can learn when to put weight on the genre plays. We could also smooth the interactions and add a constant to all frequencies before normalizing so they don’t deviate from a uniform distribution for low counts. Here we assume the latter. The above data needs to be stored in a feature store so that we can look up the user preference values by user ID both during training and at search time. You can use a dedicated Elasticsearch index here, for example: With the user ID as the Elasticsearch document ID we can use the Get API (see below) to retrieve the preference values. This will have to be done in your application code as of Elasticsearch version 8.15. Also note that these separately stored feature values will need to be refreshed by a regularly running job in order to keep the values up-to-date as preferences change over time. Now we are ready to define our feature extraction. Here we one-hot-encode the genres. We plan to also enable representing categories as integers in future releases. Now when applying the feature extraction, we have to first look up the genre preference values and forward them to the feature logger. Depending on performance, it might be good to batch lookup these values. After feature extraction, we have our data ready for training. Please refer to the previous LTR post and the accompanying notebook for how to train and deploy the model (and make sure to not send the IDs as features). Once the model is trained and deployed, you can use it in a rescorer like this. Note that at search time you also need to look up the user preference values beforehand and add the values to the query. Now the users of our music website with different genre preferences can benefit from your personalized search. Both rock and pop lovers will find their favorite version of the song called Crazy at the top of the search results. Conclusion Adding personalization has the potential to improve relevance. 
One way to personalize search is through LTR in Elasticsearch. We have looked at some prerequisites that should be given and went through a hands-on example. However, in the name of a focused post, we left out several important details. How would we evaluate the model? There are offline metrics that can be applied during model development, but ultimately an online A/B test with real users will have to decide if the model improves relevance. How do we know if we are using enough data? Spending more resources at this stage can improve quality but we need to know under which conditions this is worth it. How would we build a good judgment list and deal with the different biases introduced by using behavioral tracking data? And can we forget about our personalized model after deployment or do we require repeated maintenance to address drift? Some of these questions will be answered in future posts on LTR, so stay tuned. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Ranking factors Personalized results Personalized search with LTR Feature engineering Example of personalized search with LTR: music preferences Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Personalized search with learning-to-rank (LTR) - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/personalized-search-elasticsearch-ltr","meta_description":"Learn how to implement personalized search through Learning to Rank (LTR) in Elasticsearch with a practical example."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Building RAG with Llama 3 open-source and Elastic Learn how to build a RAG system with Llama3 open source and Elastic. This blog provides practical examples of RAG using Llama3 as an LLM. Integrations Generative AI How To RR By: Rishikesh Radhakrishnan On June 20, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This blog will walk through implementing RAG using two approaches. Elastic, Llamaindex, Llama 3 (8B) version running locally using Ollama. Elastic, Langchain, ELSER v2, Llama 3 (8B) version running locally using Ollama. The notebooks are available at this GitHub location. Before we get started, let's take a quick dive into Llama 3. Llama 3 overview Llama 3 is an open source large language model recently launched by Meta. This is a successor to Llama 2 and based on published metrics, is a significant improvement. It has good evaluation metrics, when compared to some of the recently published models such as Gemma 7B Instruct, Mistral 7B Instruct, etc. The model has two variants, which are the 8 billion and 70 billion parameter. An interesting thing to note is that at the time of writing this blog, Meta was still in the process of training 400B+ variant of Llama 3. Meta Llama 3 Instruct Model Performance. (from https://ai.meta.com/blog/meta-llama-3/ ) The above figure shows data on Llama3 performance across different datasets as compared to other models. In order to be optimized for performance for real world scenarios, Llama3 was also evaluated on a high quality human evaluation set. Aggregated results of Human Evaluations across multiple categories and prompts (from https://ai.meta.com/blog/meta-llama-3/ ) How to build RAG with Llama 3 open-source and Elastic Dataset For the dataset, we will use a fictional organization policy document in json format, available at this location . Configure Ollama and Llama3 As we are using the Llama 3 8B parameter size model, we will be running that using Ollama. Follow the steps below to install Ollama. Browse to the URL https://ollama.com/download to download the Ollama installer based on your platform. Note: The Windows version is in preview at the moment. Follow the instructions to install and run Ollama for your OS. Once installed, follow the commands below to download the Llama3 model. This should take some time depending upon your network bandwidth. Once the run completes, you should end with the interface below. To test Llama3, run the following command from a new terminal or enter the text at the prompt itself. At the prompt, the output looks like below. We now have Llama3 running locally using Ollama. Elasticsearch setup We will use Elastic cloud setup for this. Please follow the instructions here . Once successfully deployed, note the API Key and the Cloud ID, we will require them as part of our setup. 
Application setup There are two notebooks, one for RAG implemented using Llamaindex and Llama3, the other one with Langchain, ELSER v2 and Llama3. In the first notebook, we use Llama3 as a local LLM as well as provide embeddings. For the second notebook, we use ELSER v2 for the embeddings and Llama3 as the local LLM. Method 1: Elastic, Llamaindex, Llama 3 (8B) version running locally using Ollama. Step 1 : Install required dependencies The above section installs the required llamaindex packages. Step 2: Import required dependencies We start with importing the required packages and classes for the app. We start with providing a prompt to the user to capture the Cloud ID and API Key values. If you are not familiar with obtaining the Cloud ID and API Key, please follow the links in the code snippet above to guide you with the process. Step 3: document processing We start with downloading the json document and building out Document objects with the payload. We now define the Elasticsearch vector store ( ElasticsearchStore ), the embedding created using Llama3 and a pipeline to help process the payload constructed above and ingest into Elasticsearch. The ingestion pipeline allows us to compose pipelines using different components, one of which allows us to generate embeddings using Llama3. ElasticsearchStore is defined with the name of the index to be created, the vector field and the content field. And this index is created when we run the pipeline. The index mapping created is as below: The pipeline is executed using the step below. Once this pipeline run completes, the index workplace_index is now available for querying. Do note that the vector field content_vector is created as a dense vector with dimension 4096 . The dimension size comes from the size of the embeddings generated from Llama3. Step 4: LLM configuration We now setup Llamaindex to use the Llama3 as the LLM. This as we covered before is done with the help of Ollama. Step 5: Semantic search We now configure Elasticsearch as the vector store for the Llamaindex query engine. The query engine is then used to answer your questions with contextually relevant data from Elasticsearch. The response I received with Llama3 as the LLM and Elasticsearch as the Vector database is below. This concludes the RAG setup based on using Llama3 as a local LLM and to generate embeddings. Let's now move to the second method, which uses Llama3 as a local LLM, but we use Elastic’s ELSER v2 to generate embeddings and for semantic search. Method 2: Elastic, Langchain, ELSER v2, Llama 3 (8B) version running locally using Ollama. Step 1: Install required dependencies The above section installs the required langchain packages. Step 2: Import required dependencies We start with importing the required packages and classes for the app. This step is similar to Step 2 in Method 1 above. Next, provide a prompt to the user to capture the Cloud ID and API Key values. Step 3: Document processing Next, we move to downloading the json document and building the payload. This step differs from the Method 1 approach, from how we use the LlamaIndex provided pipeline to process the document. Here we use the RecursiveCharacterTextSplitter to generate the chunks. We now define the Elasticsearch vector store ElasticsearchStore . The vector store is defined with the index to be created and the model to be used for embedding and retrieval. You can retrieve the model_id by navigating to Trained Models under Machine Learning. 
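As a rough sketch of the vector store definition described here, using the langchain-elasticsearch package: the class and strategy names below are from memory and have changed across package versions, and the index name and model ID are illustrative, so treat this as an assumption rather than the post's actual code.

from langchain_elasticsearch import ElasticsearchStore

vector_store = ElasticsearchStore(
    es_cloud_id="<cloud-id>",
    es_api_key="<api-key>",
    index_name="workplace_index_elser",  # illustrative index name
    # ELSER runs inside Elasticsearch, so no local embedding model is needed;
    # use the model_id listed under Machine Learning > Trained Models.
    strategy=ElasticsearchStore.SparseVectorRetrievalStrategy(model_id=".elser_model_2"),
)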
This also results in the creation of an ingest pipeline in Elastic, which generates and stores the embeddings as the documents are ingested into Elastic. We now add the documents processed above. Step 4: LLM configuration We set up the LLM to be used with the following. This is again different from method 1, where we used Llama3 for embeddings too. Step 5: Semantic search The necessary building blocks are all in place now. We tie them up together to perform semantic search using ELSER v2 and Llama3 as the LLM. Essentially, Elasticsearch ELSER v2 provides the contextually relevant response to the users question using its semantic search capabilities. The user's question is then enriched with the response from ELSER and structured using a template. This is then processed with Llama3 to generate relevant responses. The response with Llama3 as the LLM and ELSER v2 for semantic search is as below: This concludes the RAG setup based on using Llama3 as a local LLM and ELSER v2 for semantic search. Conclusion In this blog we looked at two approaches to RAG with Llama3 and Elastic. We explored Llama3 as an LLM and to generate embeddings. Next we used Llama3 as the local LLM and ELSER for embeddings and semantic search. We utilized two different frameworks, LlamaIndex and Langchain. You could implement the two methods using either of these frameworks. The notebooks were tested with the Llama3 8B parameter version. Both the notebooks are available at this GitHub location. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Llama 3 overview How to build RAG with Llama 3 open-source and Elastic Dataset Configure Ollama and Llama3 Elasticsearch setup Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Building RAG with Llama 3 open-source and Elastic - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-rag-with-llama3-opensource-and-elastic","meta_description":"Learn how to build a RAG system with Llama3 open source and Elastic. This blog provides practical examples of RAG using Llama3 as an LLM."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Retrieval Augmented Generation (RAG) Learn about Retrieval Augmented Generation (RAG) and how it can help improve the quality of an LLM's generated responses by providing relevant source knowledge as context. Generative AI JM By: Joe McElroy On November 13, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Generative AI has recently created enormous successes and excitement, with models that can generate fluent text, realistic images, and even videos. In the case of language, large language models, trained on vast amounts of data, are capable on understanding context and generating relevant responses to questions. This blog post explores the challenges associated with Generative AI, how Retrieval Augmented Generation (RAG) can help overcome those challenges, how RAG works, as well as the advantages and challenges of using RAG. Challenges with Generative AI However, its important to understand that these models are not perfect. The knowledge that these models possess is parametric knowledge that they learned during training and is a condensed representation of the entire training dataset. Lack of domain knowledge These models should be able to generate good responses to questions about general knowledge seen in their training data. But they cannot reliably answer questions about facts which are not in their training dataset. If the model is well aligned it will refuse to answer such out-of-domain questions. However, it is possible it will simply make up answers (also known as hallucinating). For example, a general purpose model will typically understand in general terms that each company will have a leave policy, but it will not have any knowledge of my particular company's leave policy. Frozen parametric knowledge An LLM's knowledge is frozen, which means it doesn't know anything about events that happen post-training. This means it will not be able to reliably answer questions about current events. Models are typically trained to qualify the answers they give for such questions. Hallucinations It has been suggested that LLMs capture in their parameters something like a knowledge graph representation of general ontology: representing facts about and relationships between entities. Common facts that appear frequently in the training data are well represented in the knowledge graph. However, niche knowledge which is unlikely to have many examples in the training data is only approximately represented. As such LLMs have noisy understanding of such facts. The alignment process, where models are calibrated about what they know, is essential. 
Mistakes often occur in the gray area between known and unknown information, highlighting the challenge of distinguishing relevant details. In the example above, the question about Fields Medal winners in the same year as Borcherds, is a prime example of this sort of niche knowledge. In this case we seeded the conversation with information about other mathematicians and ChatGPT appeared to get confused about what information to attend to. For example, it missed Tim Gowers and added Vladimir Voevodsky (who won in 2002). Expensive to train While LLMs are capable of generating relevant responses to questions when trained on data within a specific domain, they are expensive to train and require vast amounts of data and compute to develop. Similarly, fine-tuning models requires expertise and time and there is the risk that in the process they \"forget\" other important capabilities. How does RAG help solve this problem? Retrieval Augmented Generation (RAG) helps solve this problem by grounding the parametric knowledge of a generative model with an external source knowledge, from a information retrieval system like a database. This source knowledge is passed as additional context to the model and helps the model generate more relevant responses to questions. How does RAG work? A RAG pipeline typically has three main components: Data : A collection of data (e.g documents, webpages) that contain relevant information to answer questions. Retrieval : A retrieval strategy that can retrieve relevant source knowledge from the data. Generation : With the relevant source knowledge, generate a response with the help of an LLM. RAG pipeline flow When directly interacting with a model, the LLM is given a question and generates a response based on its parametric knowledge. RAG adds an extra step to the pipeline, using retrieval to find relevant data that builds additional context for the LLM. In the example below, we use a dense vector retrieval strategy to retrieve relevant source knowledge from the data. This source knowledge is then passed to the LLM as context to generate a response. RAG doesn't have to use dense vector retrieval, it can use any retrieval strategy that can retrieve relevant source knowledge from the data. It could be a simple keyword search or even a Google web search. We will cover other retrieval strategies in a future article. Retrieval of source knowledge Retrieval of relevant source knowledge is key to answering the question effectively. The most common approach for retrieval with Generative AI is using semantic search with dense vectors. Semantic search is a technique that requires an embedding model to transform natural language input into dense vectors which represent that source knowledge. We rely on these dense vectors to represent the source knowledge because they are able to capture the semantic meaning of the text. This is important because it allows us to compare the semantic meaning of the source knowledge with the question to determine if the source knowledge is relevant to the question. Given a question and its embedding, we can find the most relevant source knowledge. Semantic search with dense vectors isn't your only retrieval option, but it's one of the most popular approaches today. We will cover other approaches in a future article. Advantages of RAG After training, LLMs are frozen. The parametric knowledge of the model is fixed and cannot be updated. 
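To make the pipeline described above concrete, here is a minimal retrieve-then-generate sketch: a dense-vector kNN query supplies the source knowledge, which is folded into a prompt template together with the question. Index, field, and prompt wording are illustrative assumptions, not code from the original post.

from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")  # placeholder credentials

def retrieve_context(question_embedding, top_k=3):
    # Retrieval: approximate kNN over pre-embedded passages.
    resp = es.search(
        index="knowledge-base",
        knn={
            "field": "passage_vector",
            "query_vector": question_embedding,
            "k": top_k,
            "num_candidates": 50,
        },
    )
    return [hit["_source"] for hit in resp["hits"]["hits"]]

def build_prompt(question, passages):
    # Generation input: prompt template + retrieved source knowledge + question.
    context = "\n".join(
        f"[{i + 1}] {p['passage']} (source: {p.get('url', 'unknown')})"
        for i, p in enumerate(passages)
    )
    return (
        "Answer the question using only the passages below and cite them by number.\n"
        f"Passages:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
# The resulting prompt is then sent to the LLM of your choice.

Nothing in this loop touches the model's weights; only the indexed source knowledge changes.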
However, when we add data and retrieval to the RAG pipeline, we can update the source knowledge as the underlying data source changes, without having to retrain the model. Grounded in source knowledge The model's response can also be constrained to only use the source knowledge provided in-context, which helps limit hallucinations. This approach also opens up the option of using smaller, task-specific LLMs instead of large, general purpose models. This enables prioritizing the use of source knowledge to answer questions, rather than general knowledge acquired during training. Citing sources in responses In addition, RAG can provide clear traceability of the source knowledge used to answer a question. This is important for compliance and regulatory reasons and also helps spot LLM hallucinations. This is known as source tracking. RAG in action Once we have retrieved the relevant source knowledge, we can use it to generate a response to the question. To do this, we need to: Build a context A collection of source knowledge (e.g documents, webpages) that contain relevant information to answer questions. This provides the context for the model to generate a response. Prompt template A template written in natural language for a specific task (answer questions, summarize text). Used as the input to the LLM. Question A question that is relevant to the task. Once we have these three components, we can use the LLM to generate a response to the question. In the example below, we combine the prompt template with the user's question and the relevant passages retrieved. The prompt template builds the relevant source knowledge passages into a context. This example also includes source tracing where the source knowledge passages are cited in the response. Challenges with RAG Effective retrieval is the key to answering questions effectively. Good retrieval provides a diverse set of relevant source knowledge to the context. However, this is more of an art than a science, requires a lot of experimentation to get right, and is highly dependent on the use case. Precise dense vectors Large documents are difficult to represent as a single dense vector because they contain multiple semantic meanings. For effective retrieval, we need to break down the document into smaller chunks of text that can be accurately represented as a single dense vector. A common approach for generic text is to chunk by paragraphs and represent each paragraph as a dense vector. Depending on your use case, you may want to break the document down using titles, headings, or even sentences, as chunks. Large context When using LLMs, we need to be mindful of the size of the context we pass to the model. LLMs have a limit on the amount of tokens they can process at once. For example, GPT-3.5-turbo has a limit of 4096 tokens. Secondly, responses generated may degrade in quality as the context increases, increasing the risk of hallucinations. Larger contexts also require more time to process and, crucially, they increase LLM costs. This comes back to the art of retrieval. We need to find the right balance between chunking size and accuracy with embeddings. Conclusion Retrieval Augmented Generation is a powerful technique that can help improve the quality of an LLM's generated responses, by providing relevant source knowledge as context. But RAG isn't a silver bullet. It requires a lot of experimentation and tuning to get right and it's also highly dependent on your use case. 
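Before the pointer to the next article, here is a tiny illustration of the chunking step discussed under "Precise dense vectors" above: a naive paragraph splitter with a size cap, offered as a sketch only (real projects often split on sentences, titles, or headings instead).

def chunk_by_paragraph(text: str, max_chars: int = 1000) -> list[str]:
    # Split on blank lines, then merge consecutive paragraphs up to a size cap
    # so each chunk stays small enough to embed as a single dense vector.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return chunks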
In the next article, we will cover how to build a RAG pipeline using LangChain, a popular framework for working with LLMs. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Challenges with Generative AI Lack of domain knowledge Frozen parametric knowledge Hallucinations Expensive to train Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Retrieval Augmented Generation (RAG) - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/retrieval-augmented-generation-rag","meta_description":"Learn about Retrieval Augmented Generation (RAG) and how it can help improve the quality of an LLM's generated responses by providing relevant source knowledge as context."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Evaluating scalar quantization in Elasticsearch Learn how scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch through an experiment. ML Research TP TV By: Thanos Papaoikonomou and Thomas Veasey On May 3, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Understanding scalar quantization in Elasticsearch In 8.13 we introduced scalar quantization to Elasticsearch . 
By using this feature an end-user can provide float vectors that are internally indexed as byte vectors while retaining the float vectors in the index for optional re-scoring. This means they can reduce their index memory requirement, which is its dominant cost, by a factor of four. At the moment this is an opt-in feature feature, but we believe it constitutes a better trade off than indexing vectors as floats. In 8.14 we will switch to make this our default. However, before doing this we wanted a systematic evaluation of the quality impact. Experimentation: Evaluating scalar quantization The multilingual E5-small is a small high quality multilingual passage embedding model that we offer out-of-the-box in Elasticsearch. It has two versions: one cross-platform version which runs on any hardware and one version which is optimized for CPU inference in the Elastic Stack (see here ). E5 represents a challenging case for automatic quantization because the vectors it produces have low angular variation and are relatively low dimension compared to state-of-the-art models. If we can achieve little to no damage enabling int8 quantization for this model we can be confident that it will work reliably. The purpose of this experimentation is to estimate the effects of scalar-quantized kNN search as described here across a broad range of retrieval tasks using this model. More specifically, our aim is to assess the performance degradation (if any) by switching from a full-precision to a quantized index. Overview of methodology For the evaluation we relied upon BEIR and for each dataset that we considered we built a full precision and an int8-quantized index using the default hyperparameters ( m: 16 , ef_construction: 100 ). First, we experimented with the quantized (weights only) version of the multilingual E5-small model provided by Elastic here with Table 1 presenting a summary of the nDCG@10 scores ( k:10 , num_candidates:100 ): Dataset Full precision Int8 quantization Absolute difference Relative difference Arguana 0.37 0.362 -0.008 -2.16% FiQA-2018 0.309 0.304 -0.005 -1.62% NFCorpus 0.302 0.297 -0.005 -1.66% Quora 0.876 0.875 -0.001 -0.11% SCIDOCS 0.135 0.132 -0.003 -2.22% Scifact 0.649 0.644 -0.005 -0.77% TREC-COVID 0.683 0.672 -0.011 -1.61% Average -0.005 -1.05% Table 1 : nDCG@10 scores for the full precision and int8 quantization indices across a selection of BEIR datasets Overall, it seems that there is a slight relative decrease of 1.05% on average. Next, we considered repeating the same evaluation process using the unquantized version of multilingual E5-small (see model card here ) and Table 2 shows the respective results. Dataset Full precision Int8 quantization Absolute difference Relative difference Arguana 0.384 0.379 -0.005 -1.3% Climate-FEVER 0.214 0.222 +0.008 +3.74% FEVER 0.718 0.715 -0.003 -0.42% FiQA-2018 0.328 0.324 -0.004 -1.22% NFCorpus 0.31 0.306 -0.004 -1.29% NQ 0.548 0.537 -0.011 -2.01% Quora 0.882 0.881 -0.001 -0.11% Robust04 0.418 0.415 -0.003 -0.72% SCIDOCS 0.134 0.132 -0.003 -1.49% Scifact 0.67 0.666 -0.004 -0.6% TREC-COVID 0.709 0.693 -0.016 -2.26% Average -0.004 -0.83% Table 2 : nDCG@10 scores of multilingual-E5-small on a selection of BEIR datasets Again, we observe a slight relative decrease in performance equal to 0.83%. 
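For reference, a quantized index along the lines used in this evaluation can be declared and queried like this in Elasticsearch 8.13+. Index and field names are illustrative; the HNSW hyperparameters and search settings mirror the defaults stated above (m: 16, ef_construction: 100, k: 10, num_candidates: 100).

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# dense_vector field indexed with int8 scalar quantization; the original float
# vectors are kept in the index, so rescoring on them remains possible.
es.indices.create(
    index="beir-passages-int8",
    mappings={
        "properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 384,  # multilingual E5-small output size
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "int8_hnsw", "m": 16, "ef_construction": 100},
            },
        }
    },
)

query_vector = [0.0] * 384  # stand-in for a real E5-small query embedding
resp = es.search(
    index="beir-passages-int8",
    knn={"field": "embedding", "query_vector": query_vector, "k": 10, "num_candidates": 100},
)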
Finally, we repeated the exercise for multilingual E5-base and the performance decrease was even smaller (0.59%) But this is not the whole story: The increased efficiency of the quantized HNSW indices and the fact that the original float vectors are still retained in the index allows us to recover a significant portion of the lost performance through rescoring . More specifically, we can retrieve a larger pool of candidates through approximate kNN search in the quantized index, which is quite fast, and then compute the similarity function on the original float vectors and re-score accordingly. As a proof of concept, we consider the NQ dataset which exhibited a large performance decrease (2.01%) with multilingual E5-small. By setting k=15 , num_candidates=100 and window_size=10 (as we are interested in nDCG@10) we get an improved score of 0.539 recovering about 20% of the performance. If we further increase the num_candidates parameter to 200 then we get a score that matches the performance of the full precision index but with faster response times. The same setup on Arguana leads to an increase from 0.379 to 0.382 and thus limiting the relative performance drop from 1.3% to only 0.52% Results The results of our evaluation suggest that scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch without significant loss in retrieval performance. The performance decrease is more pronounced for smaller vectors (multilingual E5-small produces vectors of size equal to 384 while E5-base gives 768-dimensional embeddings), but this can be mitigated through rescoring. We are confident that scalar quantization will be beneficial for most users and we plan to make it the default in 8.14. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Understanding scalar quantization in Elasticsearch Experimentation: Evaluating scalar quantization Overview of methodology Results Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Evaluating scalar quantization in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/evaluating-scalar-quantization","meta_description":"Learn how scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch through an experiment."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Bit vectors in Elasticsearch Discover what are bit vectors, their practical implications and how to use them in Elasticsearch. Vector Database BT By: Benjamin Trent On July 17, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. We have supported float values from the beginning of vector search in Elasticsearch. In version 8.6, we added support for byte encoded vectors. In 8.14, we added automatic quantization down to half-byte values. In 8.15, we are adding support for bit encoded vectors. But what are bit vectors and their practical implications? As stated on the tin, bit vectors are where each dimension of the vector is a single bit. When comparing data sizes for vectors with the typical float values, bit vectors provide a whopping 32x reduction in size. Every bit counts Some semantic embedding models natively output bit vectors such as Cohere . Additionally, some other kinds of data such as image hashing utilize bit vectors directly. However, most semantic embedding models output float vectors and do not support bit encoding directly. You can naively binarize vectors yourself since the math is simple. For each vector dimension, check if the value is > median . If it is, that is a 1 bit, and otherwise it is a 0 bit. Figure 0: Transforming 8 float values into individual bit values and then collapse to single byte , assuming the median value is 0 . Here is some simple Python code to binarize a vector: Obviously, this can lose a fair bit of information (pun intended). But for larger vectors or vectors specifically optimized to work well with bit encoding, the space savings can be worth it. Consider 1 million 1024 dimension floating point vectors. Each vector is 4KB in size and all vectors will require approximately 4GB. With binary quantization, each vector is now only 128 bytes and all vectors in total are only around 128MB. When you consider the cost of storage & memory, this is exceptionally attractive. Now, since we are no longer in float land, we cannot use typical distance functions like cosineSimilarity or dotProduct . Instead, we take advantage of each dimension being a single bit by using Hamming distance . hamming distance is fairly straight forward, for every individual bit , we calculate the xor with the corresponding bit in the other vector. Then we sum up the resulting bits. 
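The binarization snippet referenced above is not reproduced in this extract; here is a rough NumPy equivalent, together with a Hamming distance computed by XOR-ing the packed bytes and counting the set bits. This is a sketch of the idea, not the post's original code.

import numpy as np

def binarize(vector: np.ndarray) -> np.ndarray:
    # 1 bit per dimension: values above the vector's median become 1, the rest 0,
    # packed into bytes (e.g. 1024 floats -> 128 bytes, a 32x size reduction).
    bits = (vector > np.median(vector)).astype(np.uint8)
    return np.packbits(bits)

def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    # XOR the packed bytes, then sum the resulting bits.
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())

rng = np.random.default_rng(42)
query, doc = rng.normal(size=1024), rng.normal(size=1024)
print(hamming_distance(binarize(query), binarize(doc)))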
Figure 1: Hamming distance calculation between two bit elements. Let's think back to our 1 million 1024 dimension vectors. In addition to the space savings, using hamming distance over 128 bytes vs. dotProduct over 1024 floats is a significant reduction in computation time. For some simple benchmarking (this is not exhaustive), we indexed 1 million 1024 dimension vectors in Elasticsearch with a flat index. With only 2GB of off-heap, bit vectors take approximately 40ms to return, but float takes over 3000ms. If we increase the off-heap to 4GB, bit vectors continue to take the same amount of time (they fit into memory even before) and float vectors improve to 200ms. So hamming is still significantly faster than the floating point dot-product and requires significantly less memory. A bit of error Bit vectors aren't perfect; they are obviously a lossy encoding. The concern isn't that vectors will not be unique. Even when using a bit encoding, 386 dimensioned vectors still have 2^386 possible unique vectors. The main concerns are distance collisions and the size of the error the encoding introduces. Even if we assume a well distributed bit encoding, it's likely to have many distance collisions when gathering a large number of vectors. Intuitively, this makes sense as our distance measurement is summing the bits. For example, 00000001 and 10000000 are the same distance apart as 00000001 and 00000010. Once you need to gather more documents than there are dimensions, you will have collisions. In reality, it will occur much sooner than that. To illustrate, here is a small study. The focus here is finding out how many bit vectors would need gathering to get the true nearest top k vectors. For the first experiment, we used 1 million CohereV3 vectors from their Wikipedia dataset. We randomly sampled (without replacement) 50 query vectors and used those to determine true dotProduct and hamming distances. Here are the \"best\" and \"worst\" performing query vectors, with quality being the number of documents required to retrieve the correct 100 nearest neighbors (i.e. more being worse). Figure 2: The best performing CohereV3 query vector & its distances; you can see how the distances are actually aligning well. Figure 3: The worst performing CohereV3 query vector & its distances. Here, the nearer distances align well, but that correlation weakens as we start gathering vectors that are further away. Figure 4: The median number of vectors required to get the true k nearest neighbors over all 50 queries. CohereV3 is excellent here, showing that only around 10x oversampling is required, even for the 100th nearest neighbor. Visually, however, we can see that the oversampling required increases exponentially. From this small study, CohereV3 does exceptionally well. The median case shows you can oversample by approximately 10x to achieve similar recall. However, in the worst case, when gathering more than 50 nearest documents, it starts being problematic, requiring much more than 10x oversampling. Depending on the query and the dataset, you can run into problems. So, how well does binarization do when a model and dataset combination are not optimized for bit vectors? We used e5-small-v2 and embedded the Quora dataset to test this. We randomly took 500k vectors and then randomly sampled 50 query vectors from those. Figure 5: The best performing e5-small query vector & its distances. 
The extremely near distances align fairly well, but still not exceptionally so. Figure 6: The worst performing e5-small query vector & its distances. The hamming and dotProduct distances are effectively uncorrelated. Figure 7: The median number of vectors required to get the true k k k nearest neighbors. The best e5-small vector does moderately well and its hamming distances are semi-correlated with the dotProduct . The worst case is a drastically different story. The distances are effectively uncorrelated. The median values show that you would need to oversample by approximately 800x to achieve the nearest 10 vectors and it only gets worse from there. In short, for models that do well with binary quantization and when the model is well adapted to the dataset, bit quantization is a great option. That said, keep in mind that the oversampling required can increase exponentially as you gather more vectors. For out-of-domain data sets where nearest vectors are not well distinguished for the model, or for models that are not optimized for binary quantization at all, bit vectors can be problematic, even with a small number of nearest vectors. Ok, but how do I use bit vectors? When using bit vectors in Elasticsearch, you can specify the bit encoding in the mapping. For example: Figure 8: Mapping a bit vector in Elasticsearch, allowing for bit encoding. The first document will statically set the bit dimensions Or if you do not want to index in the HNSW index , you can use the flat index type. Figure 9: Mapping a bit vector in Elasticsearch in a flat index type. Then, to index a document with a bit vector, you can use the following: Figure 10: A 1024 dimensioned bit vector in hexidecimal format. Now you can utilize a knn query Figure 11: Querying bit vectors with a 1024 dimensioned hexidecimal vector. Just a bit more Thank you for making it through all the 2-bit jokes. We are very excited about the possibilities that bit vectors bring to Elasticsearch in 8.15. Please try it out in Elastic Cloud once 8.15 is released or in Elasticsearch Serverless right now! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Every bit counts A bit of error Ok, but how do I use Just a bit more Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Bit vectors in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/bit-vectors-in-elasticsearch","meta_description":"Discover what are bit vectors, their practical implications and how to use them in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series The ColPali model series Introducing the ColPali model, its implementation in Elasticsearch, and how to scale late interaction models for large-scale vector search. Part1 Search Relevance Vector Database +1 March 18, 2025 Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. PS BT By: Peter Straßer and Benjamin Trent Part2 Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"The ColPali model series - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/colpali-model-elasticsearch","meta_description":"Introducing the ColPali model, its implementation in Elasticsearch, and how to scale late interaction models for large-scale vector search.\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Lucene bug adventures: Fixing a corrupted index exception Sometimes, a single line of code takes days to write. Here, we get a glimpse of an engineer's pain and debugging over multiple days to fix a potential Apache Lucene index corruption. 
Lucene BT By: Benjamin Trent On December 27, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Be prepared: This particular blog is different than usual. It's not an explanation of a new feature or a tutorial. This is about a single line of code that took three days to write. We'll be fixing a potential Apache Lucene index corruption. Some takeaways I hope you will have: All flaky tests are repeatable, given enough time and the right tools Many layers of testing are key for robust systems. However, higher levels of tests become increasingly more difficult to debug and reproduce. Sleep is an excellent debugger How Elasticsearch tests At Elastic, we have a plethora of tests that run against the Elasticsearch codebase. Some are simple and focused functional tests, others are single node “happy path” integration tests, and yet others attempt to break the cluster to make sure everything behaves correctly in a failure scenario. When a test continually fails, an engineer or tooling automation will create a github issue and flag it for a particular team to investigate. This particular bug was discovered by a test of the last kind. These tests are tricky, sometimes only being repeatable after many runs. What is this test actually testing? This particular test is an interesting one. It will create a particular mapping and apply it to a primary shard. Then when attempting to create a replica. The key difference is that when the replica attempts to parse the document, the test injects an exception, thus causing the recovery to fail in a surprising (but expected) way. Everything was working as expected, however, with one significant catch. During the test cleanup, we validated consistency, and there, this test ran into a snag. This test was failing to fail an expected manner. During the consistency check we would verify that all the replicated and primary Lucene segment files were consistent. Meaning, uncorrupted and fully replicated. Having partial data or corrupted data is way worse than having something fail fully. Here is the scary and abbreviated stack trace of the failure. Somehow, during the forced replication failure the replicated shard ended up getting corrupted! Let me explain the key part of the error in plain english. Lucene is a segment based architecture, meaning each segment knows and manages its own read-only files. This particular segment was being validated via its SegmentCoreReaders to ensure everything was copacetic. Each core reader has metadata stored that indicates what field types and files exist for a given segment. However, when validating the Lucene90PointsFormat , certain expected files were missing. With the segments _0.cfs file we expected a point format file called kdi . cfs stands for \"compound file system\" into which Lucene will sometimes combine all field types and all tiny files into a single larger file for more efficient replication and resource utilization. In fact, all three of the point file extensions: kdd , kdi , and kdm were missing. How could we get into the place where a Lucene segment expects to find a point file but it's missing!?! Seems like a scary corruption bug! The first step for every bug fix, replicate it Replicating the failure for this particular bug was extremely painful. 
While we take advantage of randomized value testing in Elasticsearch, we are sure to provide every failure with a (hopefully) reproducible random seed to ensure all failures can be investigated. Well, this works great for all failures except for those caused by a race condition. No matter how many times I tried, the particular seed never repeated the failure locally. But, there are ways to exercise the tests and push towards a more repeatable failure. Our particular test suite allows for a given test to be run more than once in the same command via the -Dtests.iters parameter. But this wasn't enough: I needed to make sure that the execution threads were switching, thus increasing the likelihood of this race condition occurring. Another wrench in the system was that the test ended up taking so long to run that the test runner would time out. In the end, I used the following nightmare bash to repeatably run the test: In comes stress-ng. This allows you to quickly start a process that will just eat CPU cores for lunch. Randomly spamming stress-ng while running numerous iterations of the failing test finally allowed me to replicate the failure. One step closer. To stress the system, just open another terminal window and run: Revealing the bug Now that the test failure revealing the bug is mostly repeatable, it's time to try and find the cause. What makes this particular test strange is that Lucene is throwing because it expects point values, but none are added directly by the test. Only text values. This pushed me to consider looking at recent changes to our optimistic concurrency control fields: _seq_no and _primary_term. Both of these are indexed as points and exist in every Elasticsearch document. Indeed, a commit did change our _seq_no mapper! YES! This has to be the cause! But, my excitement was short-lived. This only changed the order of when fields got added to the document. Before this change, _seq_no fields were added last to the document. After, they were added first. No way the order of adding fields to a Lucene document would cause this failure... Yep, changing the order of when fields were added caused the failure. This was surprising and turns out to be a bug in Lucene itself! Changing the order of what fields are parsed shouldn't change the behavior of parsing a document. The bug in Lucene Indeed, the bug in Lucene hinged on the following conditions: Indexing a points value field (e.g. _seq_no) Trying to index a text field that throws during analysis In this weird state, we open a Near Real Time Reader from the writer that experienced the text index analysis exception. But no matter how many ways I tried, I couldn't fully replicate. I directly added pause points for debugging throughout the Lucene codebase. I attempted randomly opening readers during the exception path. I even printed out megabytes and megabytes of logs trying to find the exact path where this failure occurred. I just couldn't do it. I spent a whole day fighting and losing. Then I slept. The next day I re-read the original stack trace and discovered the following line: In all my recreation attempts, I never specifically set the retention merge policy. The SoftDeletesRetentionMergePolicy is used by Elasticsearch so that we can accurately replicate deletions in replicas and ensure all our concurrency controls are in charge of when documents are actually removed. Otherwise, Lucene is in full control and will remove them at any merge. 
Once I added this policy and replicated the most basic steps mentioned above, the failure immediately replicated. I have never been more happy to open a bug in Lucene . While it presented itself as a race condition in Elasticsearch, it was simple to write a repeatably failing test in Lucene once all the conditions were met. In the end, like all good bugs, it was fixed with just 1 line of code. Multiple days of work, for just one line of code. But it was worth it. Not the end Hope you enjoyed this wild ride with me! Writing software, especially software as widely used and complex as Elasticsearch and Apache Lucene is rewarding. However, at times, it’s exceptionally frustrating. I both love and hate software. The bug fixing is never over! Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Be prepared: How Elasticsearch tests What is this test actually testing? The first step for every bug fix, replicate it Revealing the bug Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Lucene bug adventures: Fixing a corrupted index exception - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/lucene-corrupted-index-exception","meta_description":"Learn how to debug and fix a Lucene index corruption with a real-life example from the Elastic engineering team."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. 
Ingestion How To AL By: Andre Luiz On February 4, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. What is Logstash? Logstash is a widely used Elastic Stack tool for processing large volumes of log data in real-time. It acts as an efficient data pipeline, integrating information from various sources into a single structured flow. Its primary function is to reliably perform data extraction, transformation, and loading. Logstash offers several advantages, particularly its versatility in supporting multiple types of inputs, filters, and outputs, enabling integration with a wide range of sources and destinations. It processes data in real-time, capturing and transforming information. Its native integration with the Elastic Stack, especially Elasticsearch and Kibana, facilitates data analysis and visualization. Additionally, it includes advanced filters that enable efficient data normalization, enrichment, and transformation. How does Logstash work? Logstash is composed of inputs, filters, and outputs, which form the data processing pipeline. These components are configured in a .config file that defines the data ingestion flow. Inputs : Capture data from various sources. Filters : Process and transform the captured data. Outputs : Send the transformed data to defined destinations. The most common types of each component are presented below: Types of Inputs: File : Reads log files in various formats (text, JSON, CSV, etc.). Message Queues : Kafka, RabbitMQ. APIs : Webhooks or other data collection APIs. Databases : JDBC connections for relational data extraction. Types of Filters: Grok : For analyzing and extracting text patterns. Mutate : Modifies fields (renames, converts types, removes data). Date : Converts date and time strings into a readable date format. GeoIP : Enriches logs with geographic data. JSON : Parses or generates JSON data. Types of Outputs: Elasticsearch : The most common destination, Elasticsearch is a search and analytics engine that allows powerful searches and visualizations of data indexed by Logstash. Files : Stores processed data locally. Cloud Services : Logstash can send data to various cloud services, such as AWS S3, Google Cloud Storage, Azure Blob Storage, for storage or analysis. Databases : Logstash can send data to various other databases, such as MySQL, PostgreSQL, MongoDB, etc., through specific connectors. Logstash Elasticsearch data ingestion In this example, we implement data ingestion into Elasticsearch using Logstash. The steps configured in this example will have the following flow: Kafka will be used as the data source. Logstash will consume the data, apply filters such as grok, geoip, and mutate to structure it. The transformed data will be sent to an index in Elasticsearch. Kibana will be used to visualize the indexed data. Prerequisites We will use Docker Compose to create an environment with the necessary services: Elasticsearch, Kibana, Logstash, and Kafka. The Logstash configuration file, named l ogstash.conf , will be mounted directly into the Logstash container. Below we will detail the configuration of the configuration file. Here is docker-compose.yml: As mentioned above, the Logstash pipeline will be defined, in this step we will describe the Input, Filter and Output configurations. 
The logstash.conf file will be created in the current directory (where docker-compose.yml is located). In docker-compose.yml the logstash.conf file that is on the local file system will be mounted inside the container at the path /usr/share/logstash/pipeline/logstash.conf. Logstash pipeline configuration The Logstash pipeline is divided into three sections: input, filter, and output. Input: Defines where the data will be consumed from (in this case, Kafka). Filter: Applies transformations and structuring to the raw data. Output: Specifies where the processed data will be sent (in this case, Elasticsearch). Next, we will configure each of these steps in detail. Input configuration The data source is a Kafka topic and to consume the data from the topic it will be necessary to configure the Kafka input plugin. Below is the configuration for the Kafka plugin in Logstash, where we define: bootstrap_servers : Address of the Kafka server. topics : Name of the topic to be consumed. group_id : Consumer group identifier. With this, we are ready to receive the data. Filter configuration Filters are responsible for transforming and structuring data. Let's configure the following filters: Grok filter Extracts structured information from unstructured data. In this case, it extracts the timestamp, log level, client IP, URI, status, and the JSON payload. The example log: Extracted Fields: timestamp : Extracts the date and time (e.g., 2025-01-05T16:30:15). log_level : Captures the log level (e.g., INFO, ERROR). client_ip : Captures the client's IP address (e.g., 69.162.81.155). uri : Captures the URI path (e.g., /api/products). status : Captures the HTTP status code (e.g., 200). Date filter Converts the timestamp field into a format readable by Elasticsearch and stores it in @timestamp. GeoIP filter Next, we will use the geoip filter to retrieve geographic information, such as country, region, city, and coordinates, based on the value of the client_ip field. Mutate filter The mutate filter allows transformations on fields. In this case, we will use two of its properties: remove_field : Removes the timestamp and message fields, as they are no longer needed. convert : Converts the status field from a string to an integer. Output configuration The output defines where the transformed data will be sent. In this case, we will use Elasticsearch. We now have our configuration file defined. Below is the complete file: Send and ingest data With the containers running, we can start sending messages to the topic and wait for the data to be indexed. First, create the topic if you haven't already. To send the messages, execute the following command in the terminal: Messages to be sent: To view the indexed data, go to Kibana: Once the indexing has been successfully completed, we can view and analyze the data in Kibana. The mapping and indexing process ensures that the fields are structured according to the configurations defined in Logstash. Conclusion With the configuration presented, we created a pipeline using Logstash to index logs in a containerized environment with Elasticsearch and Kafka. We explored Logstash's flexibility to process messages using filters such as grok, date, geoip, and mutate, structuring the data for analysis in Kibana. Additionally, we demonstrated how to configure the integration with Kafka to consume messages and use them for processing and indexing the data. 
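As a quick end-to-end check of a pipeline like the one above, a small Python script can publish a sample log line to the Kafka topic and then confirm that Logstash consumed, transformed, and indexed it. The broker address, topic name, index name, and log layout below are assumptions; adjust them to match your docker-compose services and grok pattern.

```python
import time
from kafka import KafkaProducer          # pip install kafka-python
from elasticsearch import Elasticsearch  # pip install elasticsearch

# Assumed names: Kafka broker on localhost:9092, topic "logs",
# and a "logs" index written by the Logstash Elasticsearch output.
producer = KafkaProducer(bootstrap_servers="localhost:9092")

# The line layout must match whatever the grok filter expects; this one is only
# illustrative (timestamp, log level, client IP, URI, status, JSON payload).
sample = b'2025-01-05T16:30:15 INFO 69.162.81.155 GET /api/products 200 {"items": 12}'
producer.send("logs", value=sample)
producer.flush()

time.sleep(5)  # give Logstash a moment to consume and index the message

es = Elasticsearch("http://localhost:9200")
print("documents indexed:", es.count(index="logs")["count"])
```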
References Logstash https://www.elastic.co/guide/en/logstash/current/index.html Logstash docker https://www.elastic.co/guide/en/logstash/current/docker.html GeoIp plugin https://www.elastic.co/guide/en/logstash/current/plugins-filters-geoip.html Mutate plugin https://www.elastic.co/guide/en/logstash/current/plugins-filters-mutate.html Grok plugin https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html Kafka plugin https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is Logstash? How does Logstash work? Logstash Elasticsearch data ingestion Prerequisites Logstash pipeline configuration Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to ingest data to Elasticsearch through Logstash - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-logstash-ingest-data","meta_description":"A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Advanced RAG techniques In this series, we'll discuss and implement techniques that may increase RAG performance. Part1 Vector Database Generative AI August 14, 2024 Advanced RAG techniques part 1: Data processing Discussing and implementing techniques which may increase RAG performance. Part 1 of 2, focusing on the data processing and ingestion component of an advanced RAG pipeline. 
HC By: Han Xiang Choong Part2 Vector Database Generative AI August 15, 2024 Advanced RAG techniques part 2: Querying and testing Discussing and implementing techniques which may increase RAG performance. Part 2 of 2, focusing on querying and testing an advanced RAG pipeline. HC By: Han Xiang Choong Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Advanced RAG techniques - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/advanced-rag-techniques","meta_description":"In this series, we'll discuss and implement techniques that may increase RAG performance. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. ES|QL Ruby How To FB By: Fernando Briano On October 24, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction The Elasticsearch Ruby client can be used to craft ES|QL queries and make it easier to work with the data returned from esql.query . ES|QL allows developers to filter, transform, and analyze data stored in Elasticsearch via queries. It uses \"pipes\" ( | ) to work with the data step by step. The esql.query API has been supported in the Elasticsearch Ruby Client since it was available as experimental in version 8.11.0 . You can execute an ES|QL request with the following code: The default response is parsed from JSON (you can also get a CSV or text by passing in the format parameter), and it looks like this: ES|QL Helper In Elasticsearch Ruby v8.13.0 , the client introduced the ES|QL Helper for the esql.query API. Instead of the default response, the helper returns an array of hashes with the columns as keys and the respective values instead of the default JSON value. Additionally, you can iterate through the response values and transform the data in by passing in a Hash of column => Proc values. You could use this for example to convert a @timestamp column value into a DateTime object. We'll take a look at how to use this with example data. Setup and ingesting data For this example, we're using the JSON dump from TheGamesDB, a community driven crowd-sourced games information website. Once we've downloaded the JSON file, we can ingest it into Elasticsearch by using another Helper form the Ruby client, the Bulk Helper . The data includes the list of all games in the database within the data.games keys. 
It also includes platforms and box art information, but for the purpose of this example, we're only going to use the games data. The BulkHelper provides a way to ingest a JSON file directly into Elasticsearch. To use the helper, we need to require it in our code, and instantiate it with a client and an index on which to perform the bulk action (we can change the index later on an already instantiated helper). We can use ingest_json and pass in the JSON file, the keys where it can find the data, and slice to separate the documents in batches before sending them to Elasticsearch: This will ingest all the game titles with their respective information into the videogames index. Using the ES|QL Helper With the data loaded, we can now query it with ES|QL: If we run this query with the esql.query API directly, we'll get the columns/values result: The helper however, returns an Array of Hashes with the columns as keys and the respective values. So we can work with the response, and access the value for each Hash in the Array with the name of a column as the key: The ESQLHelper also provides the ability to transform the data in the response. We can do this by passing in a Hash of column => Proc values. For example, let's say we want to format the release date in this previous query to show a more human friendly date. We can run this: If we run the same code from before, we'll get this result: You can pass in as many Procs as there are columns in the response. For example, the data includes a youtube field, where sometimes the URL for a video on YouTube is stored, other times just the video hash (e.g. U4bKxcV5hsg ). The URL for a YouTube video follows the convention https://youtube.com/watch?v=VIDEOHASH . So we could also add a parser to prepend the URL to the values that only include the hash: If we then run response.map { |a| a['youtube'] }.compact , we'll get just the URLs for YouTube videos for the videogames we're looking for. Conclusion As you can see, the ESQLHelper class can make it easier to work with the data returned from esql.query . You can learn more about the Elasticsearch Ruby Client and its helpers in the official documentation . And if you have any feedback, questions or requests, don't hesitate to create a new issue in the client's repository. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. 
KB By: Kofi Bartlett Jump to Introduction ES|QL Helper Setup and ingesting data Using the ES|QL Helper Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to use the ES|QL Helper in the Elasticsearch Ruby Client - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-ruby-helper-elasticsearch","meta_description":"Explore the ES|QL helper and discover how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Build a multimodal image retrieval system using KNN search and CLIP embeddings Learn how to build a powerful semantic image search engine with Roboflow Inference and Elasticsearch. How To JG By: James Gallagher On January 27, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this guide, we are going to walk through how to build an image retrieval system using KNN clustering in Elasticsearch and CLIP embeddings computed with Roboflow Inference , a computer vision inference server. Roboflow Universe , the largest repository of computer vision data on the web with more than 100 million images hosted, uses CLIP embeddings to enable efficient, semantic queries for our dataset search engine. Without further ado, let’s get started! Introduction to CLIP and Roboflow Inference CLIP (Contrastive Language-Image Pretraining) is a computer vision model architecture and model developed by OpenAI. The model was released in 2021 under an MIT license. The model was trained “to predict the most relevant text snippet, given an image”. In so doing, CLIP learned to identify the similarity between images and text with the vectors the model uses. CLIP maps images and text into the vector space. This allows vectors to compare and find images similar to a text query, or images similar to another image. The advancement of multimodal models like CLIP has made it easier than ever to build a semantic image search engine. Models like CLIP can be used to create “embeddings” that capture semantic information about an image or a text query. Vector embeddings are a type of data representation that converts words, sentences, and other data into numbers that capture their meaning and relationships. Roboflow Inference is a high-performance computer vision inference server. 
Roboflow Inference supports a wide range of state-of-the-art vision models, from YOLO11 for object detection to PaliGemma for visual question answering to CLIP for multimodal embeddings. You can use Roboflow Inference with a Python SDK, or in a Docker environment. In this guide, we will use Inference to calculate CLIP embeddings, then store them in an Elasticsearch cluster for use in building an image retrieval system. Prerequisites To follow this guide, you will need: An Elasticsearch instance that supports KNN search A free Roboflow account Python 3.12+ We have prepared a Jupyter Notebook that you can run on your computer or on Google Colab for use in following along with this guide. Open the notebook . Step #1: Set up an Elasticsearch index with KNN support For this guide, we will use the Elasticsearch Python SDK . You can install it using the following code: If you don’t already have an Elasticsearch cluster set up, refer to the Elasticsearch documentation to get started . Once you have installed the SDK and set up your cluster, create a new Python file and add the following code to connect to your client: To run embedding searches in Elasticsearch, we need an index mapping that contains a dense_vector property type. For this guide, we will create an index with two fields: a dense vector that contains the CLIP embedding associated with an image, and a file name associated with an image. Run the following code to create your index: The output should look similar to the following: The default index type used with KNN search is L2 Norm, also known as Euclidean distance. This distance metric doesn’t work well for CLIP similarity. Thus, above we explicitly say we want to create a cosine similarity index. CLIP embeddings are best compared with cosine similarity. For this guide, we will use a CLIP model with 512 dimensions. If you use a different CLIP model, make sure that you set the dims value to the number of dimensions of the vector returned by the CLIP model. Step #2: Install Roboflow Inference Next, we need to install Roboflow Inference and supervision, a tool for working with vision model predictions. You can install the required dependencies using the following command: This will install both Roboflow Inference and the CLIP model extension that we will use to compute vectors. With Roboflow Inference installed, we can start to compute and store CLIP embeddings. Step #3: Compute and store CLIP embeddings For this guide, we are going to build a semantic search engine for the COCO 128 dataset. This dataset contains 128 images sampled from the larger Microsoft COCO dataset. The images in COCO 128 are varied, making it an ideal dataset for use in testing our semantic search engine. To download COCO 128, first create a free Roboflow account . Then, navigate to the COCO 128 dataset page on Roboflow Universe , Roboflow’s open computer vision dataset community. Click “Download Dataset”: Choose the “YOLOv8” format. Choose the option to show a download code: Copy the terminal command to download the dataset. The command should look something like this: When you run the command, the dataset will be downloaded to your computer and unzipped. We can now start computing CLIP embeddings. Add the following code from your Python file from earlier, then run the full file: This code will loop through all images in the train split of the COCO 128 dataset and run them through CLIP with Roboflow Inference. We then index the vectors in Elasticsearch alongside the file names related to each vector. 
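The accompanying notebook carries the complete code; the sketch below only illustrates the two steps just described, creating the cosine-similarity dense_vector index and looping over the downloaded images to index an embedding per file. The index name, field names, and dataset path are assumptions, and the embedding function is a placeholder standing in for the CLIP call served by Roboflow Inference.

```python
from pathlib import Path
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # adjust URL and credentials for your cluster

# 512-dim dense_vector with explicit cosine similarity, as discussed above.
es.indices.create(
    index="coco-images",
    mappings={
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 512,
                "index": True,
                "similarity": "cosine",
            },
            "file_name": {"type": "keyword"},
        }
    },
)

def clip_embedding(image_path: str) -> list[float]:
    """Stand-in for the CLIP embedding request served by Roboflow Inference;
    swap in the real call for your deployment."""
    raise NotImplementedError

# Loop over the train split of the downloaded dataset (folder name assumed).
for path in Path("COCO-128-2/train/images").glob("*.jpg"):
    es.index(
        index="coco-images",
        document={"embedding": clip_embedding(str(path)), "file_name": path.name},
    )
```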
It may take 1-2 minutes for the CLIP model weights to download. Your script will pause temporarily while this is done. The CLIP model weights are then cached on your system for future use. Note: When you run the code above, you may see a few warnings related to ExecutionProviders. This relates to the optimizations available in Inference for different devices. For example, if you deploy on CUDA the CoreMLExecutionProvide will not be available so a warning is raised. No action is required when you see these warnings. Step #4: Retrieve data from Elasticsearch Once you have indexed your data, you are ready to run a test query! To use a text as an input, you can use this code to retrieve an input vector for use in running a search: To use an image as an input, you can use this code: For this guide, let’s run a text search with the query “coffee”. We are going to use a k-nearest neighbours (KNN) search. This search type accepts an input embedding and finds values in our database whose embeddings are similar to the input. KNN search is commonly used for vector comparisons. KNN search always returns the top k nearest neighbours. If k = 3, Elasticsearch will return the three most similar documents to the input vector. With Elasticsearch, you can retrieve results from a large vector store in milliseconds. We can run a KNN search with the following code: The k value above indicates how many of the nearest vectors should be retrieved from each shard. The size parameter of a query determines how many results to return. Since we are working with one shard for this demo, the query will return three results. Our code returns: We have successfully run a semantic search and found images similar to our input query! Above, we can see the three most similar images: a photo of a coffee cup and a cake on a table outdoors, then two duplicate images in our index with coffee cups on tables. Conclusion With Elasticsearch and the CLIP features in Roboflow Inference, you can create a multimodal search engine. You can use the search engine for image retrieval, image comparison and deduplication, multimodal Retrieval Augmented Generation with visual prompts, and more. Roboflow uses Elasticsearch and CLIP extensively at scale. We store more than 100 million CLIP embeddings and index them for use in multimodal search for our customers who want to search through their datasets at scale. Through the growth of data on our platform from hundreds of images to hundreds of millions, Elasticsearch has scaled seamlessly. To learn more about using Roboflow Inference, refer to the Roboflow Inference documentation . To find data for your next computer vision project, check out Roboflow Universe . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction to CLIP and Roboflow Inference Prerequisites Step #1: Set up an Elasticsearch index with KNN support Step #2: Install Roboflow Inference Step #3: Compute and store CLIP embeddings Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Build a multimodal image retrieval system using KNN search and CLIP embeddings - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/multimodal-image-retrieval-with-roboflow","meta_description":"Learn how to build a semantic image search engine using CLIP embeddings and Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds Azure AI Studio support Elasticsearch open inference API now supports Azure AI Studio. Learn how to use Azure AI Studio capabilities with Elasticsearch in this blog. Integrations Generative AI Vector Database How To MH By: Mark Hoy On May 22, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. As part of our ongoing commitment to serve the Microsoft Azure developers with the tools of their choice, we are happy to announce that Elasticsearch now provides integration of the hosted model catalog on Microsoft Azure AI Studio into our open inference API. This complements the ability for developers to bring their Elasticsearch vector database to be used in Azure OpenAI . Developers can use the capabilities of the world's most downloaded vector database to store and utilize embeddings generated from OpenAI models from Azure AI studio or access the wide array of chat completion model deployments for quick access to conversational models like mistral-small . Just recently we've added support for Azure OpenAI text embeddings and completion , and now we've added support for utilizing Azure AI Studio. Microsoft Azure developers have complete access to Azure OpenAI & Microsoft Azure AI Studio service capabilities and can bring their Elasticsearch data to revolutionize conversational search . Let's walk you through just how easily you can use these capabilities with Elasticsearch. 
Deploying a model in Azure AI Studio To get started, you'll need a Microsoft Azure subscription as well as access to Azure AI Studio . Once you are set up, you'll need to deploy either a text embedding model or a chat completion model from the Azure AI Studio model catalog . Once your model is deployed, on the deployment overview page take note of the target URL and your deployment's API key - you'll need these later to create your inference endpoint in Elasticsearch. Furthermore, when you deploy your model, Azure offers two different types of deployment options - a “pay as you go” model (where you pay by the token), and a “realtime” deployment which is a dedicated VM that is billed by the hour. Not all models will have both deployment types available, so be sure to take note as well as which deployment type is used. Creating an Inference API Endpoint in Elasticsearch Once your model is deployed, we can now create an endpoint for your inference task in Elasticsearch. For the examples below we are using the Cohere Command R model to perform chat completion. In Elasticsearch, create your endpoint by providing the service as “azureaistudio”, and the service settings including your API key and target from your deployed model. You'll also need to provide the model provider, as well as the endpoint type from before (either “token” or “realtime”). In our example, we've deployed a Cohere model with a token type endpoint. When you send Elasticsearch the command, it should return back the created model to confirm that it was successful. Note that the API key will never be returned and is stored in Elasticsearch's secure settings. Adding a model for using text embeddings is just as easy. For reference, if we had deployed the Cohere-embed-v3-english model , we can create our inference model in Elasticsearch with the “text_embeddings” task type by providing the appropriate API key and target URL from that deployment's overview page: Let's perform some inference That's all there is to setting up your model. Now that that's out of the way, we can use the model. First, let's test the model out by asking it to provide some text given a simple prompt. To do this, we'll call the _inference API with our input text: And we should see Elasticsearch provide a response. Behind the scenes, Elasticsearch is calling out to Azure AI Studio with the input text and processes the results from the inference. In this case, we received the response: We've tried to make it easy for the end user to not have to deal with all the technical details behind the scenes, but we can also control our inference a bit more by providing additional parameters to control the processing such as sampling temperature and requesting the maximum number of tokens to be generated: That was easy. What else can we do? This becomes even more powerful when we are able to use our new model in other ways such as adding additional text to a document when it's used in an Elasticsearch ingestion pipeline. For example, the following pipeline definition will use our model and anytime a document using this pipeline is ingested, any text in the field “question_field” will be sent through the inference API and the response will be written to the “completed_text_answer” field in the document. This allows large batches of documents to be augmented. 
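For readers who prefer raw HTTP over the Kibana console, the sketch below approximates the two calls described above with Python's requests library: creating the azureaistudio inference endpoint, then asking it for a completion with a couple of task settings. The endpoint name, credentials, API key, target URL, and task settings are placeholders; double-check the field names against the current open inference API reference for your Elasticsearch version.

```python
import requests

ES_URL = "http://localhost:9200"   # adjust URL and auth for your deployment
AUTH = ("elastic", "<password>")

# Create the inference endpoint. The API key, target URL, provider, and
# endpoint type come from your Azure AI Studio deployment overview page.
requests.put(
    f"{ES_URL}/_inference/completion/azure_ai_studio_completion",
    auth=AUTH,
    json={
        "service": "azureaistudio",
        "service_settings": {
            "api_key": "<azure-ai-studio-api-key>",
            "target": "<deployment-target-url>",
            "provider": "cohere",
            "endpoint_type": "token",
        },
    },
)

# Ask the deployed model for a completion, optionally controlling sampling
# temperature and the maximum number of generated tokens.
response = requests.post(
    f"{ES_URL}/_inference/completion/azure_ai_studio_completion",
    auth=AUTH,
    json={
        "input": "The answer to the universe is",
        "task_settings": {"temperature": 1.0, "max_new_tokens": 50},
    },
)
print(response.json())
```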
Limitless possibilities By harnessing the power of Azure AI Studio deployed models in your Elasticsearch inference pipelines, you can enhance your search experience's natural language processing and predictive analytics capabilities. In upcoming versions of Elasticsearch, users can take advantage of new field mapping types that simplify the process even further where designing an ingest pipeline would no longer be necessary. Also, as alluded to in our accelerated roadmap for semantic search the future will provide dramatically simplified support for inference tasks with Elasticsearch retrievers at query time. These capabilities are available through the open inference API in our stateless offering on Elastic Cloud. It'll also be soon available to everyone in an upcoming versioned Elasticsearch release. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Deploying a model in Azure AI Studio Creating an Inference API Endpoint in Elasticsearch Let's perform some inference That was easy. What else can we do? Limitless possibilities Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open inference API adds Azure AI Studio support - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-azure-ai-studio-support","meta_description":"Elasticsearch open inference API now supports Azure AI Studio. 
Learn how to use Azure AI Studio capabilities with Elasticsearch in this blog."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API for Google AI Studio Elasticsearch open inference API adds support for Google AI Studio Integrations Python How To JV By: Jeff Vestal On September 27, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch's open inference API supports the Gemini Developer API. When using Google AI Studio, developers can chat with data in their Elasticsearch indexes, run experiments, and build applications using Google Cloud’s models, such as Gemini 1.5 Flash. AI Studio is where Google releases the latest models from Google DeepMind and is the fastest way to start building with Gemini. In this blog, we will create a new Google AI Studio project, create an Elasticsearch inference API endpoint to use Gemini 1.5 Flash, and implement a sample chat app to estimate how many ducks fit on an American football field! (Because why not?) AI Studio API key To get started, we need to create an API key for AI Studio. Head over to ai.google.dev/aistudio and click “Sign In to Google AI Studio.” If you aren’t already logged in, you will be prompted to do so. Once logged in, you are presented with two options: using AI Studio in the browser to test prompts with Gemini or creating an API key. We will create an API key to allow Elasticsearch to connect to AI Studio. You are prompted to accept Google Cloud’s terms and conditions the first time you create an API key. If you use a personal account, you will be given the option to create an API key in a new project. You may not see that option if you use an enterprise account, depending on your access roles. Either way, you can select an existing project to create the key. Select an existing project or create a new project. Copy the generated API key someplace safe for use in the next section. Elasticsearch Inference API We will use Python to configure the Inference API to connect to Google AI Studio and test the chat completion with Gemini. Create the Inference Endpoint Create an Elasticsearch connection. Create the Inference Endpoint to connect to Google AI Studio. For this blog, we will use the Gemini 1.5 Flash model. For a list of available models, consult the Gemini docs. Confirm the endpoint was created. The output should be similar to: Chat time! That's all it takes to create an Elasticsearch API Endpoint to access Google AI Studio! With that done, you can start using it. We will ask it to estimate how many ducks fit on an American football field. Why? Why not. Response Simple and powerful, at the same time With the addition of Google AI Studio , the Elastic open inference API provides access to a growing choice of powerful generative AI capabilities for developers. Google AI Studio is designed to enable simple, quick generative AI experiments to test your best ideas. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. 
EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to AI Studio API key Elasticsearch Inference API Create the Inference Endpoint Chat time! Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open inference API for Google AI Studio - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/google-ai-studio-elasticsearch-open-inference-api","meta_description":"Elasticsearch open inference API supports Google AI Studio. Learn how to create a Google AI Studio project, an Elasticsearch inference API endpoint and a chat app. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Semantic reranking & the Elastic Rerank model Introducing the concept of semantic reranking and Elastic Rerank, Elastic's new semantic re-ranker model. Part1 ML Research Search Relevance October 29, 2024 What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Part2 ML Research November 25, 2024 Introducing Elastic Rerank: Elastic's new semantic re-ranker model Learn about how Elastic's new re-ranker model was trained and how it performs. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Part3 ML Research December 5, 2024 Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Semantic reranking & the Elastic Rerank model - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/semantic-reranking-and-the-elastic-rerank-model","meta_description":"Introducing the concept of semantic reranking and Elastic Rerank, Elastic's new semantic re-ranker model."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Vector search in Elasticsearch: The rationale behind the design In this blog, you'll learn how vector search has been integrated into Elasticsearch and the trade-offs that we made. Vector Database ML Research AG By: Adrien Grand On July 24, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Are you interested to learn about the characteristics of Elasticsearch for vector search and what the design looks like? As always, design decisions come with pros and cons. This blog aims to break down how we chose to build vector search in Elasticsearch. Vector search is integrated in Elasticsearch through Apache Lucene Some background about Lucene first: Lucene organizes data into immutable segments that are merged periodically. Adding more documents requires adding more segments. Modifying existing documents requires atomically adding more segments and marking previous versions of these documents as deleted. Every document within a segment is identified by a doc ID, which is the index of this document within the segment, similar to indices of an array. The motivation for this approach is managing inverted indices, which aren't good at in-place modifications but can be merged efficiently. In addition to inverted indices, Lucene also supports stored fields (a document store), doc values (columnar storage), term vectors (per-document inverted indices), and multi-dimensional points in its segments. Vectors have been integrated the same way: New vectors get buffered into memory at index time. These in-memory buffers get serialized as part of segments when the size of the index-time buffer is exceeded or when changes must be made visible. Segments get periodically merged together in the background in order to keep the total number of segments under control and limit the overall per-segment search-time overhead. Since they are part of segments, vectors need to be merged too. Searches must combine top vector hits across all segments in the index. Searches on vectors must look at the set of live documents in order to exclude documents that are marked as deleted. The system above is driven by the way that Lucene works. Lucene currently uses the hierarchical navigable small world (HNSW) algorithm to index vectors. 
At a high level, HNSW organizes vectors into a graph where similar vectors are likely to be connected. HNSW is a popular choice for vector search because it is rather simple, performs well on comparative benchmarks for vector search algorithms, and supports incremental insertions. Lucene's implementation of HNSW follows Lucene's guideline of keeping the data on disk and relying on the page cache to speed up access to frequently accessed data. Approximate vector search is exposed in Elasticsearch's _search API through the knn section . Using this feature will directly leverage Lucene's vector search capabilities. Vectors are also integrated in Elasticsearch's scripting API, which allows performing exact brute-force search , or leveraging vectors for rescoring. Let's now dive into the pros and cons of integrating vector search through Apache Lucene. Cons The main cons of taking advantage of Apache Lucene for vector search come from the fact that Lucene ties vectors to segments. However, as we will see later in the pros section, tying vectors to segments is also what enables major features such as efficient pre-filtering, efficient hybrid search, and visibility consistency, among others. Merges need to recompute HNSW graphs Segment merges need to take N input segments, typically 10 with the default merge policy, and merge them into a single segment. Lucene currently creates a copy of the HNSW graph from the largest input segment that doesn't have deletes and then adds vectors from other segments to this HNSW graph. This approach incurs index-time overhead as segments are merged compared to mutating a single HNSW graph in-place over the lifetime of the index. Searches need to combine results from multiple segments Because an index is composed of multiple segments, searches need to compute the top-k vectors on every segment and then merge these per-segment top-k hits into global top-k hits. The impact on latency may be mitigated by searching segments in parallel, but this approach still incurs some overhead compared to searching a single HNSW graph. RAM needs to scale with the size of the data set to retain optimal performance Traversing the HNSW graph incurs lots of random access. To perform efficiently, data sets should fit into the page cache, which requires sizing RAM based on the size of the vector data set that is managed. There exist other algorithms than HNSW for vector search that have more disk-friendly access patterns, though they come with other downsides, like higher query latency or worse recall. Pros Data sets can scale beyond the total RAM size Because data is stored on disk, Elasticsearch will allow data sets that are larger than the total amount of RAM that is available on the local host, and performance will degrade as the ratio of the HNSW data that can fit in the page cache decreases. As described in the previous section, performance-savvy users will need to scale RAM size with the size of the data set to retain optimal performance. Lock-free search Systems that update data structures in place generally need to take locks in order to guarantee thread safety under concurrent indexing and search. Lucene's segment-based indices never require taking locks at search time, even in the case of concurrent indexing. Instead, the set of segments that the index is composed of is atomically updated on a regular basis. Support for incremental changes New vectors may be added, removed, or updated at any time. 
Some other approximate nearest-neighbor search algorithms require being fed with the entire data set of vectors. Then once all the vectors are provided, an index training step is executed. For these other algorithms, any significant update to the vector data set requires the training step to be completed again, and this can get computationally expensive. Visibility consistency with other data structures A benefit of integrating at such a low level into Lucene is that we get consistency with other data structures out of the box when looking at a point-in-time view of the index. If you perform an update of a document to update both its vector and some other keyword field, then concurrent searches are guaranteed to either see the old value of the vector field and the old value of the keyword field — if the point-in-time view was created prior to the update — or the new value of the vector field and the new value of the keyword field — if the point-in-time view was created after the update. Likewise for deletes, if a document gets marked as deleted, then either all data structures including the vector store will ignore it, or they will see it if they operate on a point-in-time view that was created prior to the deletion. Incremental snapshots The fact that vectors are part of segments helps snapshots remain incremental by taking advantage of the fact that two subsequent snapshots usually share the majority of their segments, especially the bigger ones. Incremental snapshots would not be possible with a single HNSW graph that gets mutated in-place. Filtering and hybrid support Integrating directly into Lucene also makes it possible to integrate efficiently with other Lucene features, such as pre-filtering vector searches with an arbitrary Lucene filter or combining hits coming from a vector query with hits coming from a traditional full-text query. By having its own HNSW graph that is tied to a segment and where nodes are indexed by doc ID, Lucene can make interesting decisions about how best to pre-filter vector searches: either by linearly scanning documents that match the filter if it is selective, or by traversing the graph and only considering nodes that match the filter as candidates for top-k vectors otherwise. Compatibility with other features Because the vector store is like any other Lucene data structure, many features are compatible with vectors and vector search automatically, including: Aggregations Document-level security Field-level security Index sorting Access to vectors through scripts (e.g., from a script_score query or a reranker) Looking ahead: Separation of indexing and search As discussed in another blog , future versions of Elasticsearch will run indexing and search workloads on different instances. The implementation will essentially look as if you were continuously creating snapshots on indexing nodes and restoring them on search nodes. This will help prevent the high cost of vector indexing from impacting searches. Such a separation of indexing and search wouldn't be possible with a single shared HNSW graph instead of multiple segments, short of sending the full HNSW graph over the wire every time changes need to be reflected on new searches. Conclusion In general, Elasticsearch provides excellent vector search capabilities that are integrated with other Elasticsearch features: Vector searches can be pre-filtered by any supported filter, including the most sophisticated ones. Vector hits can be combined with hits of arbitrary queries. 
Vector searches are compatible with aggregations, document-level security, field-level security, index sorting, and more. Indices that contain vectors still obey the same semantics as other indices, including for the _refresh, _flush and _snapshot APIs. They will also support separation of indexing and search in stateless Elasticsearch. This is done at the expense of some index-time and search-time overhead. That said, vector search still typically runs in the order of tens or hundreds of milliseconds and is much faster than a brute-force exact search. More generally, both the index-time and search-time overheads seem manageable compared to other vector stores in existing comparative benchmarks * (look for the \"luceneknn\" line). We also believe that a lot of the value of vector search gets unlocked through the ability to combine vector search with other functionality. Furthermore, we recommend checking out our tuning guide for KNN search , which lists a number of measures that help mitigate the negative impact of the aforementioned cons. I hope you enjoyed this blog. Don't hesitate to reach out via Discuss if you have questions. And feel free to try out vector search in your existing deployment, or spin up a free trial of Elasticsearch Service on Elastic Cloud (which always has the latest version of Elasticsearch). *At the time of this writing, these benchmarks do not yet take advantage of vectorization. For more information on vectorization, read this blog . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to Vector search is integrated in Elasticsearch through Apache Lucene Cons Merges need to recompute HNSW graphs Searches need to combine results from multiple segments RAM needs to scale with the size of the data set to retain optimal performance Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Vector search in Elasticsearch: The rationale behind the design - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vector-search-elasticsearch-rationale","meta_description":"In this blog, you'll learn how vector search has been integrated into Elasticsearch and the trade-offs that we made."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Implementing image search: vector search via image processing in Elasticsearch Learn how to implement image search with an example. This blog covers how to use vector search through image processing in Elasticsearch. Vector Database AS By: Alex Salgado On November 8, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Case study: Finding your puppy with image search Have you ever been in a situation where you found a lost puppy on the street and didn’t know if it had an owner? Using vector search through image processing in Elasticsearch, this task can be as simple as reading a comic strip. Imagine this scene: On a tumultuous afternoon, Luigi, a small and lively puppy, found himself wandering alone through busy streets after accidentally slipping out of his leash during a walk around Elastic. His desperate owner was searching for him at every corner, calling his name with a voice full of hope and anxiety. Meanwhile, somewhere in the city, an attentive person noticed the puppy with a lost expression and decided to help. Quickly, they took a photo of Luigi and, using the vector image search technology of the company they worked for, began a search in the database hoping to find some clue about the owner of the little runaway. If you want to follow and execute the code while reading, access the file Python code running on a Jupyter Notebook (Google Collab) . The architecture We'll solve this problem using a Jupyter Notebook. First, we download the images of the puppies to be registered, and then we install the necessary packages. *Note: To implement this sample, we will need to create an index in Elasticsearch before populating our vector database with our image data. Begin by deploying Elasticsearch (we have a 14-day free trial for you) . During the process, remember to store the credentials (username, password) to be used in our Python code. For simplicity, we will use Python code running on a Jupyter Notebook (Google Colab). Download the code zip file and install the necessary packages Let's create 4 classes to assist us in this task, and they are: Util class : responsible for handling preliminary tasks and Elasticsearch index maintenance. Dog class : responsible for storing the attributes of our little dogs. DogRepository class : responsible for data persistence tasks. DogService class : it will be our service layer. 
Util class The Util class provides utility methods for managing the Elasticsearch index, such as creating and deleting the index. Methods: create_index() : Creates a new index in Elasticsearch. delete_index() : Deletes an existing index from Elasticsearch. Dog class The Dog class represents a dog and its attributes, such as ID, image path, breed, owner name, and image embeddings. Attributes dog_id : The dog's ID. image_path : The path to the dog's image. breed : The dog's breed. owner_name : The dog's owner's name. image_embedding : The dog's image embedding. Methods __init__() : Initializes a new Dog object. generate_embedding() : Generates the dog's image embedding. to_dict() : Converts the Dog object to a dictionary. DogRepository Class The DogRepository class provides methods for persisting and retrieving dog data from Elasticsearch. Methods insert() : Inserts a new dog into Elasticsearch. bulk_insert() : Inserts multiple dogs into Elasticsearch in bulk. search_by_image() : Searches for similar dogs by image. DogService Class The DogService class provides business logic for managing dog data, such as inserting and searching for dogs. Methods insert_dog() : Inserts a new dog into Elasticsearch. search_dogs_by_image() : Searches for similar dogs by image. The classes presented above provide a solid foundation for building a dog data management system. The Util class provides utility methods for managing the Elasticsearch index. The Dog class represents the attributes of a dog. The DogRepository class offers methods for persisting and retrieving dog data from Elasticsearch. The DogService class provides the business logic for efficient dog data management. The main code We'll basically have 2 main flows or phases in our code: Register the Dogs with basic information and image. Perform a search using a new image to find the Dog in the vector database. Phase 01: Registering the Puppy To store the information about Luigi and the other company's little dogs, we'll use the Dog class. For this purpose, let's code the sequence: Start registering the puppies Output Registering Luigi Registering all the others puppies Visualizing the new dogs Output Phase 02: Finding the lost dog Now that we have all the little dogs registered, let's perform a search. Our developer took this picture of the lost puppy. Output Let's see if we find the owner of this cute little puppy? Get the results Let's see what we found... Output Voilà!! We found it!!! But who will be the owner and their name? Output Luigi Jack Russel/Rat Terrier Ully Happy end We found Luigi !!! Let's notify Ully. Output In no time, Ully and Luigi were reunited. The little puppy wagged his tail with pure delight, and Ully hugged him close, promising to never let him out of her sight again. They had been through a whirlwind of emotions, but they were together now, and that was all that mattered. And so, with hearts full of love and joy, Ully and Luigi lived happily ever after. Conclusion In this blog post, we have explored how to use vector search to find a lost puppy using Elasticsearch. We have demonstrated how to generate image embeddings for dogs, index them in Elasticsearch, and then search for similar dogs using a query image. This technique can be used to find lost pets, as well as to identify other objects of interest in images. Vector search is a powerful tool that can be used for a variety of applications. 
It is particularly well-suited for tasks that require searching for similar objects based on their appearance, such as image retrieval and object recognition. We hope that this blog post has been informative and that you will find the techniques we have discussed to be useful in your own projects. Resources Elasticsearch Guide Elasticsearch Python client Hugging Face - Sentence Transformers What is vector search? | Elastic Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Case study: Finding your puppy with image search The architecture Download the code zip file and install the necessary packages Util class Dog class Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Implementing image search: vector search via image processing in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/implementing-image-search-with-elasticsearch","meta_description":"Learn how to implement image search with an example. This blog covers how to use vector search through image processing in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Chunking large documents via ingest pipelines plus nested vectors equals easy passage search Learn how to chunk large documents using ingest pipelines and nested vectors in Elasticsearch for easy passage search in vector search. 
Vector Database How To MH By: Michael Heldebrant On November 15, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search is a powerful way to search data based on meaning rather than exact or inexact token matching techniques. However, the text embedding models that power vector search can only process short passages of text, on the order of several sentences, unlike BM25-based techniques that can work on arbitrarily large amounts of text. Combining large documents seamlessly with vector search is now possible with Elasticsearch. How does it work at a high level? The combination of Elasticsearch features such as ingest pipelines, the flexibility of a script processor and new support for nested documents with dense_vectors allows for a straightforward way to chunk large documents at ingest time into passages small enough that they can then be processed by text embedding models to generate all the vectors needed to represent the full meaning of the large documents. Ingest your document data as you would normally, and add to your ingest pipeline a script processor to break the large text data into an array of sentence or other types of chunks, followed by a for_each processor to run an inference processor on each chunk. Mappings for the index are defined such that the array of chunks is set up as a nested object with a dense_vector mapping as a subobject, which will then properly index each of the vectors and make them searchable. How to chunk large documents via ingest pipelines & nested vectors Load a text embedding model The first thing you will need is a model to create the text embeddings out of the chunks. You can use whatever you would like, but this example will run end to end on the all-distilroberta-v1 model. With an Elastic Cloud cluster created or another Elasticsearch cluster ready, we can upload the text embedding model using the eland library. Mappings example The next step is to prepare the mappings to handle the array of sentences and vector objects that will be created during the ingest pipeline. For this particular text embedding model the dimensions are 384 and dot_product similarity will be used for nearest neighbor calculations: Ingest pipeline examples The last preparation step is to define an ingest pipeline to break up the body_content field into chunks of text stored in the passages field. This pipeline has two processors: the first, a script processor, breaks up the body_content field into an array of sentences stored in the passages field via a regular expression. For further research, read up on advanced regular expression features such as negative lookbehind and positive lookbehind to understand how it tries to properly split on sentence boundaries, not split on Mr. or Mrs. or Ms., and keep the punctuation with the sentence. It also tries to concatenate the sentence chunks back together as long as the total string length is under the parameter passed to the script. 
The next processor, a for_each processor, runs the text embedding model on each sentence via an inference processor: Add some documents Now we can add documents with large amounts of text in body_content and automatically have them chunked, and each chunk text embedded into vectors by the model: Search those documents To search the data and return the chunk that matched the query best, you use inner_hits with the knn clause to return just that best matching chunk of the document in the hits output from the query: This will return the best document and the relevant portion of the larger document text: Review The approach used here shows the power of leveraging the different capabilities of Elasticsearch to solve a larger problem. Ingest pipelines allow you to preprocess your documents before indexing, and while there are many processors that do specific targeted tasks, sometimes you need the power of a scripting language to be able to do things like break up text into an array of sentences. Because you can access the document before it is indexed, you have the ability to remake the data in nearly any fashion you can imagine, as long as all the information is within the document itself. The foreach processor allows us to wrap something that may run zero to N times without knowing in advance how many times it needs to execute. In this case we are using it to run the inference processor over as many sentences as we extract, to make vectors. The mappings of the index use a nested object to handle the array of text-and-vector objects that did not exist in the original document, indexing the data in a way that lets us properly search the document. Using knn with nested support for vectors allows the use of inner_hits to present the best scoring portion of the document, which can substitute for what would usually be done via highlighting in a BM25 query. Conclusion This hopefully shows how Elasticsearch can do what it does best: just bring your data and Elasticsearch will make it searchable for you. Take your skills to the next level and learn how to implement a recursive chunking strategy by watching this video. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to How does it work at a high level? 
How to chunk large documents via ingest pipelines & nested vectors Load a text embedding model Mappings example Ingest pipeline examples Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Chunking large documents via ingest pipelines plus nested vectors equals easy passage search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/chunking-via-ingest-pipelines","meta_description":"Learn how to chunk large documents using ingest pipelines and nested vectors in Elasticsearch for easy passage search in vector search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Steps to improve search relevance In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval. Generative AI GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey On July 13, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Since 8.0 and the release of third-party natural language processing (NLP) models for text embeddings, users of the Elastic Stack have access to a wide variety of models to embed their text documents and perform query-based information retrieval using vector search. Given all these components and their parameters, and depending on the text corpus you want to search in, it can be overwhelming to choose which settings will give the best search relevance. In this series of blog posts, we will introduce a number of tests we ran using various publicly available data sets and information retrieval techniques that are available in the Elastic Stack. We’ll then provide recommendations of the best techniques to use depending on the setup. To kick off this series of blogs, we want to set the stage by describing the problem we are addressing and describe some methods we will dig further into in subsequent blogs. Background and terminology BM25: A sparse, unsupervised model for lexical search The classic way documents are ranked for relevance by Elasticsearch according to a text query uses the Lucene implementation of the Okapi BM25 model. 
Although a few hyperparameters of this model were fine-tuned to optimize the results in most scenarios, this technique is considered unsupervised as labeled queries and documents are not required to use it: it’s very likely that the model will perform reasonably well on any corpus of text, without relying on annotated data. BM25 is known to be a strong baseline in zero-shot retrieval settings. Under the hood, this kind of model builds a matrix of term frequencies (how many times a term appears in each document) and inverse document frequencies (inverse of how many documents contain each term). It then scores each query term for each document that was indexed based on those frequencies. Because each document typically contains a small fraction of all words used in the corpus, the matrix contains a lot of zeros. This is why this type of representation is called sparse . Also, this model sums the relevance score of each individual term within a query for a document, without taking into account any semantic knowledge (synonyms, context, etc.). This is called lexical search (as opposed to semantic search). Its shortcoming is the so-called vocabulary mismatch problem, that query vocabulary is slightly different to the document vocabulary. This motivates other scoring models that try to incorporate semantic knowledge to avoid this problem. Dense models: A dense, supervised model for semantic search More recently, transformer-based models have allowed for a dense, context aware representation of text, addressing the principal shortcomings mentioned above. To build such models, the following steps are required: 1. Pre-training We first need to train a neural network to understand the basic syntax of natural language. Using a huge corpus of text, the model learns semantic knowledge by training on unsupervised tasks (like Masked Word Prediction or Next Sentence Prediction). BERT is probably the best known example of these models — it was trained on Wikipedia (2.5B words) and BookCorpus (800M words) using Masked Word Prediction. This is called pre-training . The model learns vector representations of language tokens, which can be adapted for other tasks with much less training. Note that at this step, the model wouldn’t perform well on downstream NLP tasks. This step is very expensive, but many such foundational models exist that can be used off the shelf. 2. Task-specific training Now that the model has built a representation of natural language, it’ll train much more effectively on a specific task such as Dense Passage Retrieval (DPR) that allows Question Answering. To do so, we must slightly adapt the model’s architecture and then train it on a large number of instances of the task, which, for DPR, consists in matching a relevant passage taken from a relevant document. So this requires a labeled data set, that is, a collection of triplets : A query: \"What is gold formed in?\" A document or passage taken from a document: \"The core of large stars, especially during a nova\" Optionally, a score of degree of relevance for this (query, document) pair (If no score is given, we assume that the score is binary, and that all the other documents can be considered as irrelevant for the given query.) A very popular and publicly available data set to perform such a training for DPR is the MS MARCO data set. This data set was created using queries and top results from Microsoft’s Bing search engine. 
As such, the queries and documents it contains fall in the general-knowledge linguistic domain, as opposed to a specific linguistic domain (think about research papers or language used in law). This notion of linguistic domain is important, as the semantic knowledge learned by those models gives them an important advantage “in-domain”: when BERT came out, it improved on previous state-of-the-art models on this MS MARCO data set by a huge margin. 3. Domain-specific training Depending on how different your data is from the data set used for task-specific training, you might need to train your model using a domain-specific labeled data set. This step is also referred to as fine-tuning for domain adaptation, or simply domain adaptation. The good news is that you don’t need as large a data set as was required for the previous steps — a few thousand or tens of thousands of instances of the task can be enough. The bad news is that these query-document pairs need to be built by domain experts, so it’s usually a costly option. Domain adaptation is roughly similar to task-specific training. Having introduced these various techniques, we will measure how they perform on a wide variety of data sets. This sort of general purpose information retrieval task is of particular interest to us. We want to provide tools and guidance for a range of users, including those who don’t want to train models themselves in order to gain some of the benefits they bring to search. In the next blog post of this series, we will describe the methodology and benchmark suite we will be using. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Background and terminology BM25: A sparse, unsupervised model for lexical search Dense models: A dense, supervised model for semantic search Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Improving information retrieval in the Elastic Stack: Steps to improve search relevance - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/improving-information-retrieval-elastic-stack-search-relevance","meta_description":"In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Understanding sparse vector embeddings with trained ML models Learn about sparse vector embeddings, understand what they do/mean, and how to implement semantic search with them. Vector Database Search Relevance How To DS By: Dai Sugimori On February 24, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch provides a semantic search feature that allows users to query in natural language and retrieve relevant information. To achieve this, target documents and queries must first be transformed into vector representations through an embedding process, which is handled by a trained Machine Learning (ML) model running either inside or outside Elasticsearch. Since choosing a good machine learning model is not easy for most search users, Elastic introduced a house-made ML model called ELSER (Elastic Learned Sparse EncodeR). It is bundled with Elasticsearch, so they can use it out-of-the-box under a platinum license (refer to the subscription page ). It has been a widely used model for implementing semantic search. Unlike many other models that generate \"dense vectors,\" ELSER produces \"sparse vectors,\" which represent embeddings differently. Although sparse vector models work in many use cases, only dense vector models could be uploaded to Elasticsearch until Elasticsearch and Eland 8.16. However, starting from version 8.17, you can now upload sparse vector models as well using Eland’s eland_import_hub_model CLI. This means you can generate sparse vector embeddings using models from Hugging Face, not just ELSER. In this article, I’d like to recap what sparse vectors are compared to dense ones as well as introduce how to upload them from outside for use in Elasticsearch. What is the sparse vector? What is the difference between dense and sparse vectors? Let's start with dense vectors, which are more commonly used for search today. Dense vectors When text is embedded as a dense vector, it looks like this: Key characteristics of dense vectors: The vector has a fixed dimension. Each element is a numeric value (float by default). Every element represents some feature, but their meanings are not easily interpretable by humans. Most elements have nonzero values. 
The ML models, especially those based on transformers, will produce geometrically similar vectors if the meanings of the input text are similar. The similarity is calculated by some different functions such as cosine, l2_norm, etc. For example, if we embed the words \"cat\" and \"kitten\", their vectors would be close to each other in the vector space because they share similar semantic meaning. In contrast, the vector for \"car\" would be farther away, as it represents a completely different concept. Elastic has many interesting articles about vector search. Refer to them if you are keen to learn more: A quick introduction to vector search - Elasticsearch Labs Navigating an Elastic vector database - Elasticsearch Labs Vector similarity techniques and scoring - Elasticsearch Labs Sparse vectors In contrast, sparse vectors have a different structure. Instead of assigning a value to every dimension, they mostly consist of zeros, with only a few nonzero values. A sparse vector representation looks like this: {\"f1\":1.2,\"f2\":0.3,… } Key characteristics of sparse vectors: Most values in the vector are zero. Instead of a fixed-length array, they are often stored as key-value pairs, where only nonzero values are recorded. The feature which has zero value never appears in the sparse vector representation. The representation is more interpretable, as each key (feature) often corresponds to a meaningful term or concept. BM25 and sparse vector representation BM25, a well-known ranking function for lexical search, uses a sparse vector representation of text known as a term vector . In this representation, each term (known as token or word) in the document is assigned a weight based on its frequency and importance within the corpus (a set of documents). This approach allows efficient retrieval by matching query terms to document terms. Lexical vs. semantic search BM25 is an example of lexical search , where matching is based on exact terms found in the document. It relies on a sparse vector representation derived from the vocabulary of the corpus. On the other hand, semantic search goes beyond exact term matching. It uses vector embeddings to capture the meaning of words and phrases, enabling retrieval of relevant documents even if they don't contain the exact query terms. In addition, Elasticsearch can do more. You can combine these two searches in one query as a hybrid search. Refer to the links below to learn more about it. When hybrid search truly shines - Elasticsearch Labs Hybrid search with multiple embeddings: A fun and furry search for cats! - Elasticsearch Labs Hybrid Search: Combined Full-Text and kNN Results - Elasticsearch Labs Hybrid Search: Combined Full-Text and ELSER Results - Elasticsearch Labs Tutorial: hybrid search with semantic_text | Elasticsearch Guide | Elastic Dense and sparse vectors in semantic search Semantic search can leverage both dense and sparse vectors: Dense vector models (e.g., BERT-based models) encode semantic meaning into fixed-length vectors, enabling similarity search based on vector distances. Sparse vector models (e.g., ELSER) capture semantic meaning while preserving interpretability, often leveraging term-based weights. With Elasticsearch 8.17, you can now upload and use both dense and sparse vector models, giving you more flexibility in implementing semantic search tailored to your needs. Why sparse vectors? Sparse vectors offer several advantages in Elasticsearch, making them a powerful alternative to dense vectors in certain scenarios. 
For example, dense vector (knn) search requires the models to be learned with a good/enough corpus of the domain that users are working with. But it is not always easy to have such a model which is suitable for your use case, and fine-tuning the model is even harder for most users. In this case, the sparse vector models will help you. Let’s learn why. Good for zero-shot learning Sparse vectors, especially those generated by models like ELSER, can generalize well to new domains without requiring extensive fine-tuning. Unlike dense vector models that often need domain-specific training, sparse vectors rely on term-based representations, making them more effective for zero-shot retrieval—where the model can handle queries it hasn’t explicitly been trained on. Resource efficiency Sparse vectors are inherently more resource-efficient than dense vectors. Since they contain mostly zeros and only store nonzero values as key-value pairs, they require less memory and storage. Sparse vectors in Elasticsearch Elasticsearch initially supported sparse vector search using the rank_features query. However, with recent advancements, sparse vector search is now natively available through the sparse_vector query, providing better integration with Elasticsearch’s machine learning models. Integration with ML models The sparse_vector query is designed to work seamlessly with trained models running on Elasticsearch ML nodes. This allows users to generate sparse vector embeddings dynamically and retrieve relevant documents using efficient similarity search. Leveraging Lucene’s inverted index One of the key benefits of sparse vectors in Elasticsearch is that they leverage Lucene’s inverted index —the same core technology that powers Elasticsearch’s fast and scalable search. Resource efficiency: Since Lucene is optimized for term-based indexing, sparse vectors benefit from efficient storage and retrieval. Maturity: Elasticsearch has a well-established and highly optimized indexing system, making sparse vector search a natural fit within its architecture. By utilizing Lucene’s indexing capabilities, Elasticsearch ensures that sparse vector search remains fast, scalable, and resource-efficient , making it a strong choice for real-world search applications. Implementing sparse embedding with a preferred model from Hugging Face Starting from Elasticsearch 8.17, you can use any sparse vector model from Hugging Face as long as it employs a supported tokenization method. This allows greater flexibility in implementing semantic search with models tailored to your specific needs. Elasticsearch currently supports the following tokenization methods for sparse and dense vector embeddings: bert – For BERT-style models deberta_v2 – For DeBERTa v2 and v3-style models mpnet – For MPNet-style models roberta – For RoBERTa-style and BART-style models xlm_roberta – For XLM-RoBERTa-style models bert_ja – For BERT-style models trained specifically for Japanese For a full list of supported tokenization methods, refer to the official documentation: Elasticsearch Reference: PUT Trained Models If the tokenization of your model is available, you can select it even if it’s not for non-English languages! Below are some examples of available sparse models on Hugging Face: naver/splade-v3-distilbert hotchpotch/japanese-splade-v2 aken12/splade-japanese-v3 Steps to use a sparse vector model from Hugging Face We already have a good article about sparse vector search with ELSER. 
Most of the steps are the same, but if you’d like to use a sparse embedding model from Hugging Face, you need to upload it to Elasticsearch beforehand with Eland. Here is the step-by-step guide to using an external sparse model in Elasticsearch for semantic search. 1. Find a sparse vector model Browse Hugging Face ( huggingface.co ) for a sparse embedding model that fits your use case. Ensure that the model uses one of the supported tokenization methods listed above. Let’s select the “ naver/splade-v3-distilbert ” model as an example of a sparse embedding model. Note: Elastic’s ELSER model is heavily inspired by Naver’s SPLADE model. Visit their website to learn more about SPLADE. 2. Upload the model to Elasticsearch You need to install Eland, a Python client and toolkit for DataFrames and machine learning in Elasticsearch. Note that you need Eland 8.17.0 or later for uploading sparse vector models. Once it is installed on your computer, use Eland’s CLI tool ( eland_import_hub_model ) to import the model into Elasticsearch. Alternatively, you can do the same with Docker if you don’t want to install Eland locally. The key point here is that you need to set text_expansion as the task type for sparse vector embeddings, unlike text_embedding for dense vector embeddings. (JFYI, there is a discussion about the task name.) 3. Define index mapping with sparse_vector Create an index that has a sparse_vector field. It should be noted that the sparse_vector field type was formerly known as the rank_features field type. Although there is no functional difference between them, you should use the sparse_vector field type for clarity. Note: Elastic recently introduced a semantic_text field. It is super useful and makes semantic search easy to implement. Refer to this article for the details. You can use the semantic_text field for the same purpose, but to stay focused on the embedding part, let’s use the sparse_vector field for now. 4. Create ingest pipeline with inference processor The text information needs to be embedded into the sparse vector before it is indexed. That can be done by an ingest pipeline with an inference processor. Create the ingest pipeline with an inference processor that refers to the model you uploaded before. 5. Ingest the data with the pipeline Ingest text data into an index with the “sparse-test-pipeline” we created so that the content will be automatically embedded into the sparse vector representation. Once it is done, let’s check how it is indexed. It will return something like this: As you can see, the input text \" Elasticsearch provides a semantic search feature that allows users to query in natural language and retrieve relevant information. \" is embedded as: As you can see, the input text doesn’t directly mention most of the words listed in the response, but they look semantically related. This means the model knows that the concepts of these words are related to the input text based on the corpus the model was trained on. Therefore, the quality of these sparse embeddings depends on the ML model you configured. Search with a semantic query Now you can perform semantic search against the “sparse-test” index with the sparse_vector query. I’ve ingested some of Elastic’s blog content into the sparse-test index, so let’s test it out. The response was: As you can see, the original content is embedded into the sparse vector representation, so you can easily understand how the model determines the meanings of those texts. 
The first result doesn’t contain most words that can be found in the query text, but still, the relevance score is high because the sparse vector representations are similar. For better precision, you can also try hybrid search with RRF so that you can combine lexical and semantic search within one query. Refer to the official tutorial to learn more. Conclusion Sparse vectors provide a powerful and efficient way to enhance search capabilities in Elasticsearch. Unlike dense vectors, sparse vectors offer key advantages such as better zero-shot performance and resource efficiency . They integrate seamlessly with Elasticsearch’s machine learning capabilities and leverage Lucene’s mature and optimized inverted index , making them a practical choice for many applications. Starting with Elasticsearch 8.17, users now have greater flexibility in choosing between dense and sparse vector models based on their specific needs. Whether you're looking for interpretable representations , scalable search performance , or efficient memory usage , sparse vectors provide a compelling option for modern search applications. As Elasticsearch continues to evolve, sparse vector search is set to play an increasingly important role in the future of information retrieval. Now, you can take advantage of ELSER and also other Hugging Face models to explore new possibilities in semantic search. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to What is the sparse vector? Dense vectors Sparse vectors BM25 and sparse vector representation Lexical vs. semantic search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Understanding sparse vector embeddings with trained ML models - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/sparse-vector-embedding","meta_description":"Learn about sparse vector embeddings, understand what they do/mean, and how to implement semantic search with them."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to create custom connectors for Elasticsearch Learn how to create custom connectors for Elasticsearch to simplify your data ingestion process. Ingestion Python How To JB By: Jedr Blaszyk On October 4, 2023 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Elasticsearch has a library of ingestion tools to bring data from several sources. However, sometimes your data sources might not be compatible with Elastic’s existing ingestion tools. In this case, you may need to create a custom connector to connect your data with Elasticsearch. There are several reasons to use Elastic connectors for your apps. For example, you may want to: Bring data from custom or legacy applications to Elasticsearch Introduce a semantic search for your organizational data Extract textual content from files like PDFs, MS Office documents and more Use Kibana UI to manage your data sources (including configuration, filtering rules, setting up periodic sync schedule rules) You want to deploy Elastic connectors on your own infrastructure (some of Elastic-supported connectors are available as native connectors in the Elastic Cloud) Framework to create custom connectors If creating your own connector is the solution to your requirements, the connectors framework will help you create one. We created the framework to enable the creation of custom connectors and help users connect unique data sources to Elasticsearch. Code for connectors is available on GitHub , and we have documentation that can help you get started. The framework is designed to be simple and performant. It is meant to be developer-friendly, hence it is open-code and highly customizable. The connectors you create can be self-managed on your own infrastructure. The goal is to enable developers to integrate their own data sources very easily with Elasticsearch. What you need to know before you use connectors framework The framework is written in async-python There are several courses to learn async-python. In case you want a recommendation, we thought this LinkedIn learning course was really good, but it requires a subscription. A free alternative we liked was this one . Why did we choose async python? Ingestion is IO bound (not CPU bound) so async programming is the optimal approach when building a connector from a resource-utilization perspective. In I/O bound applications, the majority of the time is spent waiting for external resources, such as reading from files, making network requests, or querying databases. During these waiting periods, traditional synchronous code would block the entire program, leading to inefficient resource utilization. Any other pre-requisites? This is not a pre-requisite. It’s definitely worth going through the Connectors Developer's Guide before you get started! 
Hope you find this useful. Using connectors framework to create a custom connector Getting started is easy. In terminology related to the framework, we refer to a custom connector as a source. We implement a new source by creating a new class, and the responsibility of this class is to send documents to Elasticsearch from the custom data source. As an optional way to get started, users can also check out this example of a directory source . This is a good but basic example that can help you figure out how you can write a custom connector. Outline of steps Once you know which custom data source you want to create a connector for, here’s an outline of steps to write a new source : add a module or a directory in connectors/sources declare your dependencies in requirements.txt . Make sure you pin these dependencies implement a class that implements methods described in connectors.source.BaseDataSource (optional, when contributing to repo) add a unit test in connectors/sources/tests with +90% coverage declare your connector connectors/config.py in the sources section That’s it. We’re done! Now you should be able to run the connector What you need to know before writing your custom connector To enable Elasticsearch users to ingest data and build a search experience on top of that data, we provide a lightweight Connector Protocol . This protocol allows users to easily ingest data, use Enterprise Search features to manipulate that data, and create a search experience while providing them with a seamless user experience in Kibana. To be compatible with Enterprise Search and take full advantage of the connector features available in Kibana, a connector must adhere to the protocol. What you need to know about connectors protocol This documentation page provides a good overview of the protocol. Here’s what you need to know: All communication between connectors and other parts of the system happen asynchronously through an Elasticsearch index Connectors communicate their status to Elasticsearch and Kibana so that users can provide it with configuration and diagnose any issues This allows for simple, developer-friendly deployment of connectors. connectors service is stateless, and doesn’t care where your Elastic deployment runs, as long as it can connect to it over the network it works well. The service is also fault-tolerant, and it can resume operation on a different host after a restart or a failure. Once it reestablishes a connection with Elasticsearch, it will continue its normal operation. Under the hood, the protocol uses Elasticsearch indices to track connector state .elastic-connectors and .elastic-connectors-sync-jobs (described in the docs linked above) Where custom connectors are hosted The connector itself is not tied to Elasticsearch and it can be hosted in your own environment If you have an Elasticsearch deployment, regardless of whether it is self-managed or lives in Elastic Cloud: You, as a developer/company can write a customized connector for your data source Manage the connector on your own infrastructure and configure the connector service for your needs As long as the connector can discover Elasticsearch over the network it is able to index the data You, as the admin can control the connector through Kibana Example: Google Drive connector using connectors framework We wrote a simple connector for Google Drive using the connectors framework. We implemented a new source by creating a new class whose responsibility is to send documents to Elasticsearch from the targeted source. 
Note: This tutorial is compatible with Elastic stack version 8.10 . For later versions, always check the connectors release notes for updates and refer to the Github repository . We start with a GoogleDriveDataSource class with expected method signatures of BaseDataSource to configure the data source, check its availability (pinging), and retrieve documents. In order to make this connector functional we need to implement those methods. This GoogleDriveDataSource class is a starting point for writing Google Drive source. By following these steps, you will implement the logic needed to sync data with Google Drive: We need to add this file in connectors/sources Set your new connector name and service_type e.g. Google Drive as name and google_drive as service type To get your connector sync data from the source, you need to implement: get_default_configuration - This function should return a collection of RichConfigurableFields . These fields allow you to configure the connector from the Kibana UI. This includes passing authentication details, credentials, and other source-specific settings. Kibana smartly renders these configurations. For example, if you flag a field with \"sensitive\": True Kibana will mask it for security. ping - A simple call to the data source that verifies its status, think of it as a health check. get_docs - This method needs to implement the logic to actually fetch the data from the source. This function should return an async iterator that returns a tuple containing: ( document , lazy_download ), where: document - is a JSON representation of an item in the remote source. (like name, location, table, author, size, etc) lazy_download - is a coroutine to download the object/attachments for content extraction handled by the framework (like text extraction from a PDF document) There are other abstract methods in BaseDataSource class. Note that these methods don’t need to be implemented, if you only want to support content syncing (e.g. fetching all data from google drive). They refer to other connector functionalities such as: Document level security ( get_access_control , access_control_query ) Advanced filtering rules ( advanced_rules_validators ) Incremental syncs ( get_docs_incrementally ) Other functionalities may be added in the future How we approached writing the official Elasticsearch Google Drive connector Start by implementing the methods expected by the BaseDataSource class We needed to implement the methods get_default_configuration , ping and get_docs to have the connector synchronize the data. So let’s dive deeper into the implementation. The first consideration is: How to “talk” to Google Drive to get data? Google provides an official python client , but it is synchronous, so it’s likely to be slow for syncing content. We think a better option is the aiogoogle library, which offers full client functionality written in async python. This might not be intuitive at first, but it is really important to use async operations for performance. So, here in this example, we opted not to use the official google library as it doesn't support async mode. If you use synchronous or blocking code within an asynchronous framework, it can have a significant impact on performance. The core of any async framework is the event loop. The event loop allows for the concurrent execution of asynchronous tasks by continuously polling for completed tasks and scheduling new ones. 
If a blocking operation is introduced, it would halt the loop's execution, preventing it from managing other tasks. This essentially negates the concurrency benefits provided by the asynchronous architecture. The next concern is the connector authentication We authenticate the Google Drive connector as a service account . More information about authentication can be found in these connector docs pages . Service account can authenticate using keys We pass the authentication key to the service account through the Kibana UI in Elasticsearch Let’s look at the get_default_configuration implementation that allows an end user to pass a credential key that will be stored in the index for authentication during syncs: Next, let’s implement a simple ping method We will make a simple call to google drive api, e.g. /about endpoint. For this step, let's consider a simplified representation of the GoogleDriveClient . Our primary goal here is to guide you through connector creation, so we're not focusing on implementation details of the Google Drive client. However, a minimal client code is essential for the connector's operation, so we will rely on pseudo-code for the GoogleDriveClient class representation. Async iterator to return files from google drive for content extraction The next step is to write get_docs async iterator that will return the files from google drive and coroutines for downloading them for content extraction. From personal experience, it is often simpler to start implementing get_docs as a simple stand-alone python script to get this working and fetch some data. Once the get_docs code is working, we can move it to the data source class. Let’s look at api docs, we can: Use files/list endpoint to iterate over docs in drive with pagination Use files/get and files/export for downloading the files (or exporting google docs to a specific file format) So what is happening in this bit of code? list_files paginates over files in drive. prepare_files formats the file metadata to expected schema get_content is a coroutine that downloads the file and Base64 encodes its content (compatible format for content extraction) Some code details have been omitted for brevity. For a complete implementation, see the actual connector implementation on GitHub . Let’s run the connector! To integrate your custom connector into the framework, you'll need to register its implementation. Do this by adding an entry for your custom connector in the sources section in connectors/config.py . For the Google Drive example, the addition would appear as: Now in the Kibana interface: Go to Search -> Indices -> Create a new index -> Use a Connector Select Customized connector (when using a custom connector) Configure your connector. Generate the Elasticsearch API key and connector ID, and put these details in config.yml as instructed, and start your connector. At this point, your connector should be detected by Kibana! Schedule a recurring data sync or just click “Sync” to start a full sync. A connector can be configured to use Elasticsearch’s ingestion pipelines to perform transformations on data before storing it in an index. A common use case is document enrichment with machine learning . 
For example, you can: analyze text fields using a Text embedding model that will generate a dense vector representation of your data run text classification for sentiment analysis extract key information from text with Named Entitiy Recogintion (NER) Once your sync finishes, your data will be available in a search-optimized Elasticsearch index. At this point, you can dive into building search experiences or delve into analytics. Do you want to create and contribute a new connector? If you create a custom connector for a source that may help the Elasticsearch community, consider contributing it. Here are the promotion path guidelines to get a customized connector to become an Elastic-supported connector. Acceptance criteria for contributing connectors Also, before you start spending some time developing a connector, you should create an issue and reach out to get some initial feedback on the connector and what libraries it will use. Once your connector idea has some initial feedback, ensure your project meets a few acceptance criteria: add a module or a directory in connectors/sources implement a class that implements all methods described in connectors.source.BaseDataSource add a unit test in connectors/sources/tests with +90% coverage declare your connector in connectors/config.py in the sources section declare your dependencies in requirements.txt . Make sure you pin these dependencies for each dependency you are adding, including indirect dependencies, list all the licences and provide the list in your patch. make sure you use an async lib for your source. If not possible, make sure you don't block the loop when possible, provide a docker image that runs the backend service, so we can test the connector. If you can't provide a docker image, provide the credentials needed to run against an online service. the test backend needs to return more than 10k documents due to 10k being the default size limit for Elasticsearch pagination. Having more than 10k documents returned from the test backend will help test the connector Supporting tools to test your connector We also have some supporting tools that profile the connector code and run performance tests. You can find those resources here: Perf8 - Performance library and dashboard, to profile the quality of python code to assess resource utilization and detect blocking calls E-2-E functional tests that make use of perf8 library to profile each connector Wrap up We hope this blog and the example were useful for you. Here’s the complete list of available native connectors and connector clients for Elasticsearch. If you don’t find your data source listed, perhaps create a custom connector? Here are some useful resources relevant to this article: connectors GitHub repository and documentation page Async Python learning course New custom connector community guidelines Licensing details for Elastic’s connector-framework (search for Connector Framework at this link ) If you don’t have an Elastic account, you can always spin up a trial account to get started! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Framework to create custom connectors What you need to know before you use connectors framework The framework is written in async-python Why did we choose async python? Any other pre-requisites? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to create custom connectors for Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/how-to-create-customized-connectors-for-elasticsearch","meta_description":"Learn how to create custom connectors for Elasticsearch to simplify your data ingestion process."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Designing for large scale vector search with Elasticsearch Explore the cost, performance and benchmarking for running large-scale vector search in Elasticsearch, with a focus on high-fidelity dense vector search. Vector Database JF By: Jim Ferenczi On June 12, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Part 1: High-fidelity dense vector search Introduction When designing for a vector search experience, the sheer number of available options can feel overwhelming. Initially, managing a small number of vectors is straightforward, but as applications scale, this can quickly become a bottleneck. In this blog post series, we’ll explore the cost and performance of running large-scale vector search with Elasticsearch across various datasets and use cases. We begin the series with one of the largest publicly available vector datasets: the Cohere/msmarco-v2-embed-english-v3 . This dataset includes 138 million passages extracted from web pages in the MSMARCO-passage-v2 collection , embedded into 1024 dimensions using Cohere's latest embed-english-v3 model . 
For this experiment, we defined a reproducible track that you can run on your own Elastic deployment to help you benchmark your own high fidelity dense vector search experience. It is tailored for real-time search use cases where the latency of a single search request must be low (<100ms). It uses Rally , our open source tool, to benchmark across Elasticsearch versions. In this post, we use our default automatic quantization for floating point vectors . This reduces the RAM cost of running vector searches by 75% without compromising retrieval quality. We also provide insights into the impact of merging and quantization when indexing billions of dimensions. We hope this track serves as a useful baseline, especially if you don’t have vectors specific to your use case at hand. Notes on embeddings Picking the right model for your needs is outside the scope of this blog post but in the next sections we discuss different techniques to compress the original size of your vectors. Matryoshka Representation Learning (MRL) By storing the most important information in earlier dimensions, new methods like Matryoshka embeddings can shrink dimensions while keeping good search accuracy. With this technique, certain models can be halved in size and still maintain 90% of their NDCG@10 on MTEB retrieval benchmarks. However, not all models are compatible. If your chosen model isn't trained for Matryoshka reduction or if its dimensionality is already at its minimum, you'll have to manage dimensionality directly in the vector database. Fortunately, the latest models from mixedbread or OpenAI come with built-in support for MRL. For this experiment we choose to focus on a use case where the dimensionality is fixed (1024 dimensions), playing with the dimensionality of other models will be the topic for another time. Embedding quantization learning Model developers are now commonly offering models with various trade-offs to address the expense of high-dimensional vectors. Rather than solely focusing on dimensionality reduction, these models achieve compression by adjusting the precision of each dimension. Typically, embedding models are trained to generate dimensions using 32-bit floating points. However, training them to produce dimensions with reduced precision helps minimize errors. Developers usually release models optimized for well-known precisions that directly align with native types in programming languages. For example, int8 represents a signed integer ranging from -127 to 127, while uint8 denotes an unsigned integer ranging from 0 to 255. Binary, the simplest form, represents a bit (0 or 1) and corresponds to the smallest possible unit per dimension. Implementing quantization during training allows for fine-tuning the model weights to minimize the impact of compression on retrieval performance. However, delving into the specifics of training such models is beyond the scope of this blog. In the following section, we will introduce a method for applying automatic quantization if the chosen model lacks this feature. Adaptive embedding quantization In cases where models lack quantization-aware embeddings, Elasticsearch employs an adaptive quantization scheme that defaults to quantizing floating points to int8. This generic int8 quantization typically results in negligible performance loss. The benefit of this quantization lies in its adaptability to data drift . It utilizes a dynamic scheme where quantization boundaries can be recalculated from time to time to accommodate any shifts in the data. 
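To give a feel for what the automatic quantization looks like in practice, here is a sketch of creating an index with a mapping along the lines of the one the benchmark uses below: a 1024-dimensional dense vector indexed with HNSW plus scalar int8 quantization, and a keyword doc_id. The connection details and index/field names are placeholders, and option names can vary slightly between 8.x versions.

```python
from elasticsearch import Elasticsearch

# Hypothetical connection details; adjust for your own deployment.
client = Elasticsearch("https://localhost:9200", api_key="...")

client.indices.create(
    index="msmarco-v2-passages",
    mappings={
        "properties": {
            "doc_id": {"type": "keyword"},
            "emb": {
                "type": "dense_vector",
                "dims": 1024,
                "index": True,
                # dot_product assumes normalized vectors; use cosine otherwise.
                "similarity": "dot_product",
                # Scalar-quantized HNSW: vectors are stored as int8 in the graph,
                # the original floats are kept on disk for optional rescoring.
                "index_options": {"type": "int8_hnsw"},
            },
        }
    },
)
```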
Large scale benchmark Back-of-the-envelope estimation With 138.3 million documents and 1024-dimensional vectors, the raw size of the MSMARCO-v2 dataset to store the original float vectors exceeds 520GB. Using brute force to search the entire dataset would take hours on a single node. Fortunately, Elasticsearch offers a data structure called HNSW (Hierarchical Navigable Small World Graph), designed to accelerate nearest neighbor search . This structure allows for fast approximate nearest neighbor searches but requires every vector to be in memory. Loading these vectors from disk is prohibitively expensive, so we must ensure the system has enough memory to keep them all in memory. With 1024 dimensions at 4 bytes each, each vector requires 4 kilobytes of memory. Additionally, we need to account for the memory required to load the Hierarchical Navigable Small World (HNSW) graph into memory. With the default setting of 32 neighbors per node in the graph, an extra 128 bytes (4 bytes per neighbor) of memory per vector is necessary to store the graph, which is equivalent to approximately 3% of the memory cost of storing the vector dimensions. Ensuring sufficient memory to accommodate these requirements is crucial for optimal performance. On Elastic Cloud , our vector search-optimized profile reserves 25% of the total node memory for the JVM (Java Virtual Machine), leaving 75% of the memory on each data node available for the system page cache where vectors are loaded. For a node with 60GB of RAM, this equates to 45GB of page cache available for vectors. The vector search optimized profile is available on all Cloud Solution Providers (CSP) AWS, Azure and GCP . To accommodate the 520GB of memory required, we would need 12 nodes, each with 60GB of RAM, totaling 720GB. At the time of this blog this setup can be deployed in our Cloud environment for a total cost of $14.44 per hour on AWS: (please note that the price will vary for Azure and GCP environments): By leveraging auto-quantization to bytes, we can reduce the memory requirement to 130gb, which is just a quarter of the original size. Applying the same 25/75 memory allocation rule, we can allocate a total of 180 gb of memory on Elastic Cloud. At the time of this blog this optimized setup results in a total cost of $3.60 per hour on Elastic Cloud (please note that the price will vary for Azure and GCP environments): Start a free trial on Elastic Cloud and simply select the new Vector Search optimized profile to get started. In this post, we'll explore this cost-effective quantization using the benchmark we created to experiment with large-scale vector search performance. By doing so, we aim to demonstrate how you can achieve significant cost savings while maintaining high search accuracy and efficiency. Benchmark configuration The msmarco-v2-vector rally track defines the default mapping that will be used. It includes one dense vector field with 1024 dimensions, indexed with auto int8 quantization, and a doc_id field of type keyword to uniquely identify each passage. For this experiment, we tested with two configurations: Default : This serves as the baseline, using the track on Elasticsearch with default options. Aggressive Merge : This configuration provides a comparison point with different trade-offs. As previously explained , each shard in Elasticsearch is composed of segments. A segment is an immutable division of data that contains the necessary structures to directly lookup and search the data. 
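Before getting into merging, the sizing estimate above is easy to reproduce with a few lines of arithmetic (all figures are approximate):

```python
# Rough reproduction of the back-of-the-envelope estimate above.
docs = 138.3e6           # passages in the MSMARCO-v2 dataset
dims = 1024
GiB = 1024 ** 3

raw_vectors = docs * dims * 4 / GiB     # float32: 4 bytes per dimension
hnsw_graph = docs * 32 * 4 / GiB        # 32 neighbours * 4 bytes per vector
int8_vectors = docs * dims * 1 / GiB    # 1 byte per dimension after quantization

print(f"float32 vectors : {raw_vectors:6.0f} GiB")    # ~528 GiB ("exceeds 520GB")
print(f"HNSW graph      : {hnsw_graph:6.0f} GiB")     # ~16 GiB, roughly 3% overhead
print(f"int8 vectors    : {int8_vectors:6.0f} GiB")   # ~132 GiB, about a quarter

# With the vector-search-optimized profile, ~75% of each node's RAM is left to
# the page cache, so total cluster memory must be at least cache_needed / 0.75.
print(f"RAM for float32 : {raw_vectors / 0.75:6.0f} GiB")   # ~704 GiB -> 12 x 60GB nodes
print(f"RAM for int8    : {int8_vectors / 0.75:6.0f} GiB")  # ~176 GiB -> the 180GB setup
```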
Document indexing involves creating segments in memory, which are periodically flushed to disk. To manage the number of segments, a background process merges segments to keep the total number under a certain budget. This merge strategy is crucial for vector search since HNSW graphs are independent within each segment. Each dense vector field search involves finding the nearest neighbors in every segment, making the total cost dependent on the number of segments. By default, Elasticsearch merges segments of approximately equal size, adhering to a tiered strategy controlled by the number of segments allowed per tier. The default value for this setting is 10, meaning each level should have no more than 10 segments of similar size. For example, if the first level contains segments of 50MB, the second level will have segments of 500MB, the third level 5GB, and so on. The aggressive merge configuration adjusts the default settings to be more assertive: It sets the segments per tier to 5, enabling more aggressive merges. It increases the maximum merged segment size from 5GB to 25GB to maximize the number of vectors in a single segment. It sets the floor segment size to 1GB, artificially starting the first level at 1GB. With this configuration, we expect faster searches at the expense of slower indexing. For this experiment, we kept the default settings for m , ef_construction , and confidence_interval options of the HNSW graph in both configurations. Experimenting with these indexing parameters will be the subject of a separate blog. In this first part, we chose to focus on varying the merge and search parameters. When running benchmarks, it's crucial to separate the load driver, which is responsible for sending documents and queries, from the evaluated system (Elasticsearch deployment). Loading and querying hundreds of millions of dense vectors require additional resources that would interfere with the searching and indexing capabilities of the evaluated system if run together. To minimize latency between the system and the load driver, it's recommended to run the load driver in the same region of the Cloud provider as the Elastic deployment, ideally in the same availability zone. For this benchmark, we provisioned an im4gn.4xlarge node on AWS with 16 CPUs, 64GB of memory, and 7.5TB of disk in the same region as the Elastic deployment. This node is responsible for sending queries and documents to Elasticsearch. By isolating the load driver in this manner, we ensure accurate measurement of Elasticsearch's performance without the interference of additional resource demands. We ran the entire benchmarks with the following configuration: The initial_indexing_bulk_indexing_clients value of 12 indicates that we will ingest data from the load driver using 12 clients. With a total of 23.9 vCPUs in the Elasticsearch data nodes, using more clients to send data increases parallelism and enables us to fully utilize all available resources in the deployment. For search operations, the standalone_search_clients and parallel_indexing_search_clients values of 8 mean that we will use 8 clients to query Elasticsearch in parallel from the load driver. The optimal number of clients depends on multiple factors; in this experiment, we selected the number of clients to maximize CPU usage across all Elasticsearch data nodes. To compare the results, we ran a second benchmark on the same deployment, but this time we set the parameter aggressive_merge to true. 
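In terms of index settings, the aggressive merge configuration corresponds roughly to the following three knobs. In the benchmark they are applied by the Rally track rather than by hand, so treat this Python-client sketch purely as a way to make the changes explicit; the index name and connection details are placeholders.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

# Index settings roughly matching the "aggressive merge" configuration described above.
client.indices.put_settings(
    index="msmarco-v2-passages",
    settings={
        "index.merge.policy.segments_per_tier": 5,        # default is 10
        "index.merge.policy.max_merged_segment": "25gb",  # default is 5gb
        "index.merge.policy.floor_segment": "1gb",        # start the first tier at 1GB
    },
)
```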
This effectively changes the merge strategy to be more aggressive, allowing us to evaluate the impact of this configuration on search performance and indexing speed. Indexing performance In Rally, a challenge is configured with a list of scheduled operations to execute and report. Each operation is responsible for performing an action against the cluster and reporting the results. For our new track, we defined the first operation as initial-documents-indexing , which involves bulk indexing the entire corpus. This is followed by wait-until-merges-finish-after-index , which waits for background merges to complete at the end of the bulk loading process. This operation does not use force merge; it simply waits for the natural merge process to finish before starting the search evaluation. Below, we report the results of these operations of the track , they correspond to the initial loading of the dataset in Elasticsearch. The search operations are reported in the next section. With Elasticsearch 8.14.0, the initial indexing of the 138M vectors took less than 5 hours, achieving an average rate of 8,000 documents per second. Please note that the bottleneck is typically the generation of the embeddings, which is not reported here. Waiting for the merges to finish at the end added only 2 extra minutes: Total Indexing performance (8.14.0 default int8 HNSW configuration) For comparison, the same experiment conducted on Elasticsearch 8.13.4 required almost 6 hours for ingestion and an additional 2 hours to wait for merges: Total Indexing performance (8.13.4 default int8 HNSW configuration) Elasticsearch 8.14.0 marks the first release to leverage native code for vector search . A native Elasticsearch codec is employed during merges to accelerate similarities between int8 vectors, leading to a significant reduction in overall indexing time. We're currently exploring further optimizations by utilizing this custom codec for searches, so stay tuned for updates! The aggressive merge run completed in less than 6 hours, averaging 7,000 documents per second. However, it required nearly an hour to wait for merges to finish at the end. This represents a 40% decrease in speed compared to the run with the default merge strategy: Total Indexing performance (8.14.0 aggressive merge int8 HNSW configuration) This additional work performed by the aggressive merge configuration can be summarized in the two charts below. The aggressive merge configuration merges 2.7 times more documents to create larger and fewer segments. The default merge configuration reports nearly 300 million documents merged from the 138 million documents indexed. This means each document is merged an average of 2.2 times. Total number of merged documents per node (8.14.0 default int8 HNSW configuration) Total number of merged documents per node (8.14.0 aggressive merge int8 HNSW configuration) In the next section we’ll analyze the impact of these configurations on the search performance. Search evaluation For search operations, we aim to capture two key metrics: the maximum query throughput and the level of accuracy for approximate nearest neighbor searches. To achieve this, the standalone-search-knn-* operations evaluate the maximum search throughput using various combinations of approximate search parameters. This operation involves executing 10,000 queries from the training set using parallel_indexing_search_clients in parallel as rapidly as possible. 
These operations are designed to utilize all available CPUs on the node and are performed after all indexing and merging tasks are complete. To assess the accuracy of each combination, the knn-recall-* operations compute the associated recall and Normalized Discounted Cumulative Gain (nDCG). The nDCG is calculated from the 76 queries published in msmarco-passage-v2/trec-dl-2022/judged , using the 386,000 qrels annotations. All nDCG values range from 0.0 to 1.0, with 1.0 indicating a perfect ranking. Due to the size of the dataset, generating ground truth results to compute recall is extremely costly. Therefore, we limit the recall report to the 76 queries in the test set, for which we computed the ground truth results offline using brute force methods. The search configuration consists of three parameters: k : The number of passages to return. num_candidates : The size of the queue used to limit the search on the nearest neighbor graph. num_rescore : The number of passages to rescore using the full fidelity vectors. Using automatic quantization, rescoring slightly more than k vectors with the original float vectors can significantly boost recall. The operations are named according to these three parameters. For example, knn-10-100-20 means k=10, num_candidates=100, and num_rescore=20 . If the last number is omitted, as in knn-10-100 , then num_rescore defaults to 0. See the track.py file for more information on how we create the search requests. The chart below illustrates the expected Queries Per Second (QPS) at different recall levels. For instance, the default configuration (the orange series) can achieve 50 QPS with an expected recall of 0.922. Recall versus Queries Per Second (Elasticsearch 8.14.0) The aggressive merge configuration is 2 to 3 times more efficient for the same level of recall. This efficiency is expected since the search is conducted on larger and fewer segments as demonstrated in the previous section. The full results for the default configuration are presented in the table below: Queries per second, latencies (in milliseconds), recall and NDCG@10 with different parameters combination (8.14 default int8 HNSW configuration) The %best column represents the difference between the actual NDCG@10 for this configuration and the best possible NDCG@10, determined using the ground truth nearest neighbors computed offline with brute force. For instance, we observe that the knn-10-20-20 configuration, despite having a recall@10 of 67.4%, achieves 90% of the best possible NDCG for this dataset. Note that this is just a point result and results may vary with other models and/or datasets. The table below shows the full results for the aggressive merge configuration: Queries per second, latencies (in milliseconds), recall and NDCG@10 with different parameters combination (8.14 aggressive merge int8 HNSW configuration) Using the knn-10-500-20 search configuration, the aggressive merge setup can achieve > 90% recall at 150 QPS. Conclusion In this post, we described a new rally track designed to benchmark large-scale vector search on Elasticsearch. We explored various trade-offs involved in running an approximate nearest neighbor search and demonstrated how in Elasticsearch 8.14 we've reduced the cost by 75% while increasing index speed by 50% for a realistic large scale vector search workload. Our ongoing efforts focus on optimization and identifying opportunities to enhance our vector search capabilities. 
Stay tuned for the next installment of this series, where we will delve deeper into the cost and efficiency of vector search use cases, specifically examining the potential of int4 and binary compression techniques. By continually refining our approach and releasing tools for testing performance at scale, we aim to push the boundaries of what is possible with Elasticsearch, ensuring it remains a powerful and cost-effective solution for large-scale vector search. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Part 1: High-fidelity dense vector search Introduction Notes on embeddings Matryoshka Representation Learning (MRL) Embedding quantization learning Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Designing for large scale vector search with Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-vector-large-scale-part1","meta_description":"Explore the cost, performance and benchmarking for running large-scale vector search in Elasticsearch, with a focus on high-fidelity dense vector search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to migrate your Ruby app from OpenSearch to Elasticsearch A guide to migrate a Ruby codebase from the OpenSearch client to the Elasticsearch client. Integrations Ruby How To FB By: Fernando Briano On December 13, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. 
Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The OpenSearch Ruby client was forked from the Elasticsearch Ruby Client in version 7.x , so the codebases are relatively similar. This means when migrating a Ruby codebase from OpenSearch to Elasticsearch, the code from the respective client libraries will look very familiar. In this blog post, I'm going to show an example Ruby app that uses OpenSearch and the steps to migrate this code to Elasticsearch. Both clients are released under the popular Apache License 2.0, so they're open source and free software. Elasticsearch's license was recently updated and the core of Elasticsearch and Kibana are published under the OSI approved Open Source license AGPL since version 8.16. Considering Elasticsearch version when migrating Ruby app One consideration when migrating is which version of Elasticsearch is going to be used. We recommend using the latest stable release, which at the time of writing this is 8.17.0 . The Elasticsearch Ruby Client minor versions follow the Elasticsearch minor versions. So for Elasticsearch 8.17.x , you can use version 8.17.x of the Ruby gem. OpenSearch was forked from Elasticsearch 7.10.2. So the APIs may have changed and different features could be used on either. But that's out of scope for this post, and I'm only going to look into the most common operations in an example app. For Ruby on Rails, you can use the official Elasticsearch client, or the Rails integration libraries . We recommend migrating to the latest stable version of Elasticsearch and client respectively. The elasticsearch-rails gem version 8.0.0 support Rails 6.1 , 7.0 and 7.1 and Elasticsearch 8.x . The code For this example, I followed the steps to install OpenSearch from a tarball . After downloading and extracting the tarball, I needed to set an initial admin password which I'm going to use later to instantiate the client. I created a directory with a Gemfile that looks like this: After running bundle install , the gem is installed for my project. This installed opensearch-ruby version 3.4.0 and the version of OpenSearch I'm running is 2.18.0 . I wrote the code in an example_code.rb file in the same directory. The initial code in this file is the instantiation of an OpenSearch client: The transport option ssl: { verify: false} parameter is being passed as per the user guide to make things easier for testing. In production, this should be set up depending on the deployment of OpenSearch. Since version 2.12.0 of OpenSearch, the OPENSEARCH_INITIAL_ADMIN_PASSWORD environment variable must be set to a strong password when running the install script. Following the steps to install OpenSearch from a tarball, I exported the variable in my console and now it's available for my Ruby script. A simple API to make sure the client is connecting to OpenSearch is using the cluster.health API: And indeed it works: I tested some of the common examples we have on the Elasticsearch Ruby client documentation, and they work as expected: Migrating Ruby app to Elasticsearch The first step is to add elasticsearch-ruby in the Gemfile. After running bundle install , the Elasticsearch Ruby client gem will be installed. If you want to test your code before fully migrating, you can initially leave the opensearch-ruby gem there. The next important step is going to be the client instantiation. 
This is going to depend on how you're running Elasticsearch. To keep a similar approach for these examples, I am following the steps in Download Elasticsearch and running it locally. When running bin/elasticsearch , Elasticsearch will start with security features automatically configured. Make sure you copy the password for the elastic user (but you can reset it by running bin/elasticsearch-reset-password -u elastic ). If you're following this example, make sure you stop OpenSearch before starting Elasticsearch, since they run on the same port. At the beginning of example_code.rb , I commented out the OpenSearch client instantiation and added the instantiation for an Elasticsearch client: As you can see, the code is almost identical in this testing scenario. It will differ according to the deployment of Elasticsearch and how you decide to connect and authenticate with it. The same applies here as in OpenSearch regarding security, the option to not verify ssl is just for testing purposes and should not be used in production. Once the client is set up, I run the code again with: bundle exec ruby example_code.rb . And everything just works! Debugging migration issues Depending on the APIs your application is using, there is a possibility that you receive an error when running your code against Elasticsearch if the APIs from OpenSearch diverge. The REST APIs documentation is an essential reference for detailed information on how to use the APIs. Make sure to check the documentation for the version of Elasticsearch that you're using. You can also refer to the Elasticsearch::API reference. Some errors you may encounter from Elasticsearch could be: ArgumentError: Required argument '' missing - This is a Client error and it will be raised when a request is missing a required parameter. Elastic::Transport::Transport::Errors::BadRequest: [400] {\"error\":{\"root_cause\":[{\"type\":\"illegal_argument_exception\",\"reason\":\"request [/example/_doc] contains unrecognized parameter: [test]\"}]... This error comes from Elasticsearch and it means the client code is using a parameter that Elasticsearch doesn't recognize for the API being used. The Elasticsearch client will raise errors from Elasticsearch with the detailed error message sent by the server. So for unsupported parameters or endpoints even, the error should inform you what is different. Conclusion As we demonstrated with this example code, the migration of a Ruby app from OpenSearch to Elasticsearch is not too complex from the Ruby side of things. You need to be aware of the versioning and any potential divergent APIs between the search engines. But for the most common actions, the main change when migrating clients is in the instantiation. They're both similar in that respect, but the way the host and credentials are defined varies in relation to how the Stack is being deployed. Once the client is set up, and you verify it's connecting to Elasticsearch, you can replace the OpenSearch client seamlessly with the Elasticsearch client. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Considering Elasticsearch version when migrating Ruby app The code Migrating Ruby app to Elasticsearch Debugging migration issues Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to migrate your Ruby app from OpenSearch to Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ruby-opensearch-elasticsearch-migration","meta_description":"Learn how to migrate your Ruby app from OpenSearch to Elasticsearch. This blog includes step-by-step instructions, debugging tips, and best practices."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving text expansion performance using token pruning Learn about token pruning and how it boosts the performance of text expansion queries by making them more efficient without sacrificing recall. Vector Database How To KD By: Kathleen DeRusso On April 2, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This blog talks about token pruning, an exciting enhancement to ELSER performance released with Elasticsearch 8.13.0! The strategy behind token pruning We've already talked in great detail about lexical and semantic search in Elasticsearch and text similarity search with vector fields . These articles offer great, in-depth explanations of how vector search works. We've also talked in the past about reducing retrieval costs by optimizing retrieval with ELSER v2 . While Elasticsearch is limited to 512 tokens per inference field ELSER can still produce a large number of unique tokens for multi-term queries. This results in a very large disjunction query, and will return many more documents than an individual keyword search would - in fact, queries with a large number of resulting queries may match most or all of the documents in an index! 
Now, let's take a more detailed look into an example using ELSER v2. Using the infer API we can view the predicted values for the phrase \"Is Pluto a planet?\" This returns the following inference results: These are the inference results that would be sent as input into a text expansion search. When we run a text expansion query, these terms eventually get joined together in one large weighted boolean query, such as: Speed it up by removing tokens Given the large number of tokens produced by ELSER text expansion, the quickest way to realize a performance improvement is to reduce the number of tokens that make it into that final boolean query. This reduces the total work that Elasticsearch invests when performing the search. We can do this by identifying non-significant tokens produced by the text expansion and removing them from the final query. Non-significant tokens can be defined as tokens that meet both of the following criteria: The weight/score is so low that the token is likely not very relevant to the original term The token appears much more frequently than most tokens, indicating that it is a very common word and may not benefit the overall search results much. We started with some default rules to identify non-significant tokens, based on internal experimentation using ELSER v2: Frequency : More than 5x more frequent than the average token frequency for all tokens in that field Score : Less than 40% of the best scoring token Missing : If we see documents with a frequency of 0, that means that it never shows up at all and can be safely pruned If you're using text expansion with a model other than ELSER, you may need to adjust these values in order to return optimal results. Both the token frequency threshold and weight threshold must show the token is non-significant in order for the token to be pruned. This lets us ensure we keep frequent tokens that are very high scoring or very infrequent tokens that may not have as high of a score. Performance improvements with token pruning We benchmarked these changes using the MS Marco Passage Ranking benchmark . Through this benchmarking, we observed that enabling token pruning with the default values described above resulted in a 3-4x improvement in 99th pctile latency and above! Relevance impact of token pruning Once we measured a real performance improvement, we wanted to validate that relevance was still reasonable. We used a small dataset against the MS Marco passage ranking dataset. We did observe an impact on relevance when pruning the tokens; however, when we added the pruned tokens back in a rescore block the relevance was close to the original non-pruned results with only a marginal increase in latency. The rescore, adding in the tokens that were previously pruned, queries the pruned tokens only against the documents that were returned from the previous query. Then it updates the score including the dimensions that were previously left behind. Using a sample of 44 queries with judgments against the MS Marco Passage Ranking dataset: Top K Rescore Window Size Avg rescored recall vs control Control NDCG@K Pruned NDCG@K Rescored NDCG@K 10 10 0.956 0.653 0.657 0.657 10 100 1 0.653 0.657 0.653 10 1000 1 0.653 0.657 0.653 100 100 0.953 0.51 0.372 0.514 100 1000 1 0.51 0.372 0.51 Now, this is only one dataset - but it's encouraging to see this even at smaller scale! How to use: Pruning configuration Pruning configuration will launch in our next release as an experimental feature. 
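To preview what that configuration looks like, here is a sketch of a text_expansion query that spells out the default thresholds described above: the 5x frequency ratio and the 40% weight cut-off. The index name and the ml.tokens field are placeholders, and you should double-check the parameter names against the documentation for your version.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

# Sketch of a text_expansion query with an explicit pruning configuration.
response = client.search(
    index="my-index",
    query={
        "text_expansion": {
            "ml.tokens": {  # field holding the ELSER token/weight pairs
                "model_id": ".elser_model_2",
                "model_text": "Is Pluto a planet?",
                "pruning_config": {
                    # Prune tokens more than 5x as frequent as the field average...
                    "tokens_freq_ratio_threshold": 5,
                    # ...whose weight is also below 40% of the best-scoring token.
                    "tokens_weight_threshold": 0.4,
                    "only_score_pruned_tokens": False,
                },
            }
        }
    },
)
print(response["hits"]["total"])
```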
It's an optional, opt-in feature so if you perform text expansion queries without specifying pruning, there will be no change to how text expansion queries are formulated - and no change in performance. We have some examples of how to use the new pruning configuration in our text expansion query documentation . Here's an example text expansion query with both the pruning configuration and rescore: Note that the rescore query sets only_score_pruned_tokens to false, so it only adds those tokens that were originally pruned back into the rescore algorithm. This feature was released as a technical preview feature in 8.13.0. You can try it out in Cloud today! Be sure to head over to our discuss forums and let us know what you think. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to The strategy behind token pruning Speed it up by removing tokens Performance improvements with token pruning Relevance impact of token pruning How to use: Pruning configuration Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Improving text expansion performance using token pruning - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/text-expansion-pruning","meta_description":"Learn about token pruning and how it boosts the performance of text expansion queries by making them more efficient without sacrificing recall."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using hybrid search for gopher hunting with Elasticsearch and Go Learn how to achieve hybrid search by combining keyword and vector search using Elasticsearch and the Elasticsearch Go client. 
Vector Database How To CR LS By: Carly Richmond and Laurent Saint-Félix On November 2, 2023 Part of Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. In the previous parts of this series, it was demonstrated how to use the Elasticsearch Go client for traditional keyword search and vector search . This third part covers hybrid search. We'll share examples of how you can combine both vector search and keyword search using Elasticsearch and the Elasticsearch Go client . Prerequisites Just like part one in this series, the following prerequisites are required for this example: Installation of Go version 1.21 or later Create your own Go repo using the recommended structure and package management covered in the Go documentation Creating your own Elasticsearch cluster, populated with a set of rodent-based pages , including for our friendly Gopher , from Wikipedia: Connecting to Elasticsearch As a reminder, in our examples, we will make use of the Typed API offered by the Go client. Establishing a secure connection for any query requires configuring the client using either: Cloud ID and API key if making use of Elastic Cloud Cluster URL, username, password and the certificate Connecting to our cluster located on Elastic Cloud would look like this: The client connection can then be used for searching, as demonstrated in the subsequent sections. Manual boosting for hybrid search When combining any set of search algorithms, the traditional approach has been to manually configure constants to boost each query type. Specifically, a factor is specified for each query, and the combined results set is compared to the expected set to determine the recall of the query. Then we repeat for several sets of factors and pick the one closest to our desired state. For example, combining a single text search query boosted by a factor of 0.8 with a knn query with a lower factor of 0.2 can be done by specifying the Boost field in both query types, as shown in the below example: The factor specified in the Boost option for each query is added to the document score. By increasing the score of our match query by a larger factor than the knn query, results from the keyword query are more heavily weighted. The challenge of manual boosting, particularly if you're not a search expert, is that it requires tuning to figure out the factors that will lead to the desired result set. It's simply a case of trying out random values to see what gets you closer to your desired result set. Reciprocal Rank Fusion in hybrid search & Go client Reciprocal Rank Fusion , or RRF, was released under technical preview for hybrid search in Elasticsearch 8.9. It aims to reduce the learning curve associated with tuning and reduce the amount of time experimenting with factors to optimize the result set. With RRF, the document score is recalculated by blending the scores by the below algorithm: The advantage of using RRF is that we can make use of the sensible default values within Elasticsearch. The ranking constant k defaults to 60 . To provide a tradeoff between the relevancy of returned documents and the query performance when searching over large data sets, the size of the result set for each considered query is limited to the value of window_size , which defaults to 100 as outlined in the documentation . 
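The blending formula referred to above (the original snippet didn't survive the page extraction) is the standard reciprocal rank fusion score: each document's final score is the sum, over every query that returned it, of the reciprocal of its rank in that query's result list, offset by the ranking constant k:

$$ \text{score}(d) \;=\; \sum_{q \,\in\, \text{queries}} \frac{1}{k + \text{rank}_q(d)} $$

With the default k of 60, a document ranked first by one query contributes 1/61 from that query; a document that falls outside a query's top window_size results simply contributes nothing for that query.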
k and windows_size can also be configured within the Rrf configuration within the Rank method in the Go client, as per the below example: Conclusion Here we've discussed how to combine vector and keyword search in Elasticsearch using the Elasticsearch Go client . Check out the GitHub repo for all the code in this series. If you haven't already, check out part 1 and part 2 for all the code in this series. Happy gopher hunting! Resources Elasticsearch Guide Elasticsearch Go client What is vector search? | Elastic Reciprocal Rank Fusion Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Connecting to Elasticsearch Manual boosting for hybrid search Reciprocal Rank Fusion in hybrid search & Go client Conclusion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using hybrid search for gopher hunting with Elasticsearch and Go - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/hybrid-search-with-the-elasticsearch-go-client","meta_description":"Learn how to achieve hybrid search by combining keyword and vector search using Elasticsearch and the Elasticsearch Go client."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Spotify Wrapped part 2: Diving deeper into the data We will dive deeper into your Spotify data than ever before and explore connections you didn't even know existed. How To PK By: Philipp Kahr On February 25, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! 
Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part of this series, written by Iulia Feroli, we talked about how to get your Spotify Wrapped data and visualize it in Kibana. In part 2, we're diving deeper into the data to see what else we can find out. To do this we're going to leverage a bit of a different approach and use Spotify to Elasticsearch to index the data into Elasticsearch. This tool is a bit more advanced and requires a bit more setup, but it is worth it. The data is more structured and we can ask more complex questions. What is different from the first approach? In the first blog we used the Spotify export directly and didn't perform any normalisation tasks, or any other data processing. This time we will use the same data but we will perform some data processing to make the data more usable. This will allow us to answer much more complex questions such as: What is the average duration of a song in my top 100? What is the average popularity of a song in my top 100? What is the median listening duration to a song? What is my most skipped tracked? When do I like to skip tracks? Am I listening to a particular hour of the day more than others? Am I listening to a particular day of the week more than others? Is a month of particular interest? What is the artist with the longest listening time? Spotify wrapped is a fun experience every year showing you what you listened to this year. It does not give you year over the year changes and thus you might miss some artists that were once in your top 10, but now have vanished. Data processing There is a large difference in the way we process the data in the first and the second post. If you want to keep working with the data from the first post, you will need to account for some field name changes, as well as need to revert to ES|QL to do certain extractions like hour of day on the fly. Nonetheless, you all should be able to follow this post. The data processing is done in the Spotify to Elasticsearch repository involves asking the Spotify API for the duration of the song, popularity and also renames and enhances some fields. For example the artist field in the Spotify export itself is just a String and does not represent features or multi-artist tracks Dashboarding I created a dashboard in Kibana to visualize the data. The dashboard is available here and you can import it into your Kibana instance. The dashboard is quite extensive and answers many of the above questions. Let's get into some of the questions and how to answer them together! What is the average duration of a song in my top 100? To answer this question we can use Lens, or ES|QL. Let's explore all three options. Let's phrase this question correctly in an Elasticsearch manner. We want to find the top 100 songs and then calculate the average duration of all of those songs combined. In Elasticsearch terms that would be two aggregations: Figure out the top 100 songs Calculate the average duration of those 100 songs. Lens In Lens this is rather simple, create a new Lens, switch to a table and drag and drop the title field into the table. Then click on the title field and set the size to 100, as well as set accuracy mode. Then drag and drop the duration field into the table and use last value , because we really only need the last value of each of the songs duration. 
The same song will only have one duration. In the bottom of this last value aggregation is a dropdown for a summary row, select average and it will show it to you. ES|QL ES|QL is a pretty fresh language compared to DSL & aggregations, but it is very powerful and easy to use. To answer the same question in ES|QL you would write the following query: Let me take you step through step of this ES|QL query: from spotify-history - This is the index pattern we are using. stats duration=max(duration), count=count() by title - This is the first aggregation, we are calculating the maximum duration of each song and the count of each song. We use max instead of last value as used in the Lens, that is because ES|QL right now does not have a first or last. sort count desc - We sort the songs by the count of each song, so the most listened to song is on top. limit 100 - We limit the result to the top 100 songs. stats Average duration of the songs=avg(duration) - We calculate the average duration of the songs. Is a month of particular interest to me? To answer this question we can use Lens with the help of runtime field and ES|QL. What do we notice straight away, that there is no field in the data that denotes the month directly, instead we need to calculate it from the @timestamp field. There are multiple ways to do this: Use a runtime field, to power the Lens ES|QL I personally think that ES|QL is the neater and quicker solution. That's it, nothing fancy needed to do, we can leverage the DATE_EXTRACT function to extract the month from the @timestamp field and then we can aggregate on it. Using the ES|QL visualisation we can drop that onto the dashboard. What is my listening duration per artist per year? The idea behind that is to see if an artist is just a one-time thing or if there is a reoccurrence. If I remember correctly, Spotify only shows you the top 5 artists in the yearly wrapped. Maybe your number 6 artist stays the same all the time, or they heavily change after the 10th position? One of the simplest representation of this is a percentage bar chart. We can use Lens for this. Follow the steps along: Drag and drop the listened_to_ms field. This field represents how long you listened to a song in milliseconds. Now per default Lens will create a median aggregation, we do not want that, alter that to a sum . In the top select percentage instead of stacked for the bar chart type. For the breakdown select artist and say top 10. In the Advanced dropdown don't forget to select accuracy mode . Now every color block represents how much you listened to this single artist. Depending on your timepicker the bars might represent values from days, to weeks, to months, to years. If you want a weekly breakdown, select the @timestamp and set the mininum interval to year . Now what we can tell in my case is that Fred Again.. is the artist I listened to most, nearly 12% of my total listening time was consumed by Fred Again.. . We also see that Fred Again.. dropped a bit in 2024, but Jamie XX grew largely. If we compare just the size of the bars. We can also tell that whilst Billie Eilish is constantly being played in 2024 the bar widthend. This means that I listened to Billie Eilish more in 2024 than in 2023. What about the top tracks per artist per listening time versus overall listening time? That's a mouthfull of a question. Let me try to explain what I want to say with that. Spotify tells you about the top song from a single artist, or your overall 5 top songs. 
Well, that's definitely interesting, but what about the breakdown of an artist? Is all my time consumed just by a single song that I play over and over again, or is that evenly distributed? Create a new lens and select Treemap as type. For the metric , same as before: select sum and use listened_to_ms as the field. For the group by we need two values. The first one is artist and then add a second one with title . The intermediate result looks like this: Let's change that to top 100 artists and deselect the other in the advanced dropdown, as well as enable accuracy mode. For title change that to top 10 and enable accuracy mode. The final result looks like this: What does this tell us now exactly? Without looking at any time component, we can tell that over all my listening history with Spotify, I spent 5.67% listening to Fred Again.. . In particularly I spent 1.21% of that time, listening to Delilah (pull me out of this) . It is interesting to see, if there is a single song that occupies an artist, or if there are other songs as well. The treemap itself is a nice form to represent such data distributions. Do I listen on a particular hour and day? Well, that we can answer super simple with a Lens visualisation leveraging the Heat Map . Create a new Lens, select Heat Map . For the Horizontal Axis select dayOfWeek field and set it to Top 7 instead of Top 3. For the Vertical Axis select the hourOfDay and for Cell Value just a simple Count of records . Now this will produce this panel: There are a couple of annoying things around this Lens, that just disturb me when interpreting. Let's try and clean it up a bit. First of all, I don't care about the legend too much, use the symbol in the top with the triangle, square, circle and disable it. Now the 2nd part that is annoying is the sorting of the days. It's Monday, Wednesday, Thursday, or anything else, depending on the values you have. The hourOfDay is correctly sorted. The way to sort the days is a funny hack and that is called to use Filters instead of Top Values . Click on dayOfWeek and select Filters , it should now look like this: Now just start typing the days. One filter for each day. \"dayOfWeek\" : Monday and give it the label Monday and rinse and repeat. One caveat in all of this though is, that Spotify provides the data in UTC+0 without any timezone information. Sure, they also provide the IP address and the country where you listened to and we could infere the timezone information from that, but that can be wonky and for countries like the U.S. that has multiple timezones, it can be too much of a hassle. This is important because Elasticsearch and Kibana have timezone support and by providing the correct timezone in the @timestamp field, Kibana would automatically adjust the time to your browser time. It should look like this when finalized, and we can tell that I am a very active listener during working hours and less so on Saturdays and Sundays. Conclusion In this blog we dove a bit deeper into the intricacies that the Spotify data offers. We showed a few simple and quick ways to get some visualizations up and running. It is simply amazing to have this much control over your own listening history. Stay tuned for more follow up blog posts! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is different from the first approach? Data processing Dashboarding What is the average duration of a song in my top 100? Is a month of particular interest to me? Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Spotify Wrapped part 2: Diving deeper into the data - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/spotify-wrapped-part-02","meta_description":"We will dive deeper into your Spotify data than ever before and explore connections you didn't even know existed."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Detecting relationships in data: Spotify Wrapped, part 4 Graphs are a powerful tool for detecting relationships in data. In this blog, we'll explore the relationships between artists and your music taste. How To PK By: Philipp Kahr On April 1, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part , we talked about how to get your Spotify Wrapped data and how to visualize it. In the second part , we talked about how to process the data and how to visualize it. In the third part , we explored anomaly detection and how it helps us find interesting listening behaviour. In this part, we will talk about how to find relationships in your music. What is a relationship? In the context of music, a relationship can be many things. It can be the relationship between artists, genres, songs, or even the relationship between the time of day and the music you listen to. In this blog post, we will focus on the relationship between artists. 
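The relationships we are about to explore come from the artist field being multi-valued after the processing from part 2: a track with featured artists carries every artist on it. Purely as an illustration (the field names follow the earlier posts, the values are only an example), such a document could look like this:

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="...")  # placeholder connection details

# A track with featured artists ends up with a multi-valued artist field;
# those co-occurring values are what Kibana Graph turns into edges.
doc = {
    "title": "Baby again..",
    "artist": ["Fred again..", "Skrillex", "Four Tet"],
    "listened_to_ms": 253000,
    "@timestamp": "2024-06-01T18:23:00Z",
}
client.index(index="spotify-history", document=doc)
```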
How can we explore relationships? There is a simple solution to this and this is called Kibana Graph. Make sure that you have followed along for the data import in the second blog post, otherwise this won't work. What does Kibana Graph actually do under the hood? Let's assume we have the following documents, each row represents a new document. Now, Kibana Graph will compute all term co-occurrences to build the connections between each node. Therefore, we would expect a graph to look like this: This is a very simple example, but it shows the basic idea of how Kibana Graph works. It takes all the documents and creates a graph from them. There is some terminology involved here: the circles with the artist are known as nodes or vertices and the connection between those circles is known as an edge . There are also a few tuning options that we can use: Significant links Is a feature that helps clear out noisy terms from the dataset. This can be useful when the dataset is very noisy and documents have a large number of terms. This setting is expensive as Kibana Graph has to perform frequency checks on the Elasticsearch side for each request in order to compute the terms score, so it is recommended to turn it off if not strictly required. Certainity By default, this is set to a value of 3 , meaning the link between two artists has to appear at least 3 times in the dataset to be considered a link. I often reduce this value to 0 for other use cases, but for music, this value might be alright since I don't want to see one-time flukes in my graph. Instead, I want to see a relationship between songs I listened to more often. Turning down this value to 0 (or any other value) will increase the potential number of edges. For example, when listening once to a song that features the artists Jamie xx and Fred Again.. , this is enough for the relationship to show up with a value 0 . In contrast, setting it to something higher like 3 means I need to listen to the song at least 3-4 times to see a connection between those two artists. Sample Size The graph doesn't read all documents from the index. Instead, it relies on a sample approach to create the graph. This is done to keep the performance of the graph high. We can change this number to whatever we think is representative of our dataset. However, don't forget to adjust the timeout value as you increase the sample size. Timeout This is easy to explain. It refers to how long Elasticsearch has time to report back. Using Kibana Graph With the fundamentals explained, go to Kibana and click on the Graph app. You will need to select a data view, which is the Spotify History . When prompted to select a field, use the artist field. By default, that should turn out violet with a musical 16th note on it. I had to adjust my sample size to 5000 to get a good starting graph. We can tell that we have multiple artists that are connected to each other. This allows us to select one of those artists and press the + sign. There are already some clusters forming, which is important. Those standalone artists are not interesting to us. There is an all button in the right panel. Select it and press the + again. This will now explode your graph and pull in the additional artists. If we continue this process, increase the sample size and start exploding the graph more and more. Depending on your listening style, you should either get a lot of little islands or a few big clusters. In the next picture, we see one big cluster in the middle that interconnects Rudimental , Fred again.. 
, and Jamie XX . This makes sense, as all of them belong to the same genre which heavily features the same artists. At the same time, we have some tinier islands around. Kraftklub is a German band and is connected to mostly all of the German music I listen to, like Casper and Blond . There are some isolated vertices such as Harry Styles . Let's dig into why Harry Styles is alone. Does he not feature anyone? How does he fit into my listening behavior when all of my other listened-to music is more or less connected based on the featuring of artists? Go to Discover and perform the following: In the search bar, write artist: \"Harry Styles\" . This filters down to all documents that have Harry Styles in the name. We can simply click on the field artist in the field picker on the left side and see that there is only 1 value. Even though I listened to Harry Styles 2535 times, he has never featured another artist (or at least, according to Spotify data, it is not listed as such). Compare that to e.g. Jamie XX and we can see the difference. Conclusion In this blog, we explored relationships and how easy it is to leverage Kibana Graph and Elasticsearch's graph capabilities. Stay tuned for more parts in this blog series! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is a relationship? How can we explore relationships? Using Kibana Graph Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Detecting relationships in data: Spotify Wrapped, part 4 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/find-relationships-in-data","meta_description":"Learn how to find relationships in data with an example. We'll leverage Kibana Graph & Elasticsearch's graph capabilities to explore data relationships."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to implement image similarity search in Elasticsearch Searching through images to find the right one has always been challenging. With similarity image search, you can create a more intuitive search experience. Learn how to implement image search in Elastic. Generative AI RO By: Radovan Ondas On June 20, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Learn how to implement similarity image search in Elastic in just a few steps. Start setting up the application environment, then import the NLP model, and finally complete generating embeddings for your set of images. Get an overview of image similarity search with Elastic >> How to set up your environment The first step is setting up the environment for your application. General requirements include: Git Python 3.9 Docker Hundreds of images It is important to use hundreds of images to ensure the best results. Go to the working folder and check out the repository code created. Then navigate to the repository folder. Because you will be using Python to run the code, you need to make sure all requirements are met and the environment is ready. Now create the virtual environment and install all the dependencies. Elasticsearch cluster and embedding model Log in to your account to spin up an Elasticsearch cluster. Set up a small cluster with: One HOT node with 2GB of memory One ML (Machine learning) node with 4GB of memory (The size of this node is important as the NLP model you will import into Elasticsearch consumes ~1.5GB of memory.) After your deployment is ready, go to Kibana and check the capacity of your machine learning nodes. You will see one machine learning node in the view. There is no model loaded at the moment. Upload the CLIP embedding model from OpenAI using the Eland library. Eland is a Python Elasticsearch client for exploring and analyzing data in Elasticsearch and is able to handle both text and images. You'll use this model to generate embeddings from the text input and query for matching images. Find more details in the documentation of the Eland library. For the next step, you will need the Elasticsearch endpoint. You can get it from the Elasticsearch cloud console in the deployment detail section. Using the endpoint URL, execute the following command in the root directory of the repository. The Eland client will connect to the Elasticsearch cluster and upload the model into the machine learning node. You refer to your actual cluster URL with the –url parameter, for example, below refers to ‘image-search.es.europe-west1.gcp.cloud.es.io’ as cluster URL. Enter the Eland import command. The output will be similar to the following: The upload might take a few minutes depending on your connection. 
When finished, check the list of Trained models on the machine learning Kibana page: Menu -> Analytics -> Machine Learning -> Model management ->Trained models. Verify that the NLP Clip model is in the state ‘started’. If you receive a message on the screen — ML job and trained model synchronization required — click on the link to synchronize models. How to create image embeddings After setting up the Elasticsearch cluster and importing the embedding model, you need to vectorize your image data and create image embeddings for every single image in your data set. To create image embeddings, use a simple Python script. You can find the script here: create-image-embeddings.py . The script will traverse the directory of your images and generate individual image embeddings. It will create the document with the name and relative path and save it into an Elasticsearch index ‘ my-image-embeddings ’ using the supplied mapping . Put all your images (photos) into the folder ‘ app%2Fstatic%2Fimages ’. Use a directory structure with subfolders to keep the images organized. Once all images are ready, execute the script with a few parameters. It is crucial to have at least a few hundred images to achieve reasonable results. Having too few images will not give the expected results, as the space you will be searching in will be very small and distances to search vectors will be very similar. In the folder image_embeddings, run the script and use your values for the variables. Depending on the number of images, their size, your CPU, and your network connection, this task will take some time. Experiment with a small number of images before you try to process the full data set. After the script completes, you can verify if the index my-image-embeddings exists and has corresponding documents using the Kibana dev tools. Looking at the documents, you will see very similar JSON objects (like the example). You will see the image name, image id, and the relative path inside the images folder. This path is used in the frontend application to properly display the image when searching. The most important part of the JSON document is the ‘ image_embedding ’ that contains the dense vector produced by the CLIP model. This vector is used when the application is searching for an image or a similar image. Use the Flask application to search images Now that your environment is all set up, you can take the next step and actually search images using natural language and find similar images, using the Flask application that we provide as a proof of concept. The web application has a simple UI that makes image search simple. You can access the prototype Flask application in this GitHub repo . The application in the background performs two tasks. After you input the search string into the search box, the text will be vectorized using the machine learning _infer endpoint. Then, the query with your dense vector is executed against the index my-image-embeddings with the vectors. You can see those two queries in the example. The first API call uses the _infer endpoint, and the result is a dense vector. In the second task, search query, we will utilize the dense vector and get images sorted by score. To get the Flask application up and running, navigate to the root folder of the repository and configure the .env file. The values in the configuration file are used to connect to the Elasticsearch cluster. You need to insert values for the following variables. These are the same values used in the image embedding generation. 
ES_HOST='URL:PORT' ES_USER='elastic' ES_PWD='password' When ready, run the flask application in the main folder and wait until it starts. If the application starts, you will see an output similar to the below, which at the end indicates which URL you need to visit to access the application. Congrats! Your application should now be up and running and accessible on http:%2F%2F127.0.0.1:5001 via the internet browser. Navigate to the image search tab and input the text that describes your image best. Try to use a non-keyword or descriptive text. In the example below, the text entered was “ endless route to the top .” The results are shown from our data set. If a user likes one particular image in the result set, simply click the button next to it, and similar images will display. Users can do this endless times and build their own path through the image data set. The search also works by simply uploading an image. The application will convert the image into a vector and search for a similar image in the data set. To do this, navigate to the third tab Similar Image , upload an image from the disk, and hit Search . Because the NLP ( sentence-transformers%2Fclip-ViT-B-32-multilingual-v1 ) model we are using in Elasticsearch is multilingual and supports inference in many languages, try to search for the images in your own language. Then verify the results by using English text as well. It’s important to note that the models used are generic models, which are pretty accurate but the results you get will vary depending on the use case or other factors. If you need higher accuracy, you will have to adapt a generic model or develop your own model — the CLIP model is just intended as a starting point. Code summary You can find the complete code in the GitHub repository . You may be inspecting the code in routes.py , which implements the main logic of the application. Besides the obvious route definition, you should focus on methods that define the _infer and _search endpoints ( infer_trained_model and knn_search_images ). The code that generates image embeddings is located in create-image-embeddings.py file. Summary Now that you have the Flask app set up, you can search your own set of images with ease! Elastic provides native integration of vector search within the platform, which avoids communication with external processes. You get the flexibility to develop and employ custom embedding models that you may have developed using PyTorch. Semantic image search delivers the following benefits of other traditional approaches to image search: Higher accuracy: Vector similarity captures context and associations without relying on textual meta descriptions of the images. Enhanced user experience: Describe what you’re looking for, or provide a sample image, compared to guessing which keywords may be relevant. Categorization of image databases: Don’t worry about cataloging your images — similarity search finds relevant images in a pile of images without having to organize them. If your use case relies more on text data, you can learn more about implementing semantic search and applying natural language processing to text in previous blogs. For text data, a combination of vector similarities with traditional keyword scoring presents the best of both worlds. Ready to get started? Sign up for a hands-on vector search workshop at our virtual event hub and engage with the community in our online discussion forum . 
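If you want to reproduce those two calls outside the Flask app, here is a rough sketch with the Python client. The model ID follows Eland's naming convention for imported models, and the response paths and stored field names are assumptions based on this post, so check them against your own cluster (the Trained Models page and the index mapping) before relying on them.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="...")  # placeholder connection details

# Step 1: vectorize the search text with the deployed CLIP model (_infer endpoint).
model_id = "sentence-transformers__clip-vit-b-32-multilingual-v1"  # assumed Eland-style model ID
infer = client.ml.infer_trained_model(
    model_id=model_id,
    docs=[{"text_field": "endless route to the top"}],
)
query_vector = infer["inference_results"][0]["predicted_value"]

# Step 2: kNN search against the stored image embeddings.
resp = client.search(
    index="my-image-embeddings",
    knn={
        "field": "image_embedding",
        "query_vector": query_vector,
        "k": 5,
        "num_candidates": 100,
    },
    source=["image_name", "relative_path"],  # assumed metadata field names
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"])
```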
Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to How to set up your environment Elasticsearch cluster and embedding model How to create image embeddings Use the Flask application to search images Code summary Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to implement image similarity search in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/implement-image-similarity-search-elastic","meta_description":"Searching through images to find the right one has always been challenging. With similarity image search, you can create a more intuitive search experience. Learn how to implement image search in Elastic."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Evaluating search relevance Blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes. Part1 ML Research Python July 16, 2024 Evaluating search relevance part 1 - The BEIR benchmark Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes. 
TP TV By: Thanos Papaoikonomou and Thomas Veasey Part2 ML Research Python September 19, 2024 Evaluating search relevance part 2 - Phi-3 as relevance judge Using the Phi-3 language model as a search relevance judge, with tips & techniques to improve the agreement with human-generated annotation. TP TV By: Thanos Papaoikonomou and Thomas Veasey Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Evaluating search relevance - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/evaluating-search-relevance","meta_description":"Blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Making Lucene faster with vectorization and FFI/madvise Discover how modern Java features, including vectorization and FFI/madvise, are speeding up Lucene's performance. Lucene CH By: Chris Hegarty On April 17, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Over in Lucene-land we've been eagerly adopting features of new Java versions. These features bring Lucene closer to both the JVM and the underlying hardware, which improves performance and stability. This keeps Lucene modern and competitive. The next major release of Lucene, Lucene 10, will require a minimum of Java 21. Let's take a look at why we decided to do this and how it will benefit Lucene. Foreign memory For efficiency reasons, indices and their various supporting structures are stored outside of the Java heap - they are stored on-disk and mapped into the process' virtual address space. Until recently, the way to do this in Java is with direct byte buffers, which is exactly what Lucene has been doing. Direct byte buffers have some inherent limitations. For example, they can address a maximum of 2GB, requiring more structures and code to span over larger sizes. However, most significant is the lack of deterministic closure, which we workaround by calling Unsafe::invokeCleaner to effectively close the buffer and release the memory. This is, as the name suggests, inherently an unsafe operation. Lucene adds safeguards around this, but, by definition, there is still a miniscule risk of failure if memory were to be accessed after it was released. More recently Java has added MemorySegment , which overcomes the limitations that we encounter with direct byte buffers. 
We now have safe deterministic closure and can address memory far beyond that of previous limits. While Lucene 9.x already has optional support for a mapped directory implementation backed by memory segments, the upcoming Lucene 10 drops support for byte buffers. All this means that Lucene 10 only operates with memory segments, so is finally operating in a safe model. Foreign function Different workloads; search or indexing, or different types of data, say, doc values or vector embeddings, have different access patterns. As we've seen, because of the way Lucene maps its index data, interaction with the operating system page cache is crucial to performance. Over the years a lot of effort and consideration has been given to optimizations around memory usage and page cache. First through native JNI code that calls madvise directly, and later with a directory implementation that uses direct I/O. However, while good at the time, both these solutions are a little less than ideal. The former requires platform specific builds and artifacts, and the latter leverages an optional JDK-specific API . For these reasons, neither solution is part of Lucene core, but instead lives in the further afield misc module. Mike McCandless has a good blog about this, from 2010! On modern Java we can now use the Panama Foreign Function Interface (FFI) to call library functions native on the system. We use this, directly in Lucene core , to call posix_madvise from the Standard C library - all from Java, and without the need for any JNI code or non-standard features. With this we can now advise the system about the type of memory access patterns we intend to use. Vectorization Parallelism and concurrency, while distinct, often translate to \"splitting a task so that it can be performed more quickly\", or \"doing more tasks at once\". Lucene is continually looking at new algorithms and striving to implement existing ones in more performant and efficient ways. One area that is now more straightforward to us in Java is data level parallelism - the use of SIMD (Single Instruction Multiple Data) vector instructions to boost performance. Lucene is using the latest JDK Vector API to implement vector distance computations that result in efficient hardware specific SIMD instructions. These instructions, when run on supporting hardware, can perform floating point dot product computations 8 times faster than the equivalent scalar code. This blog contains more specific information on this particular optimization. With the move to Java 21 minimum, it is a lot more straightforward to see how we can use the JDK Vector API in more places. We're even experimenting with the possibility of calling customized SIMD implementations with FFI, since the overhead of the native call is now quite minimal. Conclusion While the latest Lucene 9.x releases are able to benefit from many of the recent Java features, the requirement to run on versions of Java as early as Java 11 means that we're reaching a level of complexity with 9.x that, while maybe still ok today, is not where we want to be in the future. The upcoming Lucene 10 will be closer to the JVM and the hardware than ever before. By requiring a minimum of Java 21, we are able to drop the older direct byte buffer directory implementation, reliably advise the system about memory access patterns through posix_madvise , and continue our efforts around levering hardware accelerated instructions. 
Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Foreign memory Foreign function Vectorization Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Making Lucene faster with vectorization and FFI/madvise - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/lucene-and-java-moving-forward-together","meta_description":"Discover how modern Java features, including vectorization and FFI/madvise, are speeding up Lucene's performance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to make your own Spotify Wrapped in Kibana Based on the downloadable Spotify personal history, we'll make a custom version of \"Spotify Wrapped\" with the top artists, songs, and trends over the year How To IF By: Iulia Feroli On January 14, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. I am probably not the only one who was a little disapointed by the Spotify Wrapped this year (and the internet seems to agree). Looking back at our yearly musical history has become a highly anticipated moment of the year for heavy Spotify users. However, at the end of the day all \"Wrapped\" is, is a data analytics problem, with a great PR team. 
So perhaps the mantle must fall on fellow data analysts to attempt to solve this problem in a more satisfying way. With the back-to-work and brand-new-year motivation fueling us - let's see if we can do any better. (Spoiler alert: we definitely can!) Getting started with your custom Spotify Wrapped The best part about this exercise is that it's fully replicable. Spotify allows users to download their own historical streaming data via this link , which you can request out of your account settings. If you want to generate your own version of this dashboard - request your own data to run this example through! Please note, that this could take a few days to a few weeks but should take no longer than 30 days. You will need to confirm that you would like this data and a copy will be sent to your email directly. Alternatively, you can try it out first on the sample data I've provided with a reduced sub-section from my own data. Once this has been generated we can dive into years worth of data and start building our own fun dashboards. Check out this notebook for the code examples. These were built and run using Elasticsearch 8.15 and Python 3.11. To get started with this notebook be sure to first clone the repository and download the required packages: Historical data will be generated as a list of JSON documents pre-formatted and chunked by Spotify, where each json represents an action. In most cases such an action means a song that you've listened to with some additional metadata such as length of time in milliseconds, artist information, as well as device properties. Naturally, if you have any experience with Elastic, the first thought looking at this data would be that this data practically screams \"add me to an index and search me!\". So we will do just that. Building an Elasticsearch Index As you can see in the same notebook once you've connected to your preferred Elasticsearch client (in my case Python, but any language client could be used), it takes a few simple lines of code to send the json documents into a new index: Even the mapping is handled automatically by Elastic due to the high quality and consistency of the data as prepared by Spotify, so you do not need to define it when indexing. One key element to pay attention to is noticing fields like \"Artist Name\" are seen as keywords which will allow us to run more complex aggregations for our dashboards. Wrapping queries to explore your listening habits With the index fully populated you can explore the data through code to run a few simple test queries. For example, my top artist has been Hozier for quite a few years now, so I start with the simplest possible term query to check my data: This gives me back 5653 hits - which means I've played more than five thousand Hozier songs since 2015 (as far as my data goes back). Seems pretty accurate. You can run the same test, or query any of the other fields like album name or song title with a simple text match query. The next steps in the notebook are to build more complex queries, like the most anticipated question - is my top artist list in Wrapped accurate? You can calculate this by either number of hits (how many times songs have been played) or perhaps more accurately, by summing up the total number of milliseconds of playtime by artist bucket. You can read more about aggregations in Elasticsearch here. Building Spotify Wrapped dashboards After these few examples you should have a good understanding of the Elasticsearch mechanics you can use to drill into this data. 
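Before moving on to dashboards, here is roughly what those two examples look like in code. Treat it as a sketch: the index name and the ms_played field are assumptions (they depend on how you indexed your export), while artist_name matches the fields mentioned in this post.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("https://localhost:9200", api_key="...")  # placeholder connection details

# How many Hozier plays are in my history? (the simple term-style query from above)
hozier = client.search(
    index="spotify-history",
    query={"match": {"artist_name": "Hozier"}},
    size=0,
    track_total_hits=True,
)
print(hozier["hits"]["total"]["value"])

# Rank artists by total listening time in milliseconds instead of play count.
top_artists = client.search(
    index="spotify-history",
    size=0,
    aggs={
        "by_artist": {
            "terms": {
                "field": "artist_name",
                "size": 10,
                "order": {"total_ms_played": "desc"},  # order buckets by listening time
            },
            "aggs": {"total_ms_played": {"sum": {"field": "ms_played"}}},
        }
    },
)
for bucket in top_artists["aggregations"]["by_artist"]["buckets"]:
    print(bucket["key"], bucket["total_ms_played"]["value"])
```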
However, to both save time and make the insights more consumable (and pretty) you can also build a lot of these insights directly in a Kibana dashboard. In my case, I've run these in my cloud Elastic cluster but this can also be run locally. You first need to build a data view from the index and then you can directly build visualizations by dragging the data fields and choosing view types. The best part is - you can really make it your own by choosing the visualization type, date range you want to explore, which fields to showcase, etc. Just pick a visual and drag the needed fields into the graph. For most examples we will use ts (the time field) as a horizontal axis, and a combination of record counts or aggregations over the song_name or artist_name fields on the vertical axis. Within a few hours I built my own Spotify Wrapped - Iulia's Version, going deeper than ever before. Let's take a look. Starting with the \"classic\" wrapped insights - I've first built the top artist and song rank. Here's an example of how one of these graphs is built: Looking at the points of interest in this graph if you want to recreate it: make sure to select the correct time interval for your data to cover 2024 in 1 choose to show the top values of the artist name field, and exclude the other bucket to make your visualization neat in 2 map this against the count of records to rank the artists based on how many times they appear in the data (equivalent to time the songs were played) in 3 From here, I've gone even further by adding more metadata like time or location and looking at how these trends have changed throughout the year. Here you can see the listening time over the year (in weekly buckets), the locations I've been listening from while traveling, and how my top artists have varied month by month (including a sighting of brat summer). Some more tricks worth noting for these graphs: when you work with the playing time instead of just count of records, you should choose to aggregate all the instances of a song or artist being played by using the sum function. This is the kibana equivalent to the aggs operator we were using in the code in the first notebook examples. you can additionally convert the milliseconds into minutes or hours for neater visualisation *you can layer as many field breakdowns as you want, like for example adding the top 3 artists name on top of the monthly aggregations. Comparing my final dashboard to my actual Wrapped - it seems the results were close enough, but maybe not entirely accurate. It seems this year the top song choices are a little off from the way I calculate my ranking in this example. It could be that Spotify used a different formula to build this ranking, which makes it a bit harder to interpret. That's one of the benefits of building this dashboard from scratch - you have full transparency on the type of aggregations and scoring used for your insights. Finally, to really drive the point of the full customization benefit - I've had my colleague Elisheva also send me her own 2024 data. Here's another dashboard example, this time with a few more swifty insights: This time I've highlighted the albums breakdown since it gives some cool \"Eras\" insights' and added the \"hours of playtime per album per month\" - from which you can really pinpoint when Tortured Poets came out as an extra fun treat. 
Make your own Spotify Wrapped Just having the data stored in an index makes this a really fun and simple Elasticsearch use case, really showcasing some of the coolest features like aggregations or custom visualizations - and I hope to inspire you to try out your very own search engine and personal dashboard! Explore other parts of this series: Spotify Wrapped part 2: Diving deeper into the data Spotify Wrapped part 3: Anomaly detection population jobs Spotify Wrapped part 4: Detecting relationships in data Spotify Wrapped part 5: Finding your best music friend with vectors Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Getting started with your custom Spotify Wrapped Building an Elasticsearch Index Wrapping queries to explore your listening habits Building Spotify Wrapped dashboards Make your own Spotify Wrapped Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to make your own Spotify Wrapped in Kibana - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/spotify-wrapped-create-in-kibana","meta_description":"Here's how you can make your own Spofity Wrapped in Kibana with the top artists, songs, and trends over the year."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data from AWS S3 into Elastic Cloud - Part 1 : Elastic Serverless Forwarder Learn how to ingest data from AWS S3 using Elastic Serverless Forwarder (ESF). 
Ingestion How To HL By: Hemendra Singh Lodhi On October 2, 2024 Part of Series How to ingest data from AWS S3 into Elastic Cloud Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. This is the first installment in a multi-part blog series exploring different options for ingesting data from AWS S3 into Elastic Cloud . Elasticsearch offers several options for ingesting data from AWS S3 buckets, allowing customers to select the most suitable method based on their specific needs and architectural strategy. These are the key options for ingesting data from AWS S3: Elastic Serverless Forwarder (ESF) - our focus in this blog Elastic Agent - part 2 Elastic S3 Native Connector - part 3 Data ingestion options comparison Features ESF Elastic Agent S3 Connector Logs ✅ ✅ ✅[[^1]] Metrics ❌ ✅ ✅[[^2]] Cost Medium-Lambda,SQS Low-EC2,SQS Low-Elastic Enterprise Search Scaling Auto - Unlimited EC2 instance size Enterprise Search Node size Operation Low - Monitor Lambda function High - Manage Agents Low PrivateLink ✅ ✅ NA (Pull from S3) Primary Use Case Logs Logs & Metrics Content & Search Note1: ESF doesn't support metrics collection due to AWS limitation on services that can trigger Lambda function and you can't invoke Lambda using subscription filter on CloudWatch metrics. However, taking cost consideration into account it is possible to store metrics in S3 and via SQS trigger ingest into Elastic. Note2: [[^1]][[^2]]Although S3 connector can pull logs and metrics from S3 bucket, it is most suited for ingesting content, files, images and other data types In this blog we will focus on how to ingest data from AWS S3 using Elastic Serverless Forwarder(ESF). In the next parts, we will explore Elastic Agent and Elastic S3 Native Connector methods. Let's begin. Follow these steps to launch the Elastic Cloud deployment: Elastic Cloud Create an account if not created already and create an Elastic deployment in AWS . Once the deployment is created, note the Elasticsearch endpoint. This can be found in the Elastic Cloud console under -> Manage -> Deployments . Elastic Serverless Forwarder The Elastic Serverless Forwarder is an AWS Lambda function that forward logs such as VPC Flow logs, WAF, Cloud Trail etc. from AWS environment to Elastic. It can be used to send data to Elastic Cloud as well as self-managed deployment. Features of Elastic Serverless Forwarder Support multiple inputs S3 (via SQS event notification) Kinesis Data Streams CloudWatch Logs subscription filters SQS message payload At least once delivery using \"continuing queue\" and \"replay queue\" (created automatically by serverless forwarder) Support data transfer over PrivateLink which allows data transfer within the AWS Virtual Private Cloud (or VPC) and not on public network. 
Lambda function is an AWS Serverless compute managed service with automatic scaling in response to code execution request Function execution time is optimised with optimal memory size allocated as required Pay as you go pricing, only pay for compute time during Lambda function execution and for SQS event notification Data flow: I ngesting data from AWS S3 with Elastic Serverless Forwarder We will use S3 input with SQS notification to send VPC flow logs to Elastic Cloud: VPC flow log is configured to write to S3 bucket Once log is written to S3 bucket, S3 event notification (S3:ObjectCreated) is sent to SQS SQS event notification containing event metadata triggers the Lambda function which read the logs from the bucket Continuing queue is created when forwarder is deployed and ensures at least once delivery. Forwarder keeps track of last event sent and helps in processing pending events when forwarder function exceed runtime of 15 min (Lambda max default) Replay queue is also created when forwarder is deployed and handles log ingestion exceptions. Forwarder keeps track of failed events and writes them to the replay queue for later ingestion. For e.g. in my testing, I put the wrong Elastic API key, causing authentication failure, which filled up the replay queue. You can enable the replay queue as a trigger for the ESF lambda function to consume the messages from the S3 bucket again. It is important to address the delivery failure first; otherwise message will accumulate in the replay queue. You can set this trigger permanently but may need to remove/re-enable depending on the message failure issue. To enable the trigger go to SQS -> elastic-serverless-forwarder-replay-queue- -> under Lambda triggers -> Configure Lambda function trigger -> Select the ESF lamnda function Setting up Elastic Serverless Forwarder for AWS S3 data ingestion Create S3 Bucket s3-vpc-flow-logs-elastic to store VPC flow logs AWS Console -> S3 -> Create bucket. You may leave other settings as default or change as per the requirements: Copy the bucket ARN, required to configure flow logs in next step: Enable VPC Flow logs and send to S3 bucket s3-vpc-flow-logs-elastic AWS Console -> VPC -> Select VPC -> Flow logs. Leave other settings as is or change as per the requirements: Provide name of the flow logs, select what filters to apply, aggregation interval and destination for the flow log storage: Once done, it will look like below with S3 as the destination. Going forward all the flow traffic through this VPC will be stored in the bucket s3-vpc-flow-logs-elastic : Create SQS queue Note 1: Create SQS queue in same region as S3 bucket Note 2: Set the visiblity timeout of 910 second which is 10 sec more than AWS Lambda function max runtime of 900 sec. AWS Console -> Amazon SQS -> Create queue Provide queue name and update visiblity timeout to 910 sec. Lambda function runs for max 900 sec (15min) and setting a higher value for visibility timeout allows consumer Elastic Serverless Forwarder(ESF) to process and delete the message from the queue: Update the SQS Access Policy (Advance) to allow S3 bucket to send notification to SQS queue. Replace account-id with your AWS account ID. Keep other options as default. Here, we are specifying S3 to send message to SQS queue (ARN) from the S3 bucket: More details on permission requirement (IAM user) for AWS integration is available here . Copy the SQS ARN, in queue setting under Details : Enable VPC flow log event notification in S3 bucket AWS Console > S3. 
Select bucket s3-vpc-flow-logs-elastic -> Properties and Create event notification Provide name and on what event type you want to trigger SQS. We have selected object create when any object is added to the bucket: Select destination as SQS queue and choose sqs-vpc-flow-logs-elastic-serverless-forwarder : Once saved, configuration will look like below: Create another S3 bucket to store configuration file for Elastic Serverless Forwarder: Create a file named config.yaml and update with below configuration. Full set of options here : input type : s3-sqs . We are using S3 with SQS notification option output : elasticsearch_url : elasticsearch endpoint from Elastic Cloud deployment Create section above api_key : Create Elasticsearch API key (User API key) using instruction here es_datastream_name : forwarder supports automatic routing of aws.cloudtrail, aws.cloudwatch_logs, aws.elb_logs, aws.firewall_logs, aws.vpcflow, and aws.waf logs . For other log types you can set it to the naming convention required. Leave other options as default. Upload the config.yaml in s3 bucket s3-vpc-flow-logs-serverless-forwarder-config : Install AWS integration assets Elastic integrations comes pre-packaged with assets that simplify collection, parsing , indexing and visualisation. The integrations uses data stream with specific naming convention for indices which is helpful in getting started. Forwarder can write to any other stream name too. Follow the steps to install Elastic AWS integration. Kibana -> Management -> Integrations, Search for AWS: Deploy the Elastic Serverless Forwarder There are several options available to deploy Elastic Serverless Forwarder from SAR (Serverless Application Repository): Using AWS Console Using AWS Cloudformation Using Terraform Deploy directly which provides more customisation options We will use AWS Console option to deploy ESF. Note : Only one deployment per region is allowed when using the AWS console directly. AWS Console -> Lambda -> Application -> Create Application , search for elastic-serverless-forwarder: Under Application settings provide the following details: Application name - elastic-serverless-forwarder ElasticServerlessForwarderS3Buckets - s3-vpc-flow-logs-elastic ElasticServerlessForwarderS3ConfigFile - s3://s3-vpc-flow-logs-serverless-forwarder-config/config.yaml ElasticServerlessForwarderS3SQSEvent - arn:aws:sqs:ap-southeast-2:xxxxxxxxxxx:sqs-vpc-flow-logs-elastic-serverless-forwarder On successful deployment, status of Lambda deployment should be Create Complete : Below are the SQS queues automatically created upon successful deployment of ESF: Once everything is set up correctly, published flow logs in S3 bucket s3-vpc-flow-logs-elastic will send notification to SQS and you will see the messages available in the queue sqs-vpc-flow-logs-elastic-serverless-forwarder to be consumed by ESF. In case of issues such as SQS message count keep on increasing then check the Lambda execution logs Lambda -> Application -> serverlessrepo-elastic-serverless-forwarder-ElasticServerlessForwarderApplication* -> Monitoring -> Cloudwatch Log Insights. Click on LogStream for detailed information: More on troubleshooting here . Validate VPC flow logs in Kibana Discover and Dashboard Kibana -> Discover . This will show VPC flow logs: Kibana -> Dashboards . Look for VPC Flow log Overview dashboard: More dashboards! As mentioned earlier, AWS integration provides pre-built dashboards in addition to other assets. 
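Before wrapping up, here is the rough shape of the config.yaml created earlier. The exact file isn't reproduced in this post, so treat this as an illustrative sketch: the key names follow the Elastic Serverless Forwarder documentation, the SQS ARN matches the queue used above, and the endpoint, API key, and data stream name are placeholders to replace with your own values.

```yaml
inputs:
  - type: "s3-sqs"
    id: "arn:aws:sqs:ap-southeast-2:xxxxxxxxxxx:sqs-vpc-flow-logs-elastic-serverless-forwarder"
    outputs:
      - type: "elasticsearch"
        args:
          elasticsearch_url: "https://<your-deployment>.es.<region>.aws.elastic-cloud.com:443"
          api_key: "<elasticsearch-api-key>"
          es_datastream_name: "logs-aws.vpcflow-default"
```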
We can monitor involved AWS services in our setup using the Elastic agent ingestion method which we will cover in Part 2 of this series. This will help in tracking usage and help in optimisation. Conclusion Elasticsearch provides multiple options to sync data from AWS S3 into Elasticsearch deployments. In this walkthrough, we have demonstrated that it is relatively easy to implement Elastic Serverless Forwarder(ESF) ingestion options to ingest data from AWS S3 and leverage Elastic's industry-leading search & analytics capabilities. In Part 2 of this series , we'll dive into using Elastic Agent as another option for ingesting AWS S3 data. And in part 3 , we'll explain how to ingest data from AWS S3 using the Elastic S3 Native connector. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Data ingestion options comparison Elastic Cloud Elastic Serverless Forwarder Features of Elastic Serverless Forwarder Data flow: Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to ingest data from AWS S3 into Elastic Cloud - Part 1 : Elastic Serverless Forwarder - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ingest-aws-s3-data-elastic-cloud-elastic-serverless-forwarder","meta_description":"Learn how to ingest data from AWS S3 into Elastic Cloud using the Elastic Serverless Forwarder. 
Follow this guide to start the AWS S3 ingesting process."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Ruby Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Ruby +1 December 13, 2024 How to migrate your Ruby app from OpenSearch to Elasticsearch A guide to migrate a Ruby codebase from the OpenSearch client to the Elasticsearch client. FB By: Fernando Briano ES|QL Ruby +1 October 24, 2024 How to use the ES|QL Helper in the Elasticsearch Ruby Client Learn how to use the Elasticsearch Ruby client to craft ES|QL queries and handle their results. FB By: Fernando Briano Ruby How To October 16, 2024 How to use Elasticsearch with popular Ruby tools Take a look at how to use Elasticsearch with some popular Ruby libraries. FB By: Fernando Briano Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ruby - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/ruby-programming","meta_description":"Ruby articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Better Binary Quantization (BBQ) in Lucene and Elasticsearch How Better Binary Quantization (BBQ) works in Lucene and Elasticsearch. Lucene Vector Database BT By: Benjamin Trent On November 11, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Embedding models output float32 vectors, often too large for efficient processing and practical apps. Elasticsearch supports int8 scalar quantization to reduce vector size while preserving performance. Other methods reduce retrieval quality and are impractical for real world use. In Elasticsearch 8.16 and Lucene, we introduced Better Binary Quantization (BBQ), a new approach developed from insights drawn from a recent technique - dubbed “ RaBitQ ” - proposed by researchers from Nanyang Technological University, Singapore. BBQ is a leap forward in quantization for Lucene and Elasticsearch, reducing float32 dimensions to bits, delivering ~95% memory reduction while maintaining high ranking quality. BBQ outperforms traditional approaches like Product Quantization (PQ) in indexing speed (20-30x less quantization time), query speed (2-5x faster queries), with no additional loss in accuracy. In this blog, we will explore BBQ in Lucene and Elasticsearch, focusing on recall, efficient bitwise operations, and optimized storage for fast, accurate vector search. 
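To put the ~95% memory reduction figure in perspective, here is a back-of-the-envelope calculation. It is my own illustration, not from the post; it uses the per-vector sizes described later in the article (1 bit per dimension plus a few float corrections) and ignores HNSW graph overhead.

```python
# Rough per-vector storage comparison for a 1M x 1024-dimension corpus.
dims, num_vectors = 1024, 1_000_000

raw_bytes = dims * 4              # float32: 4 bytes per dimension
bbq_bytes = dims // 8 + 3 * 4     # 1 bit per dimension + 3 float corrections (dot product)

print(f"raw: {raw_bytes * num_vectors / 2**30:.2f} GiB per 1M vectors")   # ~3.81 GiB
print(f"bbq: {bbq_bytes * num_vectors / 2**30:.3f} GiB per 1M vectors")   # ~0.130 GiB
print(f"reduction: {1 - bbq_bytes / raw_bytes:.1%}")                      # ~96.6%
```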
Note, there are differences between this implementation and the one proposed by the original RaBitQ authors. Mainly: Only a single centroid is used, for simple integration with HNSW and faster indexing Because we don't randomly rotate the codebook, we do not have the property that the estimator is unbiased over multiple invocations of the algorithm Rescoring is not dependent on the estimated quantization error Rescoring is not completed during graph index search and is instead reserved for after the initial estimated vectors are calculated Dot product is fully implemented and supported. The original authors focused on Euclidean distance only. While support for dot product was hinted at, it was not fully considered, implemented, nor measured. Additionally, we support max-inner product, where the vector magnitude is important, so simple normalization just won't suffice. What does the \"better\" in Better Binary Quantization mean? In Elasticsearch 8.16 and in Lucene, we have introduced what we call \"Better Binary Quantization\". Naive binary quantization is exceptionally lossy, and achieving adequate recall requires gathering 10x or 100x additional neighbors to rerank. This just doesn't cut it. In comes Better Binary Quantization! Here are some of the significant differences between Better Binary Quantization and naive binary quantization: All vectors are normalized around a centroid. This unlocks some nice properties in quantization. Multiple error correction values are stored. Some of these corrections are for the centroid normalization, some are for the quantization. Asymmetric quantization. Here, while the vectors themselves are stored as single-bit values, queries are only quantized down to int4. This significantly increases search quality at no additional cost in storage. Bit-wise operations for fast search. The query vectors are quantized and transformed in such a way that allows for efficient bit-wise operations. Indexing with Better Binary Quantization (BBQ) Indexing is simple. Remember, Lucene builds individual read-only segments. As vectors come in for a new segment, the centroid is incrementally calculated. Then, once the segment is flushed, each vector is normalized around the centroid and quantized. Here is a small example: v_{1} = [0.56, 0.85, 0.53, 0.25, 0.46, 0.01, 0.63, 0.73] \\newline c = [0.65, 0.65, 0.52, 0.35, 0.69, 0.30, 0.60, 0.76] \\newline v_{c1}' = v_{1} - c = [-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03] \\newline bin(v_{c1}') = \\left\\{ \\begin{cases} 1 & x \\gt 0 \\\\ 0 & otherwise \\end{cases} : x \\in v_{c1}' \\right\\} \\newline bin(v_{c1}') = [0, 1, 1, 0, 0, 0, 0, 0] \\newline 0b00000110 = 6 When quantizing down to the bit level, 8 floating point values are transformed into a single 8-bit byte.
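As a toy illustration of the binarization and packing just shown (not the actual Lucene code; the helper name and the LSB-first packing convention are my own choices for the sketch):

```python
# Take the centroid-centered vector, keep one sign bit per dimension, and pack
# 8 bits per byte (dimension 0 in the lowest bit), mirroring the worked example.
import numpy as np

def binarize_and_pack(centered: np.ndarray) -> bytes:
    bits = (centered > 0).astype(np.uint8)        # 1 where the centered component is positive
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for j, bit in enumerate(bits[i:i + 8]):
            byte |= int(bit) << j                 # dimension i + j goes into bit j
        packed.append(byte)
    return bytes(packed)

# v_c1' = v1 - c from the example above
vc1 = np.array([-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03])
print(binarize_and_pack(vc1))                     # b'\x06' -> 0b00000110 == 6
```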
Then, each of the bits are packed into a byte and stored in the segment along with any error correction values required for the vector similarity chosen. For each vector, bytes stored are dims/8 number of bytes and then any error correction values; 2 floating point values for Euclidean, or 3 for dot product. A quick side quest to talk about how we handle merging When segments are merged, we can take advantage of the previously calculated centroids. Simply doing a weighted average of the centroids and then re-quantizing the vectors around the new centroid. What gets tricky is ensuring HNSW graph quality and allowing the graph to be built with the quantized vectors. What's the point of quantizing if you still need all the memory to build the index?! In addition to appending vectors to the existing largest HNSW graph, we need to ensure vector scoring can take advantage of asymmetric quantization. HNSW has multiple scoring steps: one for the initial collection of neighbors, and another for ensuring only diverse neighbors are connected. In order to efficiently use asymmetric quantization, we create a temporary file of all vectors quantized as 4bit query vectors. So, as a vector is added to the graph we first: Get the already quantized query vector that is stored in the temporary file. Search the graph as normal using the already existing bit vectors. Once we have the neighbors, diversity and reverse-link scoring can be done with the previously int4 quantized values. After the merge is complete, the temporary file is removed leaving only the bit quantized vectors. The temporary file stores each query vector as an int4 byte array which takes dims/2 number of bytes, some floating point error correction values (3 for Euclidean, 4 for dot product), and a short value for the sum of the vector dimensions. Asymmetric quantization, the interesting bits I have mentioned asymmetric quantization and how we lay out the queries for graph building. But, how are the vectors actually transformed? How does it work? The \"asymmetric\" part is straight forward. We quantize the query vectors to a higher fidelity. So, doc values are bit quantized and query vectors are int4 quantized. What gets a bit more interesting is how these quantized vectors are transformed for fast queries. Taking our example vector from above, we can quantize it to int4 centered around the centroid. 
v_{c1}' = v_{1} - c = [-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03] \\newline max_{v_{c1}'} = max(v_{c1}') = 0.19, \\; min_{v_{c1}'} = min(v_{c1}') = -0.38 \\newline Q(x_{s}) = \\{ (x - min_{v_{c1}'}) \\times \\frac{15}{max_{v_{c1}'} - min_{v_{c1}'}} : x \\in x_{s} \\} \\newline Q(v_{c1}') = \\{ (x - (-0.38)) \\times \\frac{15}{0.19 - (-0.38)} : x \\in v_{c1}' \\} = \\{ (x + 0.38) \\times 26.32 : x \\in v_{c1}' \\} = [8, 15, 10, 7, 4, 0, 9, 9] With the quantized vector in hand, this is where the fun begins. So that the vector comparison can be translated into a bitwise dot product, the bits are shifted. It's probably better to just visualize what is happening: Here, each int4 quantized value has its relative positional bits shifted to a single byte. Note how all the first bits are packed together first, then the second bits, and so on. But how does this actually translate to dot product? Remember, dot product is the sum of the component products. For the above example, let's write this fully out: bin(v_{c1}') \\cdot Q(v_{c1}') = [0, 1, 1, 0, 0, 0, 0, 0] \\cdot [8, 15, 10, 7, 4, 0, 9, 9] \\newline = 0 \\times 8 + 1 \\times 15 + 1 \\times 10 + 0 \\times 7 + 0 \\times 4 + 0 \\times 0 + 0 \\times 9 + 0 \\times 9 \\newline = 15 + 10 = 25 We can see that it's simply the summation of the query components where the stored vector bits are 1. And since all numbers are just bits, when expressed using a binary expansion, we can move things around a bit to take advantage of bitwise operations. The bits that remain set after the & will be the individual bits of the numbers that contribute to the dot product, in this case 15 and 10.
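Before the formal walk-through of the stored-bit side continues below, here is a compact Python sketch of the whole query-side trick just described: int4-quantize the centered query, pack it into four bit planes, then AND each plane with the stored bit vector, popcount, shift, and sum. The helper names are made up for illustration; this is not the Lucene implementation nor the simplified Java code from the original post.

```python
import numpy as np

def int4_quantize(centered: np.ndarray) -> np.ndarray:
    # Scalar-quantize the centered query vector into the range [0, 15].
    lo, hi = centered.min(), centered.max()
    return np.rint((centered - lo) * 15.0 / (hi - lo)).astype(np.uint8)

def to_bit_planes(q: np.ndarray) -> list:
    # planes[b] holds bit b of every dimension; dimension j lands in bit j of the byte.
    planes = []
    for b in range(4):
        byte = 0
        for j, value in enumerate(q):
            byte |= ((int(value) >> b) & 1) << j
        planes.append(byte)
    return planes

def bitwise_dot(stored_bits: int, planes: list) -> int:
    # AND each plane with the stored bits, count the surviving bits, and weight
    # each plane by its positional significance.
    return sum(bin(plane & stored_bits).count("1") << b for b, plane in enumerate(planes))

vc1 = np.array([-0.09, 0.19, 0.01, -0.10, -0.23, -0.38, -0.05, -0.03])  # v1 - c
stored_bits = 0b00000110                   # bin(v_c1') packed LSB-first, from earlier
q = int4_quantize(vc1)                     # [ 8 15 10  7  4  0  9  9]
planes = to_bit_planes(q)                  # [0b11001010, 0b00001110, 0b00011010, 0b11000111]
print(bitwise_dot(stored_bits, planes))    # 25, matching the expansion above
```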
\\text{Remember our originally stored vector: } storedVecBits = bin(v_{c1}') = [0, 1, 1, 0, 0, 0, 0, 0] \\newline \\text{Packing those bits into a byte gives } storedVecBits = 0b00000110 \\newline \\text{The query vector, int4 quantized: } Q(v_{c1}') = [8, 15, 10, 7, 4, 0, 9, 9] \\newline \\text{The binary values of each dimension: } bits(Q(v_{c1}')) = [0b1000, 0b1111, 0b1010, 0b0111, 0b0100, 0b0000, 0b1001, 0b1001] \\newline \\text{We shift and align the bits as shown in the visualization above: } qVecBits = align(bits(Q(v_{c1}'))) = [0b11001010, 0b00001110, 0b00011010, 0b11000111] \\newline qVecBits \\, \\& \\, storedVecBits = \\{ q \\, \\& \\, storedVecBits : q \\in qVecBits \\} = [0b00000010, 0b00000110, 0b00000010, 0b00000110] Now we can count the bits, shift, and sum back together. We can see that all the bits that are left over are the positional bits for 15 and 10.
= (bitCount(0b00000010) << 0) + (bitCount(0b00000110) << 1) + (bitCount(0b00000010) << 2) + (bitCount(0b00000110) << 3) \\newline = (1 << 0) + (2 << 1) + (1 << 2) + (2 << 3) = 25 Same answer as summing the dimensions directly. Here is the example, but in simplified Java code: Testing with BBQ: Alright, show me the numbers: We have done extensive testing with BBQ, both in Lucene and in Elasticsearch directly. Here are some of the results: Lucene benchmarking The benchmarking here is done over three datasets: E5-small , CohereV3 , and CohereV2 . Here, each element indicates recall@100 with oversampling by [1, 1.5, 2, 3, 4, 5]. E5-small This is 500k vectors for E5-small built from the quora dataset. quantization Index Time Force Merge time Mem Required bbq 161.84 42.37 57.6MB 4 bit 215.16 59.98 123.2MB 7 bit 267.13 89.99 219.6MB raw 249.26 77.81 793.5MB It's sort of mind blowing that we get recall of 74% with only a single bit of precision. Since the number of dimensions is smaller, the BBQ distance calculation isn't that much faster than our optimized int4. CohereV3 This is 1M 1024-dimensional vectors, using the CohereV3 model. quantization Index Time Force Merge time Mem Required bbq 338.97 342.61 208MB 4 bit 398.71 480.78 578MB 7 bit 437.63 744.12 1094MB raw 408.75 798.11 4162MB Here, 1-bit quantization and HNSW gets above 90% recall with only 3x oversampling. CohereV2 This is 1M 768-dimensional vectors, using the CohereV2 model and max inner product similarity. quantization Index Time Force Merge time Mem Required bbq 395.18 411.67 175.9MB 4 bit 463.43 573.63 439.7MB 7 bit 500.59 820.53 833.9MB raw 493.44 792.04 3132.8MB It's really interesting to see how much BBQ and int4 are in lock-step in this benchmark. It's neat that BBQ can get such high recall with inner-product similarity with only 3x oversampling. Larger scale Elasticsearch benchmarking As referenced in our larger scale vector search blog, we have a rally track for larger scale vector search benchmarking. This dataset has 138M floating point vectors of 1024 dimensions. Without any quantization, this would require around 535 GB of memory with HNSW. With better-binary-quantization, the estimate drops to around 19GB. For this test, we used a single 64GB node in Elastic Cloud with the following track parameters: Important note: if you want to replicate this, it will take significant time to download all the data and requires over 4TB of disk space. The reason for all the additional disk space is that this dataset also contains text fields, and you need disk space for both the compressed files and their inflated size. The parameters are as follows: k is the number of neighbors to search for num_candidates is the number of candidates used to explore per shard in HNSW rerank is the number of candidates to rerank, so we will gather that many values per shard, collect the total rerank size, and then rescore the top k values with the raw float32 vectors. For indexing time, it took around 12 hours.
And instead of showing all the results, here are three interesting ones: k-num_candidates-rerank Avg Nodes Visited % Of Best NDGC Recall Single Query Latency Multi-Client QPS knn-recall-10-100-50 36,079.801 90% 70% 18ms 451.596 knn-recall-10-20 15,915.211 78% 45% 9ms 1,134.649 knn-recall-10-1000-200 115,598.117 97% 90% 42.534ms 167.806 This shows the importance of balancing recall, oversampling, reranking and latency. Obviously, each needs to be tuned for your specific use case, but considering this was impossible before and now we have 138M vectors in a single node, it's pretty cool. Conclusion Thank you for taking a bit of time and reading about Better Binary Quantization. Originally being from Alabama and now living in South Carolina, BBQ already held a special place in my life. Now, I have more reason to love BBQ! We will release this as tech-preview in 8.16, or in serverless right now. To use this, just set your dense_vector.index_type as bbq_hnsw or bbq_flat in Elasticsearch. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What does the \"better\" in Better Binary Quantization mean? Indexing with Better Binary Quantization (BBQ) A quick side quest to talk about how we handle merging Asymmetric quantization, the interesting bits Testing with BBQ: Alright, show me the numbers: Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Better Binary Quantization (BBQ) in Lucene and Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/better-binary-quantization-lucene-elasticsearch","meta_description":"Learn about Elastic's BBQ (Better Binary Quantization) and how it works in Elasticsearch and Lucene."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Adding passage vector search to Lucene Here's how to add passage vectors to Lucene, the benefits of doing so and how existing Lucene structures can be used to create an efficient retrieval experience. Vector Database Lucene BT By: Benjamin Trent On August 24, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Vector search is a powerful tool in the information retrieval tool box. Using vectors alongside lexical search like BM25 is quickly becoming commonplace. But there are still a few pain points within vector search that need to be addressed. A major one is text embedding models and handling larger text input. Where lexical search like BM25 is already designed for long documents, text embedding models are not. All embedding models have limitations on the number of tokens they can embed. So, for longer text input it must be chunked into passages shorter than the model’s limit. Now instead of having one document with all its metadata, you have multiple passages and embeddings. And if you want to preserve your metadata, it must be added to every new document. Figure 1: Now instead of having a single piece of metadata indicating the first chapter of Little Women, you have to index that information data for every sentence. A way to address this is with Lucene's “join” functionality. This is an integral part of Elasticsearch’s nested field type. It makes it possible to have a top-level document with multiple nested documents, allowing you to search over nested documents and join back against their parent documents. This sounds perfect for multiple passages and vectors belonging to a single top-level document! This is all awesome! But, wait, Elasticsearch doesn’t support vectors in nested fields. Why not, and what needs to change? The (kNN) problem with parents and children The key issue is how Lucene can join back to the parent documents when searching child vector passages. Like with kNN pre-filtering versus post-filtering , when the joining occurs determines the result quality and quantity. If a user searches for the top four nearest parent documents (not passages) to a query vector , they usually expect four documents. But what if they are searching over child vector passages and all four of the nearest vectors are from the same parent document? This would end up returning just one parent document, which would be surprising. This same kind of issue occurs with post-filtering. Figure 2: Documents 3, 5, 10 are parent docs. 1, 2 belong to 3; 4 to 5; 6, 7, 8, 9 to 10. Let us search with query vector A, and the four nearest passage vectors are 6, 7, 8, 9. With “post-joining,” you only end up retrieving parent document 10. Figure 3: Vector “A” matching nearest all the children of 10. What can we do about this problem? One answer could be, “Just increase the number of vectors returned!” However, at scale, this is untenable. What if every parent has at least 100 children and you want the top 1,000 nearest neighbors? 
That means you have to search for at least 100,000 children! This gets out of hand quickly. So, what’s another solution? Pre-joining to the rescue The solution to the “post-joining” problem is “pre-joining.” Recently added changes to Lucene enable joining against the parent document while searching the HNSW graph! Like with kNN pre-filtering , this ensures that when asked to find the k nearest neighbors of a query vector, we can return not the k nearest passages as represented by dense vectors, but k nearest documents , as represented by their child passages that are most similar to the query vector. What does this actually look like in practice? Let’s assume we are searching the same nested documents as before: Figure 4: Documents 3, 5, 10 are parent docs. 1,2 belong to 3; 4 to 5; 6, 7, 8, 9 to 10. As we search and score documents, instead of tracking children, we track the parent documents and update their scores. Figure 5 shows a simple flow. For each child document visited, we get its score and then track it by its parent document ID. This way, as we search and score the vectors we only gather the parent IDs. This ensures diversification of results with no added complexity to the HNSW algorithm using already existing and powerful tools within Lucene. All this with only a single additional bit of memory required per vector stored. Figure 5: As we search the vectors, we score and collect the associated parent document. Only updating the score if it is more competitive than the previous. But, how is this efficient? Glad you asked! There are certain restrictions that provide some really nice short cuts. As you can tell from the previous examples, all parent document IDs are larger than child IDs. Additionally, parent documents do not contain vectors themselves, meaning children and parents are purely disjoint sets . This affords some nice optimizations via bit sets . A bit set provides an exceptionally fast structure for “tell me the next bit that is set.” For any child document, we can ask the bit set, “Hey, what's the number that is larger than me that is in the set?” Since the sets are disjoint, we know the next bit that is set is the parent document ID. Conclusion In this post, we explored both the challenges of supporting dense document retrieval at scale and our proposed solution using nested fields and joins in Lucene. This work in Lucene paves the way to more naturally storing and searching dense vectors of passages from long text in documents and an overall improvement in document modeling for vector search in Elasticsearch . This is a very exciting step forward for vector search in Elasticsearch! If you want to chat about this or anything else related to vector search in Elasticsearch, come join us in our Discuss forum . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. In this blog post, we may have used or referred to third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. 
There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch, ESRE, Elasticsearch Relevance Engine and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to The (kNN) problem with parents and children Pre-joining to the rescue Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Adding passage vector search to Lucene - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/adding-passage-vector-search-to-lucene","meta_description":"Here's how to add passage vectors to Lucene, the benefits of doing so and how existing Lucene structures can be used to create an efficient retrieval experience."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Apache Lucene 9.9, the fastest Lucene release ever Lucene 9.9 brings major speedups to query evaluation. Here are the performance improvements observed in nightly benchmarks & optimization resources. Lucene AG By: Adrien Grand On December 7, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. 
Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Apache Lucene development has always been vibrant, but the last few months have seen an especially high number of optimizations to query evaluation. There isn't one optimization that can be singled out, it's rather a combination of many improvements around mechanical sympathy and improved algorithms. What is especially interesting here is that these optimizations do not only benefit some very specific cases, they translate into actual speedups in Lucene's nightly benchmarks , which aim at tracking the performance of queries that are representative of the real world. Just hover on annotations to see where a speedup (or slowdown sometimes!) is coming from. By the way, special thanks to Mike McCandless for maintaining Lucene's nightly benchmarks on his own time and hardware for almost 13 years now! Key speedup benchmarks in Lucene Here are some speedups that nightly benchmarks observed between Lucene 9.6 (May 2023) and Lucene 9.9 (December 2023): AndHighHigh : 35% faster AndHighMed : 15% faster OrHighHigh : 60% faster OrHighMed : 38% faster CountAndHighHigh : 15% faster CountAndHighMed : 11% faster CountOrHighHigh : 145% faster CountOrHighMed : 155% faster TermDTSort : 24% faster TermTitleSort : 290% faster (not a typo!) TermMonthSort : 7% faster DayOfYearSort : 25% faster VectorSearch : 5% faster Lucene optimization resources In case you are curious about these changes, here are resources that describe some of the optimizations that we applied: Bringing speedups to top-k queries with many and/or high-frequency terms (annotation FK) More skipping with block-max MAXSCORE (annotation FU) Accelerating vector search with SIMD instructions Vector similarity computations FMA-style Lucene 9.9 was just released and is expected to be integrated into Elasticsearch 8.12, which should get released soon. Stay tuned! Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Key speedup benchmarks in Lucene Lucene optimization resources Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Apache Lucene 9.9, the fastest Lucene release ever - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/apache-lucene-9.9-search-speedups","meta_description":"Lucene 9.9 brings major speedups to query evaluation. Here are the performance improvements observed in nightly benchmarks & optimization resources."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series The Spotify Wrapped series Here's how to create your own Spotify Wrapped in Kibana and dive deep into your data. Part1 How To January 14, 2025 How to make your own Spotify Wrapped in Kibana Based on the downloadable Spotify personal history, we'll make a custom version of \"Spotify Wrapped\" with the top artists, songs, and trends over the year IF By: Iulia Feroli Part2 How To February 25, 2025 Spotify Wrapped part 2: Diving deeper into the data We will dive deeper into your Spotify data than ever before and explore connections you didn't even know existed. PK By: Philipp Kahr Part3 How To March 24, 2025 Anomaly detection population jobs: Spotify Wrapped, part 3 Anomaly detection can be a daunting task at first, but in this blog, we'll dive into it and figure out how the different jobs can help us find unusual patterns in our Spotify Wrapped data. PK By: Philipp Kahr Part4 How To April 1, 2025 Detecting relationships in data: Spotify Wrapped, part 4 Graphs are a powerful tool for detecting relationships in data. In this blog, we'll explore the relationships between artists and your music taste. PK By: Philipp Kahr Part5 How To April 10, 2025 Finding your best music friend with vectors: Spotify Wrapped, part 5 Understanding vectors has never been easier. Handcrafting vectors and figuring out various techniques to find your music friend in a heavily biased dataset. PK VB By: Philipp Kahr and Vincent Bosc Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"The Spotify Wrapped series - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/spotify-wrapped-kibana","meta_description":"Here's how to create your own Spotify Wrapped in Kibana and dive deep into your data."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Elasticsearch geospatial search This series covers how to use the new geospatial search features in ES|QL, including how to ingest geospatial data and how to use it in ES|QL queries. Part1 Python How To August 12, 2024 Elasticsearch geospatial search with ES|QL Geospatial search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. CT By: Craig Taverner Part2 How To October 25, 2024 Ingest geospatial data into Elasticsearch with Kibana for use in ES|QL How to use Kibana and the csv ingest processor to ingest geospatial data into Elasticsearch for use with search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. But to use these features, we need Geospatial data. CT By: Craig Taverner Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch geospatial search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/elasticsearch-geospatial-search","meta_description":"This series covers how to use the new geospatial search features in ES|QL, including how to ingest geospatial data and how to use it in ES|QL queries."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to choose between exact and approximate kNN search in Elasticsearch Learn more about exact and approximate kNN search in Elasticsearch, and when to use each one. Vector Database How To CD By: Carlos Delgado On May 20, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. What is kNN? Semantic search is a powerful tool for relevance ranking. It allows you to go beyond using just keywords, but consider the actual meaning of your documents and queries. Semantic search is based on vector search . In vector search, the documents we want to search have vector embeddings calculated for them. These embeddings are calculated using machine learning models, and returned as vectors that are stored alongside our document data. When a query is performed, the same machine learning model is used to calculate the embeddings for the query text. 
Semantic search consists of finding the closest results to the query, by comparing the query embeddings to the document embeddings. kNN, or k nearest neighbors , is a technique for obtaining the top k closest results to a specific embedding. There are two main approaches for calculating kNN for a query using embeddings: Exact and approximate. This post will help you: Understand what exact and approximate kNN search is How to prepare your index for these approaches How to decide which approach is best for your use case Exact kNN: Search everything One approach for calculating the closer results would be comparing all the existing document embeddings with the one for the query. This would ensure that we're getting the closest matches possible, as we will be comparing all of them. Our search results would be as accurate as possible, as we're considering our whole document corpus and comparing all our document embeddings with the query embeddings. Of course, comparing against all the documents has a downside: it takes time. We will be calculating the embeddings similarity one by one, using a similarity function, over all the documents. This also means that we will be scaling linearly - having twice the number of documents will potentially take twice as long. Exact search can be done on vector fields using script_score with a vector function for calculating similarity between vectors. Approximate kNN: A good estimate A different approach is to use an approximation instead of considering all the documents. For providing an efficient approximation to kNN, Elasticsearch and Lucene use Hierarchical Navigation Small Worlds HNSW . HNSW is a graph data structure that maintains links between elements that are close together, at different layers. Each layer contains elements that are connected, and are also connected to elements of the layer below it. Each layer contains more elements, with the bottom layer containing all the elements. Figure 1 - An example of a HNSW graph. The top layer contains the initial nodes to start the search. These initial nodes serve as entry points to the lower layers, each containing more nodes. The lower layer contains all nodes. Think of it as driving; there are highways, roads and streets. When driving on a highway you will see some exit signs that describe some high-level areas (like a town or a neighborhood). Then you get to a road that has directions for specific streets. Once you get to a street, you can reach a specific address and the ones that are in the same neighborhood. HNSW is similar to that, as it creates different levels of vector embeddings. It calculates the highway that is closer to the initial query, and chooses the exits that look more promising to keep looking for the closer addresses to the one we're looking for. This is great in terms of performance, as it doesn't have to consider all documents, but uses this multi-level approach to quickly find an approximation of the closer ones. But, it's an approximation. Not all nodes are interconnected, meaning that it's possible to overlook results that are closer to specific nodes as they might not be connected. The interconnection of the nodes depends on how the HNSW structure is created. How good HNSW is depends on several factors: How it is constructed. The HNSW construction process will consider a number of candidates to track as the closer ones to a specific node. 
Increasing the number of candidates to consider will produce a more precise structure, at the cost of spending more time creating it at indexing time. The ef_construction parameter in the dense vector index_options is used for this. How many candidates we're considering when searching. When looking for the closer results, the process will keep track of a number of candidates. The bigger this number is, the more precise it will be, and the slower our search will be. The num_candidates in kNN parameters controls this behavior. How many segments we're searching. Each segment has a HNSW graph that needs to be searched for, and its results combined with the other segment graphs. Having fewer segments will mean searching fewer graphs (so it'll be faster), but will have a less diverse set of results (so it will be less precise). Overall, HNSW offers a good tradeoff between performance and recall, and allows fine tuning both at indexing and query side. Searching using HNSW can be done using the kNN search section in most of the situations. Using a kNN query is also possible for more advanced use cases, like: Combining kNN with other queries (as part of bool queries, or pinned queries) Using function_score to fine tune the scoring Improve aggregations and field collapse diversity You can check about kNN query and the differences to the kNN search section in this post . We'll dive into when you'll want to use this method versus the others below. Indexing for exact and approximate search dense_vector field type There are two main indexing types for dense_vector fields you can choose from for storing your embeddings: flat types (including flat and int8_flat ) store the raw vectors, without adding HNSW data structures. dense_vectors that use flat indexing type will always use exact kNN - the kNN query will actually perform an exact query instead of an approximate one. HNSW types (including hnsw and int8_hnsw ) create the HNSW data structure, allowing approximate kNN search to be used. Does it mean that you can't use exact kNN with HNSW field types? Not really! You can use both exact kNN via a script_score query , or use approximate kNN via the kNN section and the kNN query . This allows more flexibility depending on your search use case. Using HNSW field types means that the HNSW graph structure needs to be built, and that takes time, memory and disk space. If you'll just be using exact search, you can use the flat vector field type. This ensures that your embeddings are indexed optimally and use less space. Remember to always avoid storing your embeddings in _source in any case, to reduce your storage needs. Quantization Using quantization , either flat (int8_flat) or HNSW (int8_hnsw) types of indexing will help you reduce your embeddings size, so you will be able to use less memory and disk storage for holding your embeddings information. As search performance relies on your embeddings fitting as much as possible in memory, you should always look for ways of reducing your data if possible. Using quantization is a trade off between memory and recall. How do I choose between exact and approximate kNN search? There's no one-size-fits-all answer here. You need to consider a number of factors, and experiment, in order to get to the optimal balance between performance and accuracy: Data size Searching everything is not something you should avoid at all costs. Depending on your data size (in terms of number of documents and embedding dimensions), it might make sense to do an exact kNN search. 
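As a concrete point of reference, here is a minimal sketch of the two query styles discussed above, using the Elasticsearch Python client. The index name, field name, and vector values are placeholders, and the field is assumed to be a dense_vector with an embedding per document.

```python
# Exact kNN via script_score vs. approximate kNN via the knn option (a sketch).
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
query_vector = [0.12, -0.03, 0.54]   # would come from the same embedding model as the docs

# Exact kNN: score every (optionally pre-filtered) document with a vector function.
exact = es.search(
    index="my-index",
    query={
        "script_score": {
            "query": {"match_all": {}},   # replace with a filter to shrink the candidate set
            "script": {
                "source": "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    },
)

# Approximate kNN: traverse the HNSW graph, tracking num_candidates per shard.
approx = es.search(
    index="my-index",
    knn={
        "field": "embedding",
        "query_vector": query_vector,
        "k": 10,
        "num_candidates": 100,
        # "filter": {...}  # a kNN pre-filter, applied while traversing the graph
    },
)
```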
As a rule of thumb, having less than 10 thousand documents to search could be an indication that exact search should be used. Keep in mind that the number of documents to search can be filtered in advance, so the effective number of documents to search can be restricted by the filters applied. Approximate search scales better in terms of the number of documents, so it should be the way to go if you're having a big number of documents to search, or expect it to increase significantly. Figure 2 - an example run of exact and approximate kNN using 768 dimensions vectors from the so_vector rally track . The example demonstrates the linear runtime of exact kNN vs the logarithmic runtime of HNSW searches. Filtering Filtering is important, as it reduces the number of documents to consider for searching. This needs to be taken into account when deciding on using exact vs approximate. You can use query filters for reducing the number of documents to consider, both for exact and approximate search. However, approximate search takes a different approach for filtering. When doing approximate searches using HNSW, query filters will be applied after the top k results have been retrieved. That's why using a query filter along with a kNN query is referred to as a post-filter for kNN . Figure 3 - Post-filtering in kNN search. The problem with using post-filters in kNN is that the filter is being applied after we have gathered the top k results. This means that we can end up with less than k results, as we need to remove the elements that don't pass the filter from the top k results that we've already retrieved from the HNSW graph. Fortunately, there is another approach to use with kNN, and is specifying a filter in the kNN query itself. This filter applies to the graph elements as the results are being gathered traversing the HNSW graph, instead of applying it afterwards. This ensures that the top k elements are returned, as the graph will be traversed - skipping the elements that don't pass the filter - until we get the top k elements. Figure 4 - Pre-filtering in kNN search. This specific kNN query filter is called a kNN pre-filter to specify that it is applied before retrieving results, as opposed to applying it afterwards. That's why, in the context of using a kNN query, regular query filters are referred to as post-filters. Using a kNN pre-filter affects approximate search performance, as we need to consider more elements while searching in the HNSW graph - we will be discarding the elements that don't pass the filter, so we need to look for more elements on each search to retrieve the same number of results. Future enhancements for kNN There are some improvements that are coming soon that will help with exact and approximate kNN. Elasticsearch will add the possibility to upgrade a dense_vector type from flat to HNSW. This means that you'll be able to start using a flat vector type for exact kNN, and eventually start using HNSW when you need the scale. Your segments will be searched transparently for you when using approximate kNN, and will be transformed automatically to HNSW when they are merged together. A new exact kNN query will be added so a simple query will be used to do exact kNN for both flat and HNSW fields, instead of relying on a script score query. This will make exact kNN more straightforward. Conclusion So, should you use approximate or exact kNN on your documents? Check the following: How many documents? 
Less than 10 thousand (after applying filters) would probably be a good case for exact. Are your searches using filters? This impacts how many documents will be searched. If you need to use approximate kNN, remember to use kNN pre-filters to get more results at the expense of performance. You can compare the performance of both approaches by indexing using a HNSW dense_vector, and comparing kNN search against a script_score for doing exact kNN. This allows comparing both methods using the same field type (just remember to change your dense_vector field type to flat in case you decide to go for exact search!) Happy searching! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to What is kNN? Exact kNN: Search everything Approximate kNN: A good estimate Indexing for exact and approximate search dense_vector field type Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to choose between exact and approximate kNN search in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/knn-exact-vs-approximate-search","meta_description":"Learn more about exact and approximate kNN search in Elasticsearch, and when to use each one."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog From ES|QL to PHP objects Learn how to execute and manage ES|QL queries in PHP. Follow this guide to map ES|QL results to a PHP object or custom class. ES|QL PHP How To EZ By: Enrico Zimuel On April 8, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! 
Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Starting from elasticsearch-php v8.13.0 you can execute ES|QL queries and map the result to a PHP object of stdClass or a custom class. ES|QL ES|QL is a new Elasticsearch Query Language introduced in Elasticsearch 8.11.0. Right now, it is available in technical preview. It provides a powerful way to filter, transform, and analyze data stored in Elasticsearch. It makes use of \"pipes\" ( | ) to manipulate and transform data in a step-by-step fashion. This approach allows users to compose a series of operations, where the output of one operation becomes the input for the next, enabling complex data transformations and analysis. For instance, the following query returns the first 3 documents (rows) of the sample_data index: Use case: ES|QL features in the official PHP client To illustrate the ES|QL features developed in the official PHP client, we stored in Elasticsearch a CSV file of 81,828 books (54.4 MB) including the following information: We extracted this list from the public available Amazon Books Reviews dataset . We created a books index with the following Elasticsearch mappings: The rating value is the average of the ranking reviews taken from the Books_rating.csv file of 2.9 GB. Here you can find the PHP script that we used to bulk import all the books in Elasticsearch. The bulk operation took 7 sec and 28 MB RAM using PHP 8.2.17. With the proposed mapping the index size in Elasticsearch is about 62 MB. Map ES|QL results to a PHP object or custom class We can execute ES|QL query in PHP using the esql()->query() endpoint. The result of this query is a a table data structure. This is expressed in JSON using the columns and values fields. In the columns field we have the name and type definition. Here is an example of ES|QL query to retrieve the top-10 books written by Stephen King ordered by the user ranking reviews: The JSON result from Elasticsearch looks as follows: In this example we have 6 properties (author, description, publisher, rating, title, year) related to a book and 10 results, all books by Stephen King. A list of all the supported types in ES|QL is reported here . The $result response object can be accessed as an array, a string or as an object (see here for more information). Using the object interface, we can access the values using properties and indexes. For instance, $result->values[0][4] returns the title (4) of the first book (0) in the list, $result->values[1][3] returns the rank score (3) of the second book (1), etc. Remember, the index of an array in PHP starts from zero. This interface can be good enough for some use cases but most of the time we would like to have an array of objects as result. To map the result into an array of objects we can use the new mapTo() feature of elasticsearch-php. This function is available directly in the Elasticsearch response object . That means you can access it as follows: If you have a custom Book class, you can map the result using it, as follows: If your class has other properties in addition to the ones included in the ES|QL result, this will work as well. The mapTo() function will use only the properties returned as columns of the ES|QL result. You can download all the examples reported in this article here . 
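The columns and values layout described above is not specific to PHP. Purely as an illustration for readers following along in another language, a rough equivalent of the query-and-map step with the Elasticsearch Python client could look like the sketch below; the ES|QL string and the books index fields are assumptions that mirror the example, not the exact query from the post.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # hypothetical deployment

# Same idea as the PHP example: the top 10 Stephen King books by rating.
esql = """
FROM books
| WHERE author == "Stephen King"
| SORT rating DESC
| LIMIT 10
"""

resp = client.esql.query(query=esql)

# The response is a table: column definitions plus rows of values.
columns = [col["name"] for col in resp["columns"]]

# Rough equivalent of mapTo(): turn each row into a dict keyed by column name.
books = [dict(zip(columns, row)) for row in resp["values"]]

for book in books:
    print(book["title"], book["rating"])
```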
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL Use case: ES|QL features in the official PHP client Map ES|QL results to a PHP object or custom class Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"From ES|QL to PHP objects - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-php-map-object-class","meta_description":"Learn how to execute and manage ES|QL queries in PHP. Follow this guide to map ES|QL results to a PHP object or custom class."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Perform vector search in Elasticsearch with the Elasticsearch Go client Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example. Vector Database How To CR LS By: Carly Richmond and Laurent Saint-Félix On November 1, 2023 Part of Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Building software in any programming language, including Go, is committing to a lifetime of learning. Through her university and working career, Carly has dabbled in many programming languages and technologies, including the latest and greatest implementations of vector search. But that wasn't enough! So recently Carly started playing with Go, too. 
Just like animals, programming languages, and your friendly author, search has undergone an evolution of different practices that can be difficult to decide between for your own search use case. In this blog, we'll share an overview of vector search along with examples of each approach using Elasticsearch and the Elasticsearch Go client . These examples will show you how to find gophers and determine what they eat using vector search in Elasticsearch and Go. Prerequisites To follow with this example, ensure the following prerequisites are met: Installation of Go version 1.21 or later Creation of your own Go repo with the Creation of your own Elasticsearch cluster, populated with a set of rodent-based pages, including for our friendly Gopher , from Wikipedia: Connecting to Elasticsearch In our examples, we shall make use of the Typed API offered by the Go client. Establishing a secure connection for any query requires configuring the client using either: Cloud ID and API key if making use of Elastic Cloud. Cluster URL, username, password and the certificate. Connecting to our cluster located on Elastic Cloud would look like this: The client connection can then be used for vector search, as shown in subsequent sections. Vector search Vector search attempts to solve this problem by converting the search problem into a mathematical comparison using vectors. The document embedding process has an additional stage of converting the document using a model into a dense vector representation, or simply a stream of numbers. The advantage of this approach is that you can search non-text documents such as images and audio by translating them into a vector alongside a query. In simple terms, vector search is a set of vector distance calculations. In the below illustration, the vector representation of our query Go Gopher is compared against the documents in the vector space, and the closest results (denoted by constant k ) are returned: Depending on the approach used to generate the embeddings for your documents, there are two different ways to find out what gophers eat. Approach 1: Bring your own model With a Platinum license, it's possible to generate the embeddings within Elasticsearch by uploading the model and using the inference API. There are six steps involved in setting up the model: Select a PyTorch model to upload from a model repository. For this example, we're using the sentence-transformers/msmarco-MiniLM-L-12-v3 from Hugging Face to generate the embeddings. Load the model into Elastic using the Eland Machine Learning client for Python using the credentials for our Elasticsearch cluster and task type text_embeddings . If you don't have Eland installed, you can run the import step using Docker , as shown below: Once uploaded, quickly test the model sentence-transformers__msmarco-minilm-l-12-v3 with a sample document to ensure the embeddings are generated as expected: Create an ingest pipeline containing an inference processor. 
This will allow the vector representation to be generated using the uploaded model: Create a new index containing the field text_embedding.predicted_value of type dense_vector to store the vector embeddings generated for each document: Reindex the documents using the newly created ingest pipeline to generate the text embeddings as the additional field text_embedding.predicted_value on each document: Now we can use the Knn option on the same search API using the new index vector-search-rodents , as shown in the below example: Converting the JSON result object via unmarshalling is done in the exact same way as the keyword search example. Constants K and NumCandidates allow us to configure the number of neighbor documents to return and the number of candidates to consider per shard. Note that increasing the number of candidates increases the accuracy of results but leads to a longer-running query as more comparisons are performed. When the code is executed using the query What do Gophers eat? , the results returned look similar to the below, highlighting that the Gopher article contains the information requested unlike the prior keyword search: Approach 2: Hugging Face inference API Another option is to generate these same embeddings outside of Elasticsearch and ingest them as part of your document. As this option does not make use of an Elasticsearch machine learning node, it can be done on the free tier. Hugging Face exposes a free-to-use, rate-limited inference API that, with an account and API token, can be used to generate the same embeddings manually for experimentation and prototyping to help you get started. It is not recommended for production use. Invoking your own models locally to generate embeddings or using the paid API can also be done using a similar approach. In the below function GetTextEmbeddingForQuery we use the inference API against our query string to generate the vector returned from a POST request to the endpoint: The resulting vector, of type []float32 is then passed as a QueryVector instead of using the QueryVectorBuilder option to leverage the model previously uploaded to Elastic. Note that the K and NumCandidates options remain the same irrespective of the two options and that the same results are generated as we are using the same model to generate the embeddings Conclusion Here we've discussed how to perform vector search in Elasticsearch using the Elasticsearch Go client . Check out the GitHub repo for all the code in this series. Follow on to part 3 to gain an overview of combining vector search with the keyword search capabilities covered in part one in Go. Until then, happy gopher hunting! Resources Elasticsearch Guide Elasticsearch Go client What is vector search? | Elastic Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Connecting to Elasticsearch Vector search Approach 1: Bring your own model Approach 2: Hugging Face inference API Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Perform vector search in Elasticsearch with the Elasticsearch Go client - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/perform-vector-search-with-the-elasticsearch-go-client","meta_description":"Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing LangChain4j to simplify LLM integration into Java applications LangChain4j (LangChain for Java) is a powerful toolset to build your RAG application in plain Java. Integrations Java How To DP By: David Pilato On September 23, 2024 Part of Series Introducing LangChain4j: Building RAG apps in plain Java Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The LangChain4j framework was created in 2023 with this target : The goal of LangChain4j is to simplify integrating LLMs into Java applications. LangChain4j is providing a standard way to: create embeddings (vectors) from a given content, let say a text for example store embeddings in an embedding store search for similar vectors in the embedding store discuss with LLMs use a chat memory to remember the context of a discussion with an LLM This list is not exhaustive and the LangChain4j community is always implementing new features. This post will cover the first main parts of the framework. Adding LangChain4j OpenAI to our project Like in all Java projects, it's just a matter of dependencies. Here we will be using Maven but the same could be achieved with any other dependency manager. 
As a first step to the project we want to build here, we will be using OpenAI so we just need to add the langchain4j-open-ai artifact: For the rest of the code we will be using either our own API key, which you can get by registering for an account with OpenAI , or the one provided by LangChain4j project for demo purposes only: We can now create an instance of our ChatLanguageModel: And finally we can ask a simple question and get back the answer: The given answer might be something like: If you'd like to run this code, please check out the Step1AiChatTest.java class. Providing more context with langchain4j Let's add the langchain4j artifact: This one is providing a toolset which can help us build a more advanced LLM integration to build our assistant. Here we will just create an Assistant interface which provides the chat method which will be calling automagically the ChatLanguageModel we defined earlier: We just have to ask LangChain4j AiServices class to build an instance for us: And then call the chat(String) method: This is having the same behavior as before. So why did we change the code? In the first place, it's more elegant but more than that, you can now give some instructions to the LLM using simple annotations: This is now giving: If you'd like to run this code, please check out the Step2AssistantTest.java class. Switching to another LLM: langchain4j-ollama We can use the great Ollama project . It helps to run a LLM locally on your machine. Let's add the langchain4j-ollama artifact: As we are running the sample code using tests, let's add Testcontainers to our project: We can now start/stop Docker containers: We \"just\" have to change the model object to become an OllamaChatModel instead of the OpenAiChatModel we used previously: Note that it could take some time to pull the image with its model, but after a while, you could get the answer: Better with memory If we ask multiple questions, by default the system won't remember the previous questions and answers. So if we ask after the first question \"When was he born?\", our application will answer: Which is nonsense. Instead, we should use Chat Memory : Running the same questions now gives a meaningful answer: Conclusion In the next post , we will discover how we can ask questions to our private dataset using Elasticsearch as the embedding store. That will give us a way to transform our application search to the next level. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Adding LangChain4j OpenAI to our project Providing more context with langchain4j Switching to another LLM: langchain4j-ollama Better with memory Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing LangChain4j to simplify LLM integration into Java applications - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/langchain4j-llm-integration-introduction","meta_description":"LangChain4j (LangChain for Java) is a powerful toolset to build your RAG app in plain Java. Here's how to add LangChain4j to your project and more."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog From ES|QL to Pandas dataframes in Python Learn how to export ES|QL queries as Pandas dataframes in Python through practical examples. Python ES|QL How To QP By: Quentin Pradet On March 11, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Update: When we published this article in March 2024, Elasticsearch did not yet support Apache Arrow streaming format. This is possible now, see \"From ES|QL to native Pandas dataframes in Python\" for more details. The Elasticsearch Query Language (ES|QL) provides a powerful way to filter, transform, and analyze data stored in Elasticsearch. Designed to be easy to learn and use, it is a perfect fit for data scientists familiar with Pandas and other dataframe-based libraries. Indeed, ES|QL queries produce tables with named columns, which is the definition of dataframes! This blog explains how to export ES|QL queries as Pandas dataframes in Python. ES|QL to Pandas dataframes in Python Importing test data First, let's import some test data. We will be using the employees sample data and mappings . The easiest way to load this dataset is to run those two Elasticsearch API requests in the Kibana Console . Converting dataset to a Pandas DataFrame object OK, with that out of the way, let's convert the full employees dataset to a Pandas DataFrame object using the ES|QL CSV export: Even though this dataset only contains 100 records, we use a LIMIT command to avoid ES|QL warning us about potentially missing records. This prints the following dataframe: This means you can now analyze the data with Pandas. 
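For reference, the CSV export described above fits in a few lines. This sketch assumes a local deployment and follows the approach the post describes, with client.esql.query returning CSV that pandas reads directly.

```python
import io

import pandas as pd
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # hypothetical deployment

# Export the employees dataset as CSV through ES|QL; the LIMIT keeps ES|QL
# from warning about potentially missing rows on this small sample.
response = client.esql.query(
    query="FROM employees | LIMIT 500",
    format="csv",
)

# The first CSV row holds the column names, so it loads straight into a DataFrame.
df = pd.read_csv(io.StringIO(response.body))
print(df)
```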
But you can also continue massaging the data using ES|QL, which is particuarly useful when queries return more than 10,000 rows, the current maximum number of rows that ES|QL queries can return. Analyzing the data with Pandas In the next example, we're counting how many employees are speaking a given language by using STATS ... BY (not unlike GROUP BY in SQL). And then we sort the result with the languages column using SORT : Note that we've used the dtype parameter of pd.read_csv() here, which is useful when the type inferred by Pandas is not enough. The above code prints the following: 21 employees speak 5 languages, wow! Finally, suppose that end users of your code control the minimum number of languages spoken. You could format the query directly in Python, but it would allow an attacker to perform an ES|QL injection! Instead, use the built-in parameters support of the ES|QL REST API: which prints the following: Conclusion As you can see, ES|QL and Pandas play nicely together. However, CSV is not the ideal format as it requires explicit type declarations and doesn't handle well some of the more elaborate results that ES|QL can produce, such as nested arrays and objects. For this, we are working on adding native support for Apache Arrow dataframes in ES|QL, which will make all this transparent and bring significant performance improvements. Additional resources If you want to learn more about ES|QL, the ES|QL documentation is the best place to start. You can also check out this other Python example using Boston Celtics data . To know more about the Python Elasticsearch client itself, you can refer to the documentation , ask a question on Discuss with the language-clients tag or open a new issue if you found a bug or have a feature request. Thank you! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL to Pandas dataframes in Python Importing test data Converting dataset to a Pandas DataFrame object Analyzing the data with Pandas Conclusion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"From ES|QL to Pandas dataframes in Python - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-pandas-dataframes-python","meta_description":"Learn how to export ES|QL queries as Pandas dataframes in Python through practical examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Lexical and semantic search with Elasticsearch In this blog, we'll explore various approaches to retrieving information using Elasticsearch, focusing on lexical and semantic search. Vector Database Python PP By: Priscilla Parodi On October 3, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Search is the process of locating the most relevant information based on your search query or combined queries and relevant search results are documents that best match these queries. Although there are several challenges and methods associated with search, the ultimate goal remains the same, to find the best possible answer to your question . Considering this goal, in this blog post, we will explore different approaches to retrieving information using Elasticsearch, with a specific focus on text search: lexical and semantic search. Prerequisites To accomplish this, we will provide Python examples that demonstrate various search scenarios on a dataset generated to simulate e-commerce product information. This dataset contains over 2,500 products, each with a description. These products are categorized into 76 distinct product categories, with each category containing a varying number of products, as shown below: Treemap visualization - top 22 values of category.keyword (product categories) For the setup you will need: Python 3.6 or later The Elastic Python client Elastic 8.8 deployment or later, with 8GB memory machine learning node The Elastic Learned Sparse EncodeR model that comes pre-loaded into Elastic installed and started on your deployment We will be using Elastic Cloud, a free trial is available . Besides the search queries provided in this blog post, a Python notebook will guide you through the following processes: Establish a connection to our Elastic deployment using the Python client Load a text embedding model into the Elasticsearch cluster Create an index with mappings for indexing feature vectors and dense vectors. Create an ingest pipeline with inference processors for text embedding and text expansion Lexical search - sparse retrieval The classic way documents are ranked for relevance by Elasticsearch based on a text query uses the Lucene implementation of the BM25 model, a sparse model for lexical search . This method follows the traditional approach for text search, looking for exact term matches. To make this search possible, Elasticsearch converts text field data into a searchable format by performing text analysis. Text analysis is performed by an analyzer , a set of rules to govern the process of extracting relevant tokens for searching. 
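To see what an analyzer actually produces, the _analyze API can be called directly. A small sketch with the Python client follows; the sample sentence is arbitrary, and the trailing comment shows what the standard analyzer would emit for it.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # hypothetical deployment

# Ask the default (standard) analyzer to tokenize a sample string.
resp = client.indices.analyze(
    analyzer="standard",
    text="Comfortable furniture for a large balcony",
)

print([token["token"] for token in resp["tokens"]])
# ['comfortable', 'furniture', 'for', 'a', 'large', 'balcony']
```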
An analyzer must have exactly one tokenizer . The tokenizer receives a stream of characters and breaks it up into individual tokens (usually individual words), like in the example below: String tokenization for lexical search Output In this example we are using the default analyzer, the standard analyzer, which works well for most use cases as it provides English grammar based tokenization. Tokenization enables matching on individual terms, but each token is still matched literally. If you want to personalize your search experience you can choose a different built-in analyzer . Example, by updating the code to use the stop analyzer it will break the text into tokens at any non-letter character with support for removing stop words. Output When the built-in analyzers do not fulfill your needs, you can create a custom analyzer , which uses the appropriate combination of zero or more character filters , a tokenizer and zero or more token filters . In the above example that combines a tokenizer and token filters, the text will be lowercased by the lowercase filter before being processed by the synonyms token filter . Lexical matching BM25 will measure the relevance of documents to a given search query based on the frequency of terms and its importance. The code below performs a match query, searching for up to two documents considering \"description\" field values from the \"ecommerce-search\" index and the search query \" Comfortable furniture for a large balcony \" . Refining the criteria for a document to be considered a match for this query can improve the precision. However, more specific results come at the cost of a lower tolerance for variations. Output By analyzing the output, the most relevant result is the \" Barbie Dreamhouse \" product, in the \" Toys \" category, and its description is highly relevant as it includes the terms \" furniture \", \" large\" and \"balcony \", this is the only product with 3 terms in the description that match the search query, the product is also the only one with the term \"balcony\" in the description. The second most relevant product is a \" Comfortable Rocking Chair \" categorized as \" Indoor Furniture \" and its description includes the terms \" comfortable \" and \" furniture \". Only 3 products in the dataset match at least 2 terms of this search query, this product is one of them. \"Comfortable\" appears in the description of 105 products and \"furniture\" in the description of 4 products with 4 different categories: Toys , Indoor Furniture, Outdoor Furniture and 'Dog and Cat Supplies & Toys'. As you could see, the most relevant product considering the query is a toy and the second most relevant product is indoor furniture. If you want detailed information about the score computation to know why these documents are a match, you can set the explain __query parameter to true. Despite both results being the most relevant ones, considering both the number of documents and the occurrence of terms in this dataset, the intention behind the query \" Comfortable furniture for a large balcony \" is to search for furniture for an actual large balcony, excluding among others, toys and indoor furniture. Lexical search is relatively simple and fast , but it has limitations since it is not always possible to know all the possible terms and synonyms without necessarily knowing the user's intention and queries. A common phenomenon in the usage of natural language is vocabulary mismatch . 
Research shows that, on average, 80% of the time different people (experts in the same field) will name the same thing differently. These limitations motivate us to look for other scoring models that incorporate semantic knowledge. Transformer-based models, which excel at processing sequential input tokens like natural language, capture the underlying meaning of your search by considering mathematical representations of both documents and queries. This allows for a dense, context aware vector representation of text, powering Semantic Search , a refined way to find relevant content. Semantic search - dense retrieval In this context, after converting your data into meaningful vector values, k-nearest neighbor (kNN) search algorithm is utilized to find vector representations in a dataset that are most similar to a query vector. Elasticsearch supports two methods for kNN search, exact brute-force kNN and approximate kNN , also known as ANN. Brute-force kNN guarantees accurate results but doesn't scale well with large datasets. Approximate kNN efficiently finds approximate nearest neighbors by sacrificing some accuracy for improved performance. With Lucene's support for kNN search and dense vector indexes, Elasticsearch takes advantage of the Hierarchical Navigable Small World (HNSW) algorithm, which demonstrates strong search performance across a variety of ann-benchmark datasets . An approximate kNN search can be performed in Python using the below example code. Semantic search with approximate kNN This code block uses Elasticsearch's kNN to return up to two products with a description similar to the vectorized query (query_vector_build) of \" Comfortable furniture for a large balcony \" considering the embeddings of the “ description ” field in the products dataset. The products embeddings were previously generated in an ingest pipeline with an inference processor containing the \" all-mpnet-base-v2 \" text embedding model to infer against data that was being ingested in the pipeline. This model was chosen based on the evaluation of pretrained models using \" sentence_transformers.evaluation \" where different classes are used to assess a model during training. The \"all-mpnet-base-v2\" model demonstrated the best average performance according to the Sentence-Transformers ranking and also secured a favorable position on the Massive Text Embedding Benchmark (MTEB) Leaderboard. The model pre-trained microsoft/mpnet-base model and fine-tuned on a 1B sentence pairs dataset, it maps sentences to a 768 dimensional dense vector space. Alternatively, there are many other models available that can be used, especially those fine-tuned for your domain-specific data. Output The output may vary based on the chosen model, filters and approximate kNN tune . The kNN search results are both in the \" Outdoor Furniture \" category, even though the word \" outdoor \" was not explicitly mentioned as part of the query, which highlights the importance of semantics understanding in the context. 
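For reference, a request along those lines can be expressed with the Python client as shown below. The field name, deployed model ID, and number of candidates are assumptions made for illustration; the post describes the ingredients rather than prescribing this exact code.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # hypothetical deployment

resp = client.search(
    index="ecommerce-search",
    size=2,
    knn={
        # Hypothetical dense_vector field written by the ingest pipeline.
        "field": "description_vector.predicted_value",
        "k": 2,
        "num_candidates": 50,
        # Let Elasticsearch embed the query text with the deployed model so the
        # query lands in the same vector space as the indexed descriptions.
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "sentence-transformers__all-mpnet-base-v2",
                "model_text": "Comfortable furniture for a large balcony",
            }
        },
    },
    source=["title", "description", "category"],
)

for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```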
Dense vector search offers several advantages: Enabling semantic search Scalability to handle very large datasets Flexibility to handle a wide range of data types However, dense vector search also comes with its own challenges : Selecting the right embedding model for your use case Once a model is chosen, fine-tuning the model to optimize performance on a domain-specific dataset might be necessary, a process that demands the involvement of domain experts Additionally, indexing high-dimensional vectors can be computationally expensive Semantic search - learned sparse retrieval Let’s explore an alternative approach: learned sparse retrieval, another way to perform semantic search. As a sparse model, it utilizes Elasticsearch's Lucene-based inverted index, which benefits from decades of optimizations. However, this approach goes beyond simply adding synonyms with lexical scoring functions like BM25. Instead, it incorporates learned associations using a deeper language-scale knowledge to optimize for relevance. By expanding search queries to include relevant terms that are not present in the original query, the Elastic Learned Sparse Encoder improves sparse vector embeddings , as you can see in the example below. Sparse vector search with Elastic Learned Sparse Encoder Output The results in this case include the \" Garden Furniture \" category, which offers products quite similar to \" Outdoor Furniture \". By analyzing \"ml.tokens\", the \"rank_features\" field containing Learned Sparse Retrieval generated tokens, it becomes apparent that among the various tokens generated there are terms that, while not part of the search query, are still relevant in meaning, such as \" relax \" (comfortable), \" sofa \" (furniture) and \" outdoor \" (balcony). The image below highlights some of these terms alongside the query, both with and without term expansion. As observed, this model provides a context-aware search and helps mitigate the vocabulary mismatch problem while providing more interpretable results. It can even outperform dense vector models when no domain-specific retraining is applied. Hybrid search: relevant results by combining lexical and semantic search When it comes to search, there is no universal solution. Each of these retrieval methods has its strengths but also its challenges. Depending on the use case, the best option may change. Often the best results across retrieval methods can be complementary. Hence, to improve relevance, we’ll look at combining the strengths of each method. There are multiple ways to implement a hybrid search , including linear combination, giving a weight to each score and reciprocal rank fusion (RRF), where specifying a weight is not necessary. Elasticsearch: best of both worlds with lexical and semantic search In this code, we performed a hybrid search with two queries having the value \" A dining table and comfortable chairs for a large balcony \". Instead of using \" furniture \" as a search term, we are specifying what we are looking for, and both searches are considering the same field values, \"description\". The ranking is determined by a linear combination with equal weight for the BM25 and ELSER scores. Output In the code below, we will use the same value for the query, but combine the scores from BM25 (query parameter) and kNN (knn parameter) using the reciprocal rank fusion method to combine and rank the documents. RRF functionality is in technical preview. The syntax will likely change before GA. 
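A sketch of such a request with the Python client is shown below, using the rank parameter that carried RRF while it was in technical preview; newer releases express the same idea through the retriever syntax, so treat the shape of this payload as illustrative. The index, field, and model names are the same assumptions as in the earlier kNN sketch.

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # hypothetical deployment

query_text = "A dining table and comfortable chairs for a large balcony"

resp = client.search(
    index="ecommerce-search",
    size=2,
    # Lexical side: BM25 over the raw description text.
    query={"match": {"description": query_text}},
    # Semantic side: approximate kNN over the description embeddings.
    knn={
        "field": "description_vector.predicted_value",
        "k": 2,
        "num_candidates": 50,
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "sentence-transformers__all-mpnet-base-v2",
                "model_text": query_text,
            }
        },
    },
    # Blend the two ranked lists with reciprocal rank fusion, no manual weights.
    rank={"rrf": {}},
)

for hit in resp["hits"]["hits"]:
    print(hit["_source"]["title"])
```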
Output Here we could also use different fields and values; some of these examples are available in the Python notebook . As you can see, with Elasticsearch you have the best of both worlds: the traditional lexical search and vector search, whether sparse or dense, to reach your goal and find the best possible answer to your question. If you want to continue learning about the approaches mentioned here, these blogs can be useful: Improving information retrieval in the Elastic Stack: Hybrid retrieval Vector search in Elasticsearch: The rationale behind the design How to get the best of lexical and AI-powered search with Elastic’s vector database Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model Elasticsearch provides a vector database, along with all the tools you need to build vector search: Elasticsearch vector database Vector search use cases with Elastic Conclusion In this blog post, we explored various approaches to retrieving information using Elasticsearch, focusing specifically on text, lexical and semantic search. To demonstrate this, we provided Python examples showcasing different search scenarios using a dataset containing e-commerce product information. We reviewed the classic lexical search with BM25 and discussed its benefits and challenges, such as vocabulary mismatch. We emphasized the importance of incorporating semantic knowledge to overcome this issue. Additionally, we discussed dense vector search, which enables semantic search, and covered the challenges associated with this retrieval method, including the computational cost when indexing high-dimensional vectors. On the other hand, we mentioned that sparse vectors compress exceptionally well. Thus, we discussed Elastic's Learned Sparse Encoder, which expands search queries to include relevant terms not present in the original query. There is no one-size-fits-all solution when it comes to search. Each retrieval method has its strengths and challenges. Therefore, we also discussed the concept of hybrid search. As you could see, with Elasticsearch, you can have the best of both worlds: traditional lexical search and vector search! Ready to get started? Check the available Python notebook and begin a free trial of Elastic Cloud . Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. 
US By: Ugo Sangiorgi Jump to Prerequisites Lexical search - sparse retrieval String tokenization for lexical search Lexical matching Semantic search - dense retrieval Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Lexical and semantic search with Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/lexical-and-semantic-search-with-elasticsearch","meta_description":"In this blog, we'll explore various approaches to retrieving information using Elasticsearch, focusing on lexical and semantic search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Relativity uses Elasticsearch and Azure OpenAI to build AI search experiences With Elasticsearch Relevance Engine, you can create AI-powered search apps. Learn how Relativity uses Elastic & Azure Open AI for this goal. Generative AI HM AT By: Hemant Malik and Aditya Tripathi On July 5, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch has been used by developers to build search experiences for over a decade. At Microsoft Build this year, we announced the launch of Elasticsearch Relevance Engine — a set of tools to enable developers to build AI-powered search applications. With generative AI, large language models (LLMs), and vector search capabilities gaining mindshare, we are delighted to expand our range of tools and enable our customers in building the next generation of search apps. One example of what a next-generation search experience might look like comes from Relativity — the eDiscovery and legal search technology company. At Build, we shared the stage with the Relativity team as they spoke about how they’re using Elasticsearch and Microsoft Azure. You can read about Relativity’s coverage of Build on their blog . This blog explores how Relativity leverages Elasticsearch and Azure OpenAI to build futuristic search experiences. It also examines the key components of an AI-powered search experience and the important architectural considerations to keep in mind. About Relativity Relativity is the company behind RelativityOne , a leading cloud-based eDiscovery software solution. Relativity partners with Microsoft to innovate and deliver its solutions to hundreds of organizations, helping them manage, search, and act on large amounts of heterogeneous data. 
RelativityOne is respected in the industry for its global reach, and it is powered by Microsoft Azure infrastructure and a host of other Microsoft Azure services, such as Cognitive Services Translator. The RelativityOne product is built with scale in mind. Typical use cases involve ingesting large amounts of data provided for legal eDiscovery. This data is presented to legal teams via a search interface. In order to enable high-quality legal investigations, it is critical for the search experience to return highly accurate and relevant results, every time. Elasticsearch fits these requirements and is a key underlying technology. eDiscovery's future with generative AI Brittany Roush, senior product manager at Relativity, says, “The biggest challenge Relativity customers are facing right now is data explosion from heterogeneous data sources. The challenge is really compounded by the differences in data generated from different modes of communication.” This explosion of data, sources, and complexity renders traditional keyword search approaches ineffective. With Elasticsearch Relevance Engine (ESRE) , Relativity sees the potential of providing a search experience that goes beyond keyword search and basic conceptual search. ESRE provides an opportunity to natively tailor searches to case data. Relativity wants to augment the search experience with AI capabilities such as GPT-4, Signals, Classifications, and its in-house AI solutions. In this talk at Microsoft Build, Roush shared that in Relativity’s future vision for search, there are a few key challenges. As data grows exponentially, and as investigators must search through documents, images, and video records, traditional keyword search approaches can reach their limits. Privacy and confidentiality are key factors in the legal discovery process. Additionally, the ability to search as if you’re having a natural conversation and leveraging LLMs are all important factors when the team imagines the future of search. Elasticsearch for the future of AI search experiences In making the vision for the future of eDiscover search real, Relativity relies on Elasticsearch for its proven track record at scale, the ability to manage structured and unstructured data sources, and its leadership in search and hybrid scoring. ESRE provides an opportunity to natively tailor searches to case data. With Elasticsearch Relevance Engine, developers get a full vector database, the ability to manage multiple machine learning models or Elastic’s Learned Sparse Encoder that comes included with ESRE, and a plethora of data ingestion capabilities. Roush adds, “With ESRE and Azure OpenAI-powered search experiences, we aim to reduce what may take our customers months during an investigation to hours.” Components of an AI-powered search experience Natural language processing (NLP) NLP offers the ability to interact with a search interface using human language, or spoken intent. The search engine must identify intent and match it to data records. An example is matching “side hustle” to mean a secondary work occupation. Vector database This is a database that stores data as numerical representations, or vectors embeddings, across a vector space that may span several dimensions. Each dimension may be a mathematical representation of features or attributes used to describe search documents. Vector search A vector space is a mathematical representation of search documents as numbers. Traditional search relies on placement of keywords, frequency, and lexical similarity. 
A vector search engine uses numerical distances between vectors to represent similarity. Model flexibility and LLM integration In this fast-evolving space, support for public and proprietary machine learning models and the ability to switch models enables customers to keep up with innovation. Architectural considerations for an AI-powered search experience In order to build a search experience that understands natural language search prompts, stores underlying data as vectors, and leverages a large language model to present context-relevant responses, an approach such as the one below may be used: In order to achieve goals, for example building next-generation eDiscovery search leveraging LLM like OpenAI, users should consider the following factors: Cost LLMs can require large-scale resources and are trained on public data sets. Elasticsearch helps customers bring their private data and integrate with LLMs by passing a context window to keep the costs in check. Scale and flexibility Elasticsearch is a proven data store and vector database at petabyte scale . AI search experiences are powered by the data. The ability to ingest from a variety of private data sources is table stakes for the platform powering the solution. Elasticsearch as a datastore has been optimized over the years to host numerous data types , including the ability to store vectors. We cover Elastic’s ingestion capabilities in this recent webinar . Most AI-powered search experiences will benefit from having the flexibility of retrieval: keyword search, vector search, and the ability to deliver hybrid ranking. Elastic 8.8 introduced Elastic Learned Sparse Encoder in technical preview. This is a semantic search model trained and optimized by Elastic for superior relevance out of the box. Our model provides superior performance for vector and hybrid search approaches, and some of the work is documented in this blog . Elastic also supports a wide variety of third-party NLP transformer models to enable you to add NLP models that you may have trained for your use cases. Privacy and security Elasticsearch can help users limit access to domain-specific data by sharing only relevant context with LLM services. Combine this with Microsoft’s enterprise focus on data, privacy, and security of the Azure OpenAPI service, and users like Relativity can roll out search experiences leveraging generative AI built on proprietary data. For the private data hosted in Elasticsearch, applying Role Based Access Control will help protect sensitive data by configuring roles and corresponding access levels. Elasticsearch offers security options such as Document Level and Field Level security, which can restrict access based on domain-specific data sensitivity requirements. Elasticsearch Service is built with a security-first mindset and is independently audited and certified to meet industry compliance standards such as PCI DSS, SOC2, and HIPAA to name a few. Industry-specific considerations: Compliance, air-gapped, private clouds Elastic can go where our users are, whether that is on public cloud, private cloud, on-prem, and in air-gapped environments. For a privacy-first LLM experience, users can deploy proprietary transformer models to air-gapped environments. What will you build with Elastic and generative AI? We’re excited about the experiences that customers such as Relativity are building. 
The past few years in search have been very exciting, but with the rapid adoption of generative AI capabilities, we can’t wait to see what developers create with Elastic’s tools. If you’d like to try some of the capabilities that were mentioned here, we recommend these resources: Demo video: State of the Art Data Retrieval with Machine Learning & Elasticsearch Blog: How to deploy NLP: Text Embeddings and Vector Search Announcing Elasticsearch Relevance Engine : AI search tools for developers Sign up for an Elastic Cloud trial and get started today! The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. In this blog post, we may have used third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to About Relativity eDiscovery's future with generative AI Elasticsearch for the future of AI search experiences Components of an AI-powered search experience Architectural considerations for an AI-powered search experience Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Relativity uses Elasticsearch and Azure OpenAI to build AI search experiences - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/relativity-elasticsearch-azure-openai","meta_description":"With Elasticsearch Relevance Engine, you can create AI-powered search apps. Learn how Relativity uses Elastic & Azure Open AI for this goal."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds Amazon Bedrock support Elasticsearch open inference API added Amazon Bedrock support. Here's how to use Amazon Bedrock models via Elasticsearch's open inference API. Integrations Generative AI Vector Database How To MH HM By: Mark Hoy and Hemant Malik On July 10, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The Elasticsearch open inference API enables developers to create inference endpoints and use machine learning models from leading providers. Models hosted on Amazon Bedrock are available via the Elasticsearch Open Inference API . Developers building RAG applications using the Elasticsearch vector database can store and use embeddings generated from models hosted on Amazon Bedrock (such as Amazon Titan, Anthropic Claude, Cohere Command R, and others ). Bedrock integration with open inference API offers a consistent way of interacting with different AI models, such as text embeddings and chat completion, simplifying the development process with Elasticsearch. pick a model from Amazon Bedrock create and use an inference endpoint in Elasticsearch use the model as part of an inference pipeline Using a base model in Amazon Bedrock This walkthrough assumes you already have an AWS Account with access to Amazon Bedrock - a fully managed hosted models service that makes foundation models available through a unified API. From Amazon Bedrock in AWS Console, make sure that you do have access to Amazon Titan Embeddings G1 - Text model. You can check that by going to Amazon Bedrock service in AWS Console and checking for _Model access. _ If you don’t have access, you can request it through _Modify model access _ from Amazon Bedrock service in AWS Console. Amazon provides extensive IAM policies to control permissions and access to the models. From within IAM, you’ll also need to create a pair of access and secret keys that allow programmatic access to Amazon Bedrock for your Elasticsearch inference endpoint to communicate. Creating an inference API endpoint in Elasticsearch Once your model is deployed, we can create an endpoint for your inference task in Elasticsearch. For the examples below, we are using the Amazon Titan Text base model to perform inference for chat completion. 
In Elasticsearch, create your endpoint by providing the service as “amazonbedrock”, and the service settings including your region, the provider, the model (either the base model ID, or if you’ve created a custom model, the ARN for it), and your access and secret keys to access Amazon Bedrock. In our example, as we’re using Amazon Titan Text, we’ll specify “amazontitan” as the provider, and “amazon.titan-text-premier-v1:0” as the model id. When you send Elasticsearch the command, it should return back the created model to confirm that it was successful. Note that the API key will never be returned and is stored in Elasticsearch’s secure settings. Adding a model for using text embeddings is just as easy. For reference, if we use the Amazon Titan Embeddings Text base model, we can create our inference model in Elasticsearch with the “text_embeddings” task type by providing the appropriate API key and target URL from that deployment’s overview page: Let’s perform some inference That’s all there is to setting up your model. Now that that’s out of the way, we can use the model. First, let’s test the model by asking it to provide some text given a simple prompt. To do this, we’ll call the _inference API with our input text: And we should see Elasticsearch provide a response. Behind the scenes, Elasticsearch calls out to Amazon Bedrock with the input text and processes the results from the inference. In this case, we received the response: We’ve tried to make it easy for the end user to not have to deal with all the technical details behind the scenes, but we can also control our inference a bit more by providing additional parameters to control the processing, such as sampling temperature and requesting the maximum number of tokens to be generated: That was easy. What else can we do? This becomes even more powerful when we are able to use our new model in other ways, such as adding additional text to a document when it’s used in an Elasticsearch ingestion pipeline. For example, the following pipeline definition will use our model, and anytime a document using this pipeline is ingested, any text in the field “question_field” will be sent through the inference API, and the response will be written to the “completed_text_answer” field in the document. This allows large batches of documents to be augmented. Limitless possibilities By harnessing the power of Amazon Bedrock models in your Elasticsearch inference pipelines, you can enhance your search experience’s natural language processing capabilities. If you’re an AWS developer using Elasticsearch, there is more to look forward to. We recently added support for Amazon Bedrock to our Playground ( blog ), allowing developers to test and tune RAG workflows. In addition, the new semantic_text mapping lets you easily vectorize and chunk information. In upcoming versions of Elasticsearch, users can take advantage of new field mapping types that simplify the process described in this blog even further, where designing an ingest pipeline would no longer be necessary. Also, as alluded to in our accelerated roadmap for semantic search the future will provide dramatically simplified support for inference tasks with Elasticsearch retrievers at query time. These capabilities are available through the open inference API in our stateless offering on Elastic Cloud. They’ll also soon be available to everyone in an upcoming versioned Elasticsearch release. 
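As a rough sketch of the steps just described, the calls below create an Amazon Bedrock-backed completion endpoint, run a test inference, and attach the endpoint to an ingest pipeline, using plain HTTP requests. The Elasticsearch URL, keys, endpoint name, and pipeline name are placeholders, and the exact service and task setting names should be double-checked against the inference API documentation for your Elasticsearch version.

```python
import requests

ES_URL = "https://localhost:9200"   # illustrative Elasticsearch endpoint
HEADERS = {"Authorization": "ApiKey <elasticsearch-api-key>", "Content-Type": "application/json"}

# 1) Create a completion inference endpoint backed by Amazon Bedrock.
#    Provider and model id follow the Amazon Titan Text example in this post.
endpoint = {
    "service": "amazonbedrock",
    "service_settings": {
        "access_key": "<aws-access-key>",
        "secret_key": "<aws-secret-key>",
        "region": "us-east-1",                      # use the region where your model is available
        "provider": "amazontitan",
        "model": "amazon.titan-text-premier-v1:0",
    },
}
requests.put(f"{ES_URL}/_inference/completion/my_bedrock_completion", headers=HEADERS, json=endpoint)

# 2) Run a quick test inference through the endpoint.
resp = requests.post(
    f"{ES_URL}/_inference/completion/my_bedrock_completion",
    headers=HEADERS,
    json={"input": "What is Elasticsearch?"},
)
print(resp.json())

# 3) Use the endpoint in an ingest pipeline, answering "question_field" into
#    "completed_text_answer" as described above.
pipeline = {
    "processors": [
        {
            "inference": {
                "model_id": "my_bedrock_completion",
                "input_output": {"input_field": "question_field", "output_field": "completed_text_answer"},
            }
        }
    ]
}
requests.put(f"{ES_URL}/_ingest/pipeline/bedrock_answers", headers=HEADERS, json=pipeline)
```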
Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Using a base model in Amazon Bedrock Creating an inference API endpoint in Elasticsearch Let’s perform some inference That was easy. What else can we do? Limitless possibilities Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open inference API adds Amazon Bedrock support - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-amazon-bedrock-support","meta_description":"Elasticsearch open inference API added Amazon Bedrock support. Here's how to use Amazon Bedrock models via Elasticsearch's open inference API."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds support for OpenAI chat completions Learn how OpenAI chat completions and Elasticsearch can be used to summarize, translate or perform question & answering on any text. Integrations Generative AI How To TG By: Tim Grein On April 15, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. OpenAI Chat Completions has been integrated into Elastic’s inference APIs . 
This feature marks another milestone in our journey of integrating cutting-edge AI capabilities within Elasticsearch, offering additional easy-to-use features like generating human-like text completions. This blog explains how OpenAI chat completions and Elasticsearch can be used to summarize, translate or perform question & answering on any text. Before we get started, let's take a quick look at the recent Elastic features and integrations. The essence of continuous innovation at Elastic Elastic invests heavily in everything AI. We’ve recently released a lot of new features and exciting integrations: Elasticsearch open inference API adds support for Cohere Embeddings Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) Speeding Up Multi- graph Vector Search ...explore more of Elastic Search Labs to learn about recent developments The new completion task type inside our inference API with OpenAI as the first backing provider is already available in our stateless offering on Elastic Cloud. It’ll be soon available to everyone in our next release. Using OpenAI chat completions with Elasticsearch's open inference API In this short guide we’ll show a simple example on how to use the new completion task type in the inference API during document ingestion. Please refer to the Elastic Search Labs GitHub repository for more in-depth guides and interactive notebooks. For the following guide to work you'll need to have an active OpenAI account and obtain an API key. Refer to OpenAI’s quickstart guide for the steps you need to follow. You can choose from a variety of OpenAI’s models. In the following example we’ve used `gpt-3.5-turbo`. In Kibana, you'll have access to a console for you to input these next steps in Elasticsearch without even needing to set up an IDE. Firstly, you configure a model, which will perform the completions: After running this command you should see a corresponding `200 OK` status indicating that the model is properly set up for performing inference on arbitrary text. You’re now able to call the configured model to perform inference on arbitrary text input: You’ll get a response with status code `200 OK` looking similar to the following: The next command creates an example document we’ll summarize using the model we’ve just configured: To summarize multiple documents, we’ll use an ingest pipeline together with the script- , inference- and remove-processor to set up our summarization pipeline. This pipeline simply prefixes the content with the instruction “Please summarize the following text: “ in a temporary field so the configured model knows what to do with the text. You can change this text to anything you would like of course, which unlocks a variety of other popular use cases: Question and Answering Translation …and many more! The pipeline deletes the temporary field after performing inference. We now send our document(s) through the summarization pipeline by calling the reindex API . Your document is now summarized and ready to be searched: That’s basically it, you just created a powerful summarization pipeline with a few simple API calls, which can be used with any ingestion mechanism! There are a lot of use cases, where summarization comes in handy, for example by summarizing large pieces of text before generating semantic embeddings or transforming large documents into a concise summary. This can reduce your storage cost, improve time-to-value for example, if you’re only interested in a summary of large documents etc. 
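To make the flow above concrete, here is a minimal sketch of the same steps over plain HTTP: configure a gpt-3.5-turbo completion endpoint, build the summarization pipeline (script, inference, and remove processors), and reindex documents through it. The Elasticsearch URL, keys, endpoint id, pipeline name, index names, and field names are placeholders; check the exact parameter spellings against the inference API and ingest processor docs for your version.

```python
import requests

ES_URL = "https://localhost:9200"   # illustrative Elasticsearch endpoint
HEADERS = {"Authorization": "ApiKey <elasticsearch-api-key>", "Content-Type": "application/json"}

# 1) Configure the completion endpoint backed by OpenAI.
requests.put(
    f"{ES_URL}/_inference/completion/openai_completions",
    headers=HEADERS,
    json={"service": "openai", "service_settings": {"api_key": "<openai-api-key>", "model_id": "gpt-3.5-turbo"}},
)

# 2) Summarization pipeline: prefix the instruction into a temporary field,
#    run the completion, then drop the temporary field.
pipeline = {
    "processors": [
        {"script": {"source": "ctx.prompt = 'Please summarize the following text: ' + ctx.content"}},
        {"inference": {
            "model_id": "openai_completions",
            "input_output": {"input_field": "prompt", "output_field": "summary"},
        }},
        {"remove": {"field": "prompt"}},
    ]
}
requests.put(f"{ES_URL}/_ingest/pipeline/summarization_pipeline", headers=HEADERS, json=pipeline)

# 3) Send existing documents through the pipeline with the reindex API.
reindex = {
    "source": {"index": "raw-docs"},
    "dest": {"index": "summarized-docs", "pipeline": "summarization_pipeline"},
}
requests.post(f"{ES_URL}/_reindex", headers=HEADERS, json=reindex)
```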
By the way if you want to extract text from binary documents you can take a look at our open-code data-extraction service ! Exciting future ahead But we won’t stop here. We’re already working on integrating Cohere’s chat as another provider for our `completion` task. We’re also actively exploring new retrieval and ingestion use cases in combination with the completion API. Bookmark Elastic Search Labs now to stay up to date! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to The essence of continuous innovation at Elastic Using OpenAI chat completions with Elasticsearch's open inference API Exciting future ahead Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Elasticsearch open inference API adds support for OpenAI chat completions - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-openai-completion-support","meta_description":"Learn how OpenAI chat completions and Elasticsearch can be used to summarize, translate or perform question & answering on any text."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Go Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Go How To October 31, 2023 Perform text queries with the Elasticsearch Go client Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example. CR LS By: Carly Richmond and Laurent Saint-Félix Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Go - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/go-programming","meta_description":"Go articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial PHP Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe ES|QL PHP +1 April 8, 2024 From ES|QL to PHP objects Learn how to execute and manage ES|QL queries in PHP. Follow this guide to map ES|QL results to a PHP object or custom class. EZ By: Enrico Zimuel Generative AI PHP June 21, 2023 How to use Elasticsearch to prompt ChatGPT with natural language This blog post presents an experimental project for querying Elasticsearch in natural language using ChatGPT. EZ By: Enrico Zimuel Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"PHP - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/php-programming","meta_description":"PHP articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Open Crawler released for tech-preview The Open Crawler lets users crawl web content and index it into Elasticsearch from wherever they like. Learn about it & how to use it here. Ingestion NF By: Navarone Feekery On June 7, 2024 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Don't miss the follow-up to this blog announcing Open Crawler's promotion to beta! Discover how fast deployment has become with our out-of-the-box Docker images, and explore recent enhancements such as Extraction Rules or Binary Content Extraction. Read more in the blog post Open Crawler now in beta . There's a new crawler in town! The Open Crawler enables users to crawl web content and index it into Elasticsearch from wherever they prefer. This blog goes over the Open Crawler, compares it to the Elastic Crawler, and explains how to use it. Background Elastic has seen a few iterations of Crawler over the years. What started out as Swiftype's Site Search became the App Search Crawler , and then most recently the Elastic Crawler . These Crawlers are feature-rich, and allow for a robust and nuanced method for ingesting website data into Elasticsearch. However, if a user wants to run these on their own infrastructure, they are required to run the entirety of Enterprise Search as well. The Enterprise Search codebase is massive and contains a lot of different tools, so users don't have the option to run just Crawler. Because Enterprise Search is private code, it also isn't entirely clear to the user what they are running. That has all changed, as we have released the latest iteration of Crawler; the Open Crawler ! Open Crawler allows users to crawl web content and index it into Elasticsearch from wherever they like. There are no requirements to use Elastic Cloud, nor to have Kibana or Enterprise Search instances running. Only an Elasticsearch instance is required to ingest the crawl results into. This time the repository is open code too. Users can now inspect the codebase, open issues and PR requests, or fork the repository to make changes and run their own crawler variant. What has changed? The Open Crawler is considerably more light-weight than the SaaS crawlers that came before it. This product is essentially the core crawler code from the existing Elastic Crawler, decoupled from the Enterprise Search service. Decoupling the Open Crawler has meant leaving some features behind, temporarily. There is a full feature comparison table at the end of this blog if you'd like to read our roadmap towards feature-parity. We intend to reintroduce those features and reach near-feature-parity when this product becomes GA. This process allowed us to also make improvements to the core product. For example: We were able to remove the limitations towards index naming It is now possible to use custom mappings on your index before crawling and ingesting content Crawl results are now also bulk indexed into Elasticsearch instead of indexed one webpage at a time This provided a signficant performance boost, which we get into below How does the Open Crawler compare to the Elastic Crawler? 
As already mentioned, this crawler can be run from wherever you like: your computer, your personal server, or a cloud-hosted one. It can index documents into Elasticsearch on-prem, on Cloud, and even Serverless. You are also no longer tied to using Enterprise Search to ingest your website content into Elasticsearch. But what is most exciting is that Open Crawler is also faster than the Elastic Crawler. We ran a performance test to compare the speed of Open Crawler versus that of the Elastic Crawler, our next-latest crawler. Both crawlers crawled the site elastic.co with no alterations to the default configuration. The Open Crawler was set up to run on two AWS EC2 instances, m1.small and m1.large, while Elastic Crawler was run natively from Elastic Cloud. All were set up in the region of N. Virginia. The content was indexed into an Elasticsearch Cloud instance with identical settings (the default 360 GB storage | 8 GB RAM | up to 2.5 vCPU). Here are the results: Elastic Crawler: 2 GB server RAM, up to 8 vCPU, 305-minute crawl duration, 43,957 docs ingested. Open Crawler (m1.small): 1.7 GB server RAM, 1 vCPU, 160-minute crawl duration, 56,221 docs ingested. Open Crawler (m1.large): 3.75 GB RAM per vCPU, 2 vCPU, 100-minute crawl duration, 56,221 docs ingested. Open Crawler was almost twice as fast for the m1.small and over 3 times as fast for the m1.large! This is also despite running on servers with less-provisioned vCPU. Open Crawler ingested around 13,000 more documents, but that is because the Elastic Crawler combines website pages with identical bodies into a single document. This feature is called duplicate content handling and is in the feature comparison matrix at the end of this blog. The takeaway here is that both Crawlers encountered the same number of web pages during their respective crawls, even if the ingested document count is different. Here are some graphs comparing the impact of this on Elasticsearch. These compare the Elastic Crawler with the Open Crawler that ran on the m1.large instance. CPU Naturally the Open Crawler caused significantly less CPU usage on Elastic Cloud, but that’s because we’ve removed the entire Enterprise Search server. It’s still worth taking a quick look at where this CPU usage was distributed. Elastic Crawler CPU load (Elastic Cloud) Open Crawler CPU load (Elastic Cloud) The Elastic Crawler reached the CPU threshold immediately and consistently used it for an hour. It then dropped down and had periodic spikes until the crawl completed. For the Open Crawler there was almost no noticeable CPU usage on Elastic Cloud, but the CPU is still being consumed somewhere, and in our case this was on the EC2 instance. EC2 CPU load (m1.large) We can see here that the Open Crawler didn’t reach the 100% limit threshold. The most CPU it used was 84.3%. This means there’s still more room for optimization here. Depending on user setup (and optimizations we can add to the codebase), Open Crawler could be even faster. Requests (n) We can see a real change on Elasticsearch server load here by comparing the number of requests made during the crawl. Elastic Crawler requests Open Crawler requests The indexing request impact from Open Crawler is so small it’s not even noticeable on this graph compared to the background noise. There’s a slight uptick in index requests, and no change to the search requests. Elastic Crawler, meanwhile, has an explosion of requests, particularly search requests. This means the Open Crawler is a great solution for users who want to reduce requests made to their Elasticsearch instance.
So why is the Open Crawler so much faster? 1. The Open Crawler makes significantly fewer indexing requests. The Elastic Crawler indexes crawl results one at a time. It does this to allow for features such as duplicate content management. This means that Elastic Crawler performed 43,957 document indexing requests during its crawl. It also updates documents when it encounters duplicate content, so it also performed over 13000 individual update requests. The Open Crawler instead pools crawl results and indexes them in bulk. In this test, it indexed the same amount of crawl results in only 604 bulk requests of varying sizes. That's less than 1.5% of the indexing requests made, which is a significant load reduction for Elasticsearch to manage. 2. Elastic Crawler also performs many search requests, further slowing down performance The Elastic Crawler has its configuration and metadata managed in Elasticsearch pseudo-system indices. When Crawling it periodically checks this configuration and updates the metadata on a few of these indices, which is done through further Elasticsearch requests. The Open Crawler's configuration is entirely managed in yaml files. It also doesn't track metadata on Elasticsearch indices. The only requests it makes to Elasticsearch are to index documents from crawl results while crawling a website. 3. Open Crawler is simply doing less with crawl results In the tech-preview stage of Open Crawler, there are many features that are not available yet. In Elastic Crawler, these features are all managed through pseudo-system indices in Elasticsearch. When we add these features to the Open Crawler, we can ensure they are done in a way that doesn't involve multiple requests to Elasticsearch to check configuration. This means Open Crawler should still retain this speed advantage even after reaching near feature parity with Elastic Crawler. How do I use the Open Crawler? You can clone the repository now and follow the documentation here to get started. I recommend using Docker to run the Open Crawler if you aren't making changes to source code, to make the process smoother. If you want to index the crawl results into Elasticsearch, you can also try out a free trial of Elasticsearch on Cloud or download and run Elasticsearch yourself from source . Here's a quick demo of crawling the website parksaustralia.gov.au . The requirements for this are Docker, a clone/fork of the Open Crawler repository, and a running Elasticsearch instance. 1. Build the Docker image and run it This can be done in one line with docker build -t crawler-image . && docker run -i -d --name crawler crawler-image . You can then confirm it is running by using the CLI command to check the version docker exec -it crawler bin/crawler version . 2. Configure the crawler Using examples in the repository you can create a configuration file. For this example I'm crawling the website parksaustralia.org.au and indexing into a Cloud-based Elasticsearch instance. Here's an example of my config, which I creatively named example.yml . I can copy this into the docker container using docker cp config/example.yml crawler:/app/config/example.yml 3. Validate the domain Before crawling you can check that the configured domain is valid using docker exec -it crawler bin/crawler validate config/example.yml 4. Crawl! Start the crawl with docker exec -it crawler bin/crawler crawl config/example.yml . It will take a while to complete if the site is large, but you'll know it's done based on the shell output. 5. 
Check the content We can then do a _search query against the index. This could also be done in Kibana Dev Tools if you have an instance of that running. And the results! You could even hook these results up with Semantic Search and do some cool real-language queries, like What park is in the centre of Australia? . You just need to add the name of the pipeline you create to the Crawler config yaml file, under the field elasticsearch.pipeline . Feature comparison breakdown Here is a full list of Elastic Crawler features as of v8.13 , and when we intend to add them to the Open Crawler. Features available in tech-preview are available already. These aren't tied to any specific stack version, but we have a general time we're aiming for each release. tech-preview : Today (June 2024) beta : Autumn 2024 GA : Summer 2025 Feature Open Crawler Elastic Crawler Index content into Elasticsearch tech-preview ✔ No index name restrictions tech-preview ✖ Run anywhere, without Enterprise Search or Kibana tech-preview ✖ Bulk index results tech-preview ✖ Ingest pipelines tech-preview ✔ Seed URLs tech-preview ✔ robots.txt and sitemap.xml adherence tech-preview ✔ Crawl through proxy tech-preview ✔ Crawl sites with authorization tech-preview ✔ Data attributes for inclusion/exclusion tech-preview ✔ Limit crawl depth tech-preview ✔ Robots meta tags tech-preview ✔ Canonical URL link tags tech-preview ✔ No-follow links tech-preview ✔ CSS selectors beta ✔ XPath selectors beta ✔ Custom data attributes beta ✔ Binary content extraction beta ✔ URL pattern extraction (extraction directly from URLs using regex) beta ✔ URL filters (extraction rules for specific endpoints) beta ✔ Purge crawls beta ✔ Crawler results history and metadata GA ✔ Duplicate content handling TBD ✔ Schedule crawls TBD ✔ Manage crawler through Kibana UI TBD ✔ The TBD features are still undecided as we are assessing the future of the Open Crawler. Some of these, like Schedule crawls , can be done already using cron jobs or similar automation. Depending on user feedback and how the Open Crawler project evolves, we may decide to implement these features properly in a later release. If you have a need for one of these, reach out to us! You can find us in the forums and in the community Slack , or you can create an issue directly in the repository . What's next? We want to get this to GA in time for v9.0 . The Open Crawler is designed with Elastic Cloud Serverless in mind, and we intend for it to be the main web content ingestion method for that version. We also have plans to support the Elastic Data Extraction Service , so even larger binary content files can be ingested using Open Crawler. There are many features we need to introduce in the meantime to get the same feature-rich experience that Elastic Crawler has today. Report an issue Related content Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. 
JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Integrations Ingestion +1 February 3, 2025 Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. JR TM By: Jeffrey Rengifo and Tomás Murúa Jump to Background What has changed? How does the Open Crawler compare to the Elastic Crawler? CPU Requests (n) Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Open Crawler released for tech-preview - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-open-crawler-release","meta_description":"The Open Crawler lets users crawl web content and index it into Elasticsearch from wherever they like. Learn about it & how to use it here."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. Elastic Cloud Serverless Agent FS By: Fram Souza On March 4, 2025 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. This little command-line tool lets you manage your Serverless Elasticsearch projects in plain English. It talks to an AI (in this case, OpenAI) to figure out what you mean and call the right functions using LlamaIndex! What does it do? Create a project : Spin up a new Serverless Elasticsearch project. Delete a project : Remove an existing project (yep, it cleans up after you). Get project status : Check on how your project is doing. Get project details : Fetch all the juicy details about your project. Check the code on GitHub. How it works When you type in something like: \"Create a serverless project named my_project\" …here’s what goes on behind the scenes: User input & context: Your natural language command is sent to the AI agent. Function descriptions: The AI agent already knows about a few functions—like create_ess_project, delete_ess_project, get_ess_project_status, and get_ess_project_details—because we gave it detailed descriptions. These descriptions tell the AI what each function does and what parameters they need. LLM processing: Your query plus the function info is sent off to the LLM. This means the AI sees: The user query : Your plain-English instruction. 
Available functions & descriptions : Details on what each tool does so it can choose the right one. Context/historic chat info : Since it’s a conversation, it remembers what’s been said before. Function call & response: The AI figures out which function to call, passes along the right parameters (like your project name), and then the function is executed. The response is sent back to you in a friendly format. In short, we’re sending both your natural language query and a list of detailed tool descriptions to the LLM so it can “understand” and choose the right action for your request. Setup Prerequisites: Before running the AI agent, ensure that you have the following set up: Python (v3.7 or later) installed. Elasticsearch serverless account set up on Elastic Cloud. OpenAI account to interact with the language model. Steps: 1. Clone the repository: 2. Create a virtual environment (optional but recommended): If you're facing environment-related issues, you can set up a virtual environment for isolation: 3. Install the dependencies: Ensure that all required dependencies are installed by running: 4. Configure your environment: Create a .env file in the project root with the following variables. Here’s an example .env.example file to help you out: Ensure that you have the correct values for ES_URL , API_KEY , and OPENAI_API_KEY . You can find your API keys in the respective service dashboards. 5. Projects File: The tool uses a projects.json file to store your project mappings (project names to their details). This file will be created automatically if it doesn't already exist. Running the agent You’ll see a prompt like this: Type in your command, and the AI agent will work its magic! When you're done, type exit or quit to leave. A few more details LLM integration : The LLM is given both your query and detailed descriptions of each available function. This helps it understand the context and decide, for example, whether to call create_ess_project or delete_ess_project . Tool descriptions : Each function tool (created using FunctionTool.from_defaults) has a friendly description. This description is included in the prompt sent to the LLM so that it “knows” what actions are available and what each action expects. Persistence : Your projects and their details are saved in projects.json, so you don’t have to re-enter info every time. Verbose logging : The agent is set to verbose mode, which is great for debugging and seeing how your instructions get translated into function calls. Example utilization Report an issue Related content Agent How To March 28, 2025 Connect Agents to Elasticsearch with Model Context Protocol Let’s use Model Context Protocol server to chat with your data in Elasticsearch. JB JM By: Jedr Blaszyk and Joe McElroy Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. 
YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to What does it do? How it works Setup Prerequisites: Steps: Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"The AI Agent to manage Elasticsearch Serverless projects - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/serverless-elasticsearch-ai-agent","meta_description":"A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. Elastic Cloud Serverless AD By: Andrei Dan On December 10, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Traditionally, users change the sharding configuration of data streams in order to deal with various workloads and make the best use of the available resources. In Elastic Cloud Serverless we've introduced autosharding of data streams, enabling them to be managed and scaled automatically based on indexing load. This post explores the mechanics of autosharding, its benefits, and its implications for users dealing with variable workloads. The autosharding philosophy is to increase the number of shards aggressively and reduce them very conservatively, such that an increase in shards is not followed prematurely by a reduction of shards due to a small period of reduced workload. Autosharding of data streams in Serverless Elasticsearch Imagine you have a large pizza that needs to be shared among your friends at a party. If you cut the pizza into only two slices for a group of six friends, each slice will need to serve multiple people. This will create a bottleneck, where one person hogs a whole slice while others wait, leading to a slow sharing process. Additionally, not everyone can enjoy the pizza at the same time; you can practically hear the sighs from the friends left waiting. 
If more friends show up unexpectedly, you’ll struggle to feed them with just two slices and find yourself scrambling to reshape those slices on the spot. On the other hand, if you cut the pizza into 36 tiny slices for those same six friends, managing the sharing becomes tricky. Instead of enjoying the pizza, everyone spends more time figuring out how to grab their tiny portions. If the slices are too small, the pizza might even fall apart. To ensure everyone enjoys the pizza efficiently, you’d aim to cut it into a number of slices that matches the number of friends. If you have six friends, cutting the pizza into 6 or 12 slices allows everyone to grab a slice without long waits. By finding the right balance in slicing your pizza, you’ll keep the party running smoothly and everyone happy. You know it’s a good analogy when you immediately follow-up with the explanation; the pizza represents the data, the slices represent the index shards, and the friends are the Elasticsearch nodes in your cluster. Traditionally, users of Elasticsearch had to anticipate their indexing throughput and manually configure the number of shards for each data stream . This approach relied heavily on predictive heuristics and required ongoing adjustments based on workload characteristics whilst also balancing data storage, search analytics, and application performance . Businesses with seasonal traffic, like retail, often deal with spikes in data demands, while IoT applications can experience rapid load increases at specific times. Development and testing environments typically run only a few hours a week, making fixed shard configurations inefficient. New applications might struggle to estimate workload needs accurately, leading to potential over- or under-provisioning. We've introduced autosharding of data streams in Elastic Cloud Serverless . Data streams in Serverless are managed and scaled automatically based on indexing load - automatically slicing your pizza as friends arrive to your party or finish eating. The promise of autosharding Autosharding addresses these challenges by automatically adjusting the number of shards in response to the current indexing load. This means that instead of users having to manually tweak configurations, Elasticsearch will dynamically manage shard counts for the data streams in your project based on real-time data traffic. Elasticsearch keeps track of the indexing load for every index as part of a metric named write load, and exposes it for on-prem and ESS deployments as part of the index stats API under the indexing section. The write_load represents the average number of write threads used while indexing documents. For an index with one shard the maximum possible value of the write_load metric is the number of write threads available (e.g. all write threads are busy writing in the same shard). For indices with multiple shards the maximum possible value for the write load is the number of write threads available in a node times the number of indexing nodes in the project. (e.g. all write threads on all the indexing nodes that host a shard for our index are busy writing in the shards belonging to our index, exclusively) To get a sense of the values allowed for write_load let’s look at index logs with one shard running on one Elasticsearch machine with 2 allocated processors. The write thread pool will be sized to 2 threads. 
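As a small illustration of where this metric surfaces, the sketch below reads the indexing section of the index stats API for the logs index over plain HTTP. The endpoint and credentials are placeholders, and the exact position of write_load in the response can vary by deployment and version.

```python
import requests

ES_URL = "https://localhost:9200"   # illustrative Elasticsearch endpoint
HEADERS = {"Authorization": "ApiKey <elasticsearch-api-key>"}

# Fetch indexing stats for the "logs" index; write_load is reported under the
# "indexing" section of the stats (the path may differ slightly by version).
stats = requests.get(f"{ES_URL}/logs/_stats/indexing", headers=HEADERS).json()
indexing = stats["indices"]["logs"]["primaries"]["indexing"]
print("write_load:", indexing.get("write_load"))
```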
This means that if this Elasticsearch node is exclusively and constantly writing to the same index logs, the write_load we’ll report for index logs will be 2.0 (i.e. 2 write threads fully utilized for writing into index logs). If logs has 2 primary shards and we’re now running on two Elasticsearch nodes, each with 2 allocated processors, we’ll be able to get a maximum reported write_load of 4.0 if all write threads on both Elasticsearch nodes are exclusively writing into the logs index. Serverless autoscaling We just looked at how the write load capacity doubled when we increased the number of shards and Elasticsearch nodes. Elastic Cloud Serverless takes care automatically of both these operations using data stream autosharding and ingest autoscaling. Autoscaling refers to the process of dynamically adjusting resources - like memory, CPU, and disk - based on current demands. In our serverless architecture, we start with a small 2GB memory server and use a step-function scaling approach to increase capacity efficiently. We scale up memory incrementally and then scale out by adding servers. This cycle continues, increasing memory per server incrementally up to 64GB while managing the number of servers. Linking autoscaling and autosharding The connection between autoscaling and autosharding is essential for optimizing performance. When calculating the optimal number of shards for a data stream, we consider the minimum and maximum number of available write threads per node in our scaling setup. For small projects, the system will move from 1 to 2 shards when the data stream uses more than half the capacity of a node (i.e., more than one indexing thread). For medium-sized projects, as the system scales across multiple nodes, it will not exceed 3 shards to avoid excessive overhead. Once we reach the largest node sizes, further sharding is enabled to accommodate larger workloads. Autosharding also enables autoscaling to increase resources as needed, preventing the system from staying at low capacity during high indexing workloads, by enabling projects to reach higher ingestion load values. Auto sharding formula To determine the number of shards needed, we use a formula (sketched in code below) that balances the need for increasing shards based on write_load while capping the number of shards to prevent oversharding. The division by 2 in that calculation reflects the strategy of increasing shards only after exceeding half the capacity of a node. The min/max write threads represent the minimum and maximum number of write threads available in the autoscaling step function (i.e. the number of write threads available on the smallest 2GB step and the number of write threads available on the largest server). Let’s visualize the output of the formula: on the Y axis we have the number of shards, and on the X axis we have the write load. We start with 1 shard and we get to 3 shards when the write load is just over 3.0. We remain with 3 shards for quite some time until the write load is about 48.0. This covers us for the time we scale up through the nodes but haven’t really got to 2 or more of the largest servers, at which point we unlock auto sharding to more than 3 shards, as many as needed to ingest data. While adding shards can improve indexing performance, excessive sharding in an Elasticsearch cluster can have negative repercussions - imagine that pizza with 56 slices being shared by only 7 friends. Each shard carries overhead costs, including maintenance and resource allocation.
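Here is a small Python sketch of the shard-count calculation described above. The thread counts are assumptions chosen for illustration (2 write threads on the smallest 2GB step, 32 on the largest server), and the rounding shown is one way to reproduce the behaviour in the visualization; the production heuristic in Elasticsearch may differ in detail.

```python
import math

# Assumed write-thread counts for the autoscaling step function (illustrative):
MIN_WRITE_THREADS = 2    # smallest 2GB step
MAX_WRITE_THREADS = 32   # largest server

def auto_shards(write_load: float) -> int:
    """One way to realize the described heuristic: grow shards once write load
    exceeds half a small node's capacity, cap at 3 shards, and only go beyond 3
    once the largest servers are sufficiently loaded."""
    small_node_target = write_load / (MIN_WRITE_THREADS / 2)
    large_node_target = write_load / (MAX_WRITE_THREADS / 2)
    return max(1, math.ceil(max(min(small_node_target, 3), large_node_target)))

for load in (0.5, 1.5, 3.5, 20.0, 49.0):
    print(f"write_load={load:>5} -> {auto_shards(load)} shard(s)")
```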
Our algorithm accounts for and avoids the peril of excessive sharding until we get to the largest workloads where adding more than 3 shards makes a material difference to indexing performance and throughput. Implementing autosharding with rollovers The implementation of autosharding relies on the concept of rollover . A rollover operation creates a new index within the data stream , promoting it to the write index while designating the previous index as a regular backing index, which no longer accepts writes. This transition can occur based on specific conditions, such as exceeding a shard size of 50GB. We take care of configuring the optimal rollover conditions for data streams in Serverless . In Serverless alongside the usual rollover conditions that relate to maintaining healthy indices and shards we introduce a new condition that evaluates whether the current write load necessitates an increase in shard count. If this condition is met, a rollover will be triggered and the new resulting data stream write index will be configured with the optimal number of shards. For downscaling, the system will monitor the workload and will not trigger a rollover solely for reducing shards. Instead, it will wait until a regular rollover condition, like the primary shard size, triggers the rollover. The resulting write index will be configured with a lower number of shards. Cooldown periods for shard adjustments To ensure stability during shard adjustments, we implement cooldown periods: Increase shards cooldown : A minimum wait time of 4.5 minutes is enforced before increasing the number of shards since the last adjustment. The 4.5 minutes cooldown might seem peculiar but the interval has been chosen to make sure we can increase the number of shards every time data stream lifecycle checks if data streams should rollover (currently, every 5 minutes) but not more often than 5 minutes, covering for internal Elasticsearch cluster reconfiguration. Decrease shards cooldown : We maintain a 3-day minimum wait time before reducing shards to ensure that the decision is based on sustained workload patterns rather than temporary fluctuations. Conclusion The data streams autosharding feature in Serverless Elasticsearch represents significant progress in managing data streams effectively. By automatically adjusting shard counts based on real-time indexing loads, this feature simplifies operations and enhances scalability. With the added benefits of autoscaling , users can expect a more efficient and responsive experience, whether they are handling small projects or large-scale applications. As data workloads continue to evolve, the adaptability provided by auto sharding ensures that Elasticsearch remains a robust solution for managing diverse indexing needs. Try out our Serverless Elasticsearch offering to take advantage of data streams auto sharding and observe the indexing throughput scaling seamlessly as your data ingestion load increases. Your pizzas will be optimally sliced as more friends arrive at your party, keen to try those sourdough craft pizzas you prepared for them. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. 
FS By: Fram Souza Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Jump to Autosharding of data streams in Serverless Elasticsearch The promise of autosharding Serverless autoscaling Linking autoscaling and autosharding Auto sharding formula Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Autosharding of data streams in Elasticsearch Serverless - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/datastream-autosharding-serverless","meta_description":"Learn about autosharding of data streams in Elasticsearch Serverless, the mechanics of autosharding, its benefits, and its implications for users."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds support for Azure OpenAI embeddings Elasticsearch open inference API adds support for Azure OpenAI embeddings to be stored in the world's most downloaded vector database. Integrations Vector Database How To MH By: Mark Hoy On May 22, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. 
We're happy to announce that Elasticsearch now supports Azure OpenAI embeddings in our open inference API , enabling developers to store generated embeddings into our highly scalable and performant vector database . This new functionality further solidifies our commitment to not only working with Microsoft and the Azure platform, but also toward our commitment to offering our customers more flexibility with their AI solutions. Ongoing Investment in AI at Elastic This is the latest in a series of additional features and integrations on AI enablement for Elasticsearch following on from: Elasticsearch open inference API adds Azure AI Studio support Elasticsearch open inference API adds support for Azure OpenAI chat completions Elasticsearch open inference API adds support for OpenAI chat completions Elasticsearch open inference API adds support for Cohere Embeddings Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) The new inference embeddings service provider for Azure OpenAI is already available in our stateless offering on Elastic Cloud, and will be soon available to everyone in an upcoming Elastic release. Using Azure OpenAI Embeddings with the Elasticsearch Inference API Deploying an Azure OpenAI Embeddings Model To get started, you will need a Microsoft Azure Subscription as well as access to Azure OpenAI service . Once you have registered and have access, you will need to create a resource in your Azure Portal , and then deploy an embedding model to Azure OpenAI Studio . To do this, if you do not already have an Azure OpenAI resource in your Azure Portal, create a new one from the “Azure OpenAI” type which can be found in the Azure Marketplace, and take note of your resource name as you will need this later. When you create your resource, the region you choose may impact what models you have access to. See the Azure OpenAI deployment model availability table for additional details. Once you have your resource, you will also need one of your API keys which can be found in the “Keys and Endpoint” information from the Azure Portal's left side navigation: Now, to deploy your Azure OpenAI Embedding model, go into your Azure OpenAI Studio's console and create your deployment using an OpenAI Embeddings model such as text-embedding-ada-002 . Once your deployment is created, you should see the deployment overview. Also take note of the deployment name, in the example below it is “example-embeddings-model”. Using your deployed Azure OpenAI embeddings model with the Elasticsearch Inference API With an Azure OpenAI embeddings model deployed, we can now configure your Elasticsearch deployment's _inference API and create a pipeline to index embeddings vectors in your documents. Please refer to the Elastic Search Labs GitHub repository for more in-depth guides and interactive notebooks. To perform these tasks, you can use the Kibana Dev Console, or any REST console of your choice. First, configure your inference endpoint using the create inference model endpoint - we'll call this “example_model”: For your inference endpoint, you will need your API key, your resource name, and the deployment id that you created above. For the “api_version”, you will want to use an available API version from the Azure OpenAI embeddings documentation - we suggest always using the latest version which is “2024-02-01” as of this writing. 
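The original post shows this request in the Kibana Dev Console, and the snippet is not preserved in this extract. The following is a hedged Python sketch using the requests library against the _inference API. The azureopenai service name and the service_settings fields follow the Azure OpenAI inference documentation, but treat the exact field names as assumptions and verify them against your Elasticsearch version.

```python
import requests

ES_URL = "https://localhost:9200"          # your Elasticsearch endpoint
ES_API_KEY = "<elasticsearch-api-key>"     # hypothetical credentials

# Create an inference endpoint named "example_model" backed by the
# Azure OpenAI embeddings deployment created above.
resp = requests.put(
    f"{ES_URL}/_inference/text_embedding/example_model",
    headers={"Authorization": f"ApiKey {ES_API_KEY}"},
    json={
        "service": "azureopenai",
        "service_settings": {
            "api_key": "<azure-openai-api-key>",
            "resource_name": "<your-azure-resource-name>",
            "deployment_id": "example-embeddings-model",
            "api_version": "2024-02-01",
        },
        # Optional: helps Azure OpenAI attribute traffic to an end user.
        "task_settings": {"user": "search-labs-demo"},
    },
)
resp.raise_for_status()  # expect a 200 OK once the endpoint is registered
```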
You can also optionally add a username in the task setting's “user” field which should be a unique identifier representing your end-user to help Azure OpenAI to monitor and detect abuse. If you do not want to do this, omit the entire “task_settings” object. After running this command you should receive a 200 OK status indicating that the model is properly set up. Using the perform inference endpoint , we can see an example of your inference endpoint at work: The output from the above command should provide the embeddings vector for the input text: Now that we know our inference endpoint works, we can create a pipeline that uses it: This will create an ingestion pipeline named “azureopenai_embeddings” that will read the contents of the “name” field upon ingestion and apply the embeddings inference from our model to the “name_embedding” output field. You can then use this ingestion pipeline when documents are ingested (e.g. via the _bulk ingest endpoint), or when reindexing an index that is already populated. This is currently available through the open inference API in our stateless offering on Elastic Cloud. It'll also be soon available to everyone in an upcoming versioned Elasticsearch release, with additional semantic text capabilites that will make this step even simpler to integrate into your existing workflows. For an additional use case, you can walk through the semantic search with inference tutorial for how to perform ingestion and semantic search on a larger scale with Azure OpenAI and other services such as reranking or chat completions. Plenty more on the horizon This new extensibility is only one of many new features we are bringing to the AI table from Elastic. Bookmark Elastic Search Labs now to stay up to date! Ready to build RAG into your apps? Want to try different LLMs with a vector database? Check out our sample notebooks for LangChain, Cohere and more on Github , and join the Elasticsearch Engineer training starting soon! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
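As a rough sketch of the pipeline described in the post above, the following request creates an ingest pipeline named azureopenai_embeddings with an inference processor that reads the name field and writes name_embedding. The exact processor options (for example input_output) may differ between versions, so treat this as an assumption to check against the documentation.

```python
import requests

ES_URL = "https://localhost:9200"
ES_API_KEY = "<elasticsearch-api-key>"

resp = requests.put(
    f"{ES_URL}/_ingest/pipeline/azureopenai_embeddings",
    headers={"Authorization": f"ApiKey {ES_API_KEY}"},
    json={
        "processors": [
            {
                "inference": {
                    # The inference endpoint created earlier.
                    "model_id": "example_model",
                    # Read "name" and store the embedding in "name_embedding".
                    "input_output": {
                        "input_field": "name",
                        "output_field": "name_embedding",
                    },
                }
            }
        ]
    },
)
resp.raise_for_status()
```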
JR By: Jeffrey Rengifo Jump to Ongoing Investment in AI at Elastic Using Azure OpenAI Embeddings with the Elasticsearch Inference API Deploying an Azure OpenAI Embeddings Model Using your deployed Azure OpenAI embeddings model with the Elasticsearch Inference API Plenty more on the horizon Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open inference API adds support for Azure OpenAI embeddings - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-azure-openai-embeddings-support","meta_description":"Elasticsearch open inference API adds support for Azure OpenAI embeddings to be stored in the world's most downloaded vector database."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing the sparse vector query: Searching sparse vectors with inference or precomputed query vectors Learn about the Elasticsearch sparse vector query, how it works, and how to effectively use it. Vector Database KD By: Kathleen DeRusso On July 23, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Sparse vector queries take advantage of Elasticsearch’s powerful inference API , allowing easy built-in setup for Elastic-hosted models such as ELSER and E5 , as well as the flexibility to host other models. Introduction Vector search is evolving, and as our needs for vector search evolve so does the need for a consistent and forward thinking vector search API. When Elastic first launched semantic search, we leveraged existing rank_features fields using the text_expansion query. We then reintroduced the sparse_vector field type for semantic search use cases. As we think about what sparse vector search is going forward, we’ve introduced a new sparse vector query. As of Elasticsearch 8.15.0, both the text_expansion query and weighted_tokens query have been deprecated in favor of the new sparse vector query. The sparse vector query supports two modes of querying: using an inference ID and using precomputed query vectors. Both modes of querying require data to be indexed in a sparse_vector mapped field. These token-weight pairs are then used in a query against a sparse vector. At query time, query vectors are calculated using the same inference model that was used to create the tokens. Let’s look at an example: let’s say we’ve indexed a document detailing when Orion is most visible in the night sky: Now, assume we’re looking for constellations that are visible in the northern hemisphere, and we run this query through the same learned sparse encoder model. 
The output might look similar to this: At query time, these vectors are ORed together, and scoring is effectively a dot product calculation between the stored dimensions and the query dimensions, which would score this example at 10.84: Sparse vector queries with inference Sparse vector queries using inference work in a very similar way to the previous text expansion query, instead of sending in a trained model, we create an inference endpoint associated with the model we want to use. Here’s an example of how to create an inference endpoint for ELSER: You should use an inference endpoint to index your sparse vector data, and use the same endpoint as input to your sparse_vector query. For example: Sparse vector queries with precomputed query vectors You may have precomputed vectors that don’t require inference at query time. These can be sent into the sparse_vector query instead of using inference. Here is an example: Query optimization with token pruning Like text expansion search, the sparse vector query is subject to performance penalties from huge boolean queries. Therefore the same token pruning strategies available for text expansion strategies are available in the sparse vector query. You can see the impact of token pruning in our nightly MS Marco Passage Ranking benchmarks . In order to enable pruning with the default pruning configuration (which has been tuned for ELSER V2), simply add prune: true to your request: Alternately, you can adjust the pruning configuration by sending it directly in with the request: Because token pruning will incur a recall penalty, we recommend adding the pruned tokens back in a rescore: What's next? While the text_expansion query is GA’d and will be supported throughout Elasticsearch 8.x, we recommend updating to the sparse_vector query as soon as possible in order to ensure you’re using the most up to date features as we continually improve the vector search experience in Elasticsearch. If you are using the weighted_tokens query, this was never GA’d and will be replaced by the sparse_vector query very soon. The sparse_vector query will be available starting with 8.15.0 and is already available in Serverless - try it out today! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. 
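The request bodies from the sparse vector post above are not preserved in this extract, so here is a hedged sketch of the two query modes (inference ID versus precomputed query vectors) and of enabling pruning, written as Python dictionaries you could pass as the query of a search request. The field and endpoint names are illustrative and should be checked against the 8.15 documentation.

```python
# Mode 1: let Elasticsearch run inference at query time (e.g. an ELSER endpoint).
query_with_inference = {
    "sparse_vector": {
        "field": "content_embedding",         # a sparse_vector mapped field (assumed name)
        "inference_id": "my-elser-endpoint",  # hypothetical inference endpoint id
        "query": "constellations visible in the northern hemisphere",
    }
}

# Mode 2: supply precomputed token-weight pairs instead of running inference.
query_with_vectors = {
    "sparse_vector": {
        "field": "content_embedding",
        "query_vector": {"constellation": 2.1, "north": 1.7, "sky": 1.2},
    }
}

# Enable token pruning with the default configuration tuned for ELSER v2.
pruned_query = {
    "sparse_vector": {
        "field": "content_embedding",
        "inference_id": "my-elser-endpoint",
        "query": "constellations visible in the northern hemisphere",
        "prune": True,
    }
}
```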
CH HM By: Chris Hegarty and Hemant Malik Jump to Introduction Sparse vector queries with inference Sparse vector queries with precomputed query vectors Query optimization with token pruning What's next? Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing the sparse vector query: Searching sparse vectors with inference or precomputed query vectors - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-sparse-vector-query","meta_description":"Learn about the Elasticsearch sparse vector query, how it works, and how to effectively use it."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Understanding Int4 scalar quantization in Lucene This blog explains how int4 quantization works in Lucene, how it lines up, and the benefits of using int4 quantization. Lucene ML Research BT TV By: Benjamin Trent and Thomas Veasey On April 25, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to Int4 quantization in Lucene In our previous blogs, we walked through the implementation of scalar quantization as a whole in Lucene. We also explored two specific optimizations for quantization . Now we've reached the question: how does int4 quantization work in Lucene and how does it line up? How does Int4 quantization work in Lucene Storing and scoring the quantized vectors Lucene stores all the vectors in a flat file, making it possible for each vector to be retrieved given some ordinal. You can read a brief overview of this in our previous scalar quantization blog . Now int4 gives us additional compression options than what we had before. It reduces the quantization space to only 16 possible values (0 through 15). For more compact storage, Lucene uses some simple bit shift operations to pack these smaller values into a single byte, allowing a possible 2x space savings on top of the already 4x space savings with int8. In all, storing int4 with bit compression is 8x smaller than float32 . Figure 1: This shows the reduction in bytes required with int4 which allows an 8x reduction in size from float32 when compressed. int4 also has some benefits when it comes to scoring latency. Since the values are known to be between 0-15 , we can take advantage of knowing exactly when to worry about value overflow and optimize the dot-product calculation. The maximum value for a dot product is 15*15=225 which can fit in a single byte. ARM processors (like my macbook) have a SIMD instruction length of 128 bits (16 bytes). This means that for a Java short we can allocate 8 values to fill the lanes. 
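To make the packing step concrete, here is a hedged Python sketch of packing two int4 values into a single byte with bit shifts, as described above. The nibble order is an assumption; Lucene's actual layout and SIMD code may differ.

```python
def pack_int4(values: list[int]) -> bytes:
    """Pack two 4-bit values (0..15) per byte, low nibble first. Illustrative only."""
    assert len(values) % 2 == 0 and all(0 <= v <= 15 for v in values)
    return bytes([values[i] | (values[i + 1] << 4) for i in range(0, len(values), 2)])

def unpack_int4(packed: bytes) -> list[int]:
    """Reverse the packing above."""
    out = []
    for b in packed:
        out.append(b & 0x0F)         # low nibble
        out.append((b >> 4) & 0x0F)  # high nibble
    return out

# Each product of two int4 values is at most 15 * 15 = 225, which is why the
# per-lane accumulation described next fits comfortably in a 16-bit value.
```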
For 1024 dimensions, each lane will end up accumulating a total of 1024/8=128 multiplications that have a max value of 225. The resulting maximum sum of 28800 fits well within the limit of Java's short value, and we can iterate over more values at a time than we otherwise could. Here is some simplified code of what this looks like for ARM. Calculating the quantization error correction For a more detailed explanation of the error correction calculation and its derivation, please see error correcting the scalar dot-product. Here is a short summary, woefully (or joyfully) devoid of complicated mathematics. For every quantized vector stored, we additionally keep track of a quantization error correction. Back in the Scalar Quantization 101 blog there was a particular constant mentioned: $\\alpha \\times int8_i \\times min$. This constant is derived from basic algebra. However, we now include additional information in the stored float that relates to the rounding loss: $$\\sum_{i=0}^{dim-1} ((i - min) - i'\\times\\alpha)\\, i'\\times\\alpha$$ where $i$ is each floating point vector dimension, $i'$ is the scalar quantized floating point value, and $\\alpha=\\frac{max - min}{(1 << bits) - 1}$. This has two consequences. The first is intuitive: for a given set of quantization buckets, we are slightly more accurate because we account for some of the lossiness of the quantization. The second consequence is a bit more nuanced: we now have an error correction measure that is impacted by the quantization bucketing, which implies that it can be optimized. Finding the optimal bucketing for int4 quantization The naive and simple way to do scalar quantization can get you pretty far. Usually, you pick a confidence interval from which you calculate the allowed extreme boundaries for vector values. The default in Lucene, and consequently Elasticsearch, is $1-1/(dimensions+1)$. Figure 2 shows the confidence interval over some sampled CohereV3 embeddings. Figure 3 shows the same vectors, but scalar quantized with that statically set confidence interval. Figure 2: A sampling of CohereV3 dimension values. Figure 3: CohereV3 dimension values quantized into int7 values. What are those spikes at the end? Well, that is the result of truncating extreme values during the quantization process. But we are leaving some nice optimizations on the floor. What if we could tweak the confidence interval to shift the buckets, allowing the more important dimensional values to have higher fidelity? To optimize, Lucene does the following: Sample around 1,000 vectors from the data set and calculate their true nearest 10 neighbors. Calculate a set of candidate upper and lower quantiles. The set is calculated by using two different confidence intervals: $1 - 1/(dimensions+1)$ and $1-(dimensions/10)/(dimensions + 1)$. These intervals are on the opposite extremes. For example, vectors with 1024 dimensions would search quantile candidates between confidence intervals 0.99902 and 0.90009. 
Do a grid search over a subset of the quantiles that exist between these two confidence intervals. The grid search finds the quantiles that maximize the coefficient of determination of the quantization score errors vs. the true 10 nearest neighbors calculated earlier. Figure 3: Lucene searching the confidence interval space and testing various buckets for int4 quantization. Figure 4: The best int4 quantization buckets found for this CohereV3 sample set. For a more complete explanation of the optimization process and the mathematics behind this optimization, see optimizing the truncation interval . Speed vs. size for quantization As I mentioned before, int4 gives you an interesting tradeoff between performance and space. To drive this point home, here are some memory requirements for CohereV3 500k vectors. Figure 5: Memory requirements for CohereV3 500k vectors. Of course, we see the typical 4x reduction in regular scalar quantization, but then additional 2x reduction with int4 . Moving the required memory from 2GB to less than 300MB . Keep in mind, this is with compression enabled. Decompressing and compressing bytes does have an overhead at search time. For every byte vector, we must decompress them before doing the int4 comparisons. Consequently, when this is introduced in Elasticsearch, we want to give users the ability to choose to compress or not. For some users, the cheaper memory requirements are just too good to pass up, for others, their focus is speed. Int4 gives the opportunity to tune your settings to fit your use-case. Figure 6: HNSW graph search speed comparison for CohereV3 500k vectors. Speed part 2: more SIMD in int4 Figure 6 is a bit disappointing in terms of the speed of compressed scalar quantization. We expect performance benefits from loading fewer bytes to the JVM heap. However, this is being outweighed by the cost of decompressing them. This caused us dig deeper. The reason for the performance impact was naively decompressing the bytes separately from the dot-product comparison. This is a mistake. We can do better. Consequently, we can use SIMD to decompress the bytes and compare them in the same function. This is a bit more complicated than the previous SIMD example, but it is possible. Here is a simplified version of what this looks like for ARM. As expected, this has a significant improvement on ARM. Effectively removing all performance discrepancies on ARM between compressed and uncompressed scalar quantization. Figure 7: HNSW graph search comparison with int4 quantized vectors over 500k Coherev3 vectors. This is on ARM architecture. The end? Over the last two large and technical blog posts, we've gone over the math and intuition around the optimizations and what they bring to Lucene. It's been a long ride, and we are nowhere near done. Keep an eye out for these capabilities in a future Elasticsearch version! Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
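Since the simplified ARM code referenced in the int4 post above is not preserved in this extract, here is a hedged, scalar Python stand-in for the fused unpack-and-compare idea: the nibbles are unpacked and multiplied in the same pass rather than decompressing the whole vector first. The real implementation uses SIMD lanes and may order nibbles differently.

```python
def packed_int4_dot_product(a_packed: bytes, b_packed: bytes) -> int:
    """Scalar stand-in for the fused decompression and dot-product loop."""
    total = 0
    for byte_a, byte_b in zip(a_packed, b_packed):
        # Unpack and multiply both nibbles in one pass instead of
        # materializing a decompressed copy of the vector first.
        total += (byte_a & 0x0F) * (byte_b & 0x0F)
        total += ((byte_a >> 4) & 0x0F) * ((byte_b >> 4) & 0x0F)
    return total
```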
TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Jump to Introduction to Int4 quantization in Lucene How does Storing and scoring the quantized vectors Calculating the quantization error correction Finding the optimal bucketing for int4 quantization Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Understanding Int4 scalar quantization in Lucene - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/int4-scalar-quantization-in-lucene","meta_description":"This blog explains how int4 quantization works in Lucene, how it lines up, and the benefits of using int4 quantization."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog NEST lifetime extended & Elastic.Clients.Elasticsearch (v8) Roadmap Announcing the extension of the NEST (v7) lifetime and providing a high level overview of the Elastic.Clients.Elasticsearch (v8) roadmap. Developer Experience .NET FB By: Florian Bernd On October 15, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Extension of the NEST lifetime At Elastic, we typically offer support for our Elasticsearch products for two entire major series. So when version 9 is released, everything related to version 7 hits end-of-life. But sometimes there's a reason to deviate from this policy, and we've chosen to do just that for our version 7 client for .NET: NEST. We've been watching the downloads, and listening to user feedback, and have seen and heard that it's going to take a little longer for many of you to migrate onto the version 8 .NET client. For that reason, we'll be extending the support period of NEST right up to the end of 2025. This means that you're absolutely fine to keep using NEST for a while longer, and you don't need to rush your migration to the version 8 client. 
We'll also be working on our migration docs and code examples to help you as much as we can with the transition. Although we want to keep the lights on for longer, we are still working towards the end of life for NEST. So we won't be adding any new features to the old client, nor will we add explicit support for any new Elasticsearch APIs, or upgrade NEST to work with new versions of the .NET framework or Elasticsearch ( sticking with the NEST client prevents migration to Elasticsearch 9.x ). But we will monitor the library for bug fixes and security patches, and we'll make sure that NEST keeps working the way it always has. Let's dive into the roadmap in a bit more depth... Elastic.Clients.Elasticsearch (v8) roadmap In the meantime, we are working hard to not only improve the actual client, but also the documentation. A few of the most important points on our roadmap in this regard are as follows: Improvement of the getting started guide Providing as many code examples as possible Publishing up-to-date auto-generated reference documentation for each version In addition to the planned improvements to the documentation, several usability optimizations are also on the agenda. These include, for example: Re-implementation of the && , || and ! operators for the logical combination of queries (also for descriptor syntax) Simpler sorting Extended support for the default index Conclusion In conclusion, Elastic's extension of NEST support until the end of 2025 provides users with ample time to transition smoothly to the new client, ensuring continued stability and assistance throughout the process. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote Jump to Extension of the NEST lifetime Elastic.Clients.Elasticsearch (v8) roadmap Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"NEST lifetime extended & Elastic.Clients.Elasticsearch (v8) Roadmap - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/nest-lifetime-extension-v8-roadmap","meta_description":"Covering the extension of the Elastic NEST (v7) lifetime and providing a high level overview of the Elastic.Clients.Elasticsearch (v8) roadmap."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Scalar quantization optimized for vector databases Optimizing scalar quantization for the vector database use case allows us to achieve significantly better performance for the same retrieval quality at high compression ratios. ML Research TV BT By: Thomas Veasey and Benjamin Trent On April 25, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction We talked before about the scalar quantization work we've been doing in Lucene. In this two part blog we will dig a little deeper into how we're optimizing scalar quantization for the vector database use case. This has allowed us to unlock very nice performance gains for int4 quantization in Lucene, as we'll discuss in the second part. In the first part we're going to dive into the details of how we did this. Feel free to jump ahead to learn about what this will actually mean for you. Otherwise, buckle your seatbelts, because Kansas is going bye-bye! Scalar quantization recap First of all, a quick refresher on scalar quantization. Scalar quantization was introduced as a mechanism for accelerating inference. To date it has been mainly studied in that setting. For that reason, the main considerations were the accuracy of the model output and the performance gains that come from reduced memory pressure and accelerated matrix multiplication (GEMM) operations. Vector retrieval has some slightly different characteristics which we can exploit to improve quantization accuracy for a given compression ratio. The basic idea of scalar quantization is to truncate and scale each floating point component of a vector (or tensor) so it can be represented by an integer. Formally, if you use b b b bits to represent a vector component x x x as an integer in the interval [ 0 , 2 b − 1 ] [0,2^b-1] [ 0 , 2 b − 1 ] you transform it as follows x ↦ ⌊ ( 2 b − 1 ) clamp ( x , a , b ) b − a ⌉ x \\mapsto \\left\\lfloor \\frac{(2^b-1)\\text{clamp}(x,a,b)}{b-a} \\right\\rceil x ↦ ⌊ b − a ( 2 b − 1 ) clamp ( x , a , b ) ​ ⌉ where clamp ( ⋅ , a , b ) \\text{clamp}(\\cdot,a,b) clamp ( ⋅ , a , b ) denotes min ⁡ ( max ⁡ ( ⋅ , a ) , b ) \\min(\\max(\\cdot, a),b) min ( max ( ⋅ , a ) , b ) and ⌊ ⋅ ⌉ \\lfloor\\cdot\\rceil ⌊ ⋅ ⌉ denotes round to the nearest integer. People typically choose a a a and b b b based on percentiles of the distribution. We will discuss a better approach later. 
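As a quick illustration of the transform just described, here is a hedged NumPy sketch. The component is shifted by the lower bound a before scaling so that the result lands in the range 0 to 2^b - 1; the constants and function name are illustrative.

```python
import numpy as np

def scalar_quantize(x: np.ndarray, a: float, b: float, bits: int = 8) -> np.ndarray:
    """Clamp each component to [a, b], shift by the lower bound, scale to the
    integer range and round to the nearest integer. Illustrative sketch only."""
    levels = (1 << bits) - 1                  # 2^b - 1 quantization buckets
    clamped = np.clip(x, a, b)
    return np.rint(levels * (clamped - a) / (b - a)).astype(np.uint8)
```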
If you use int4 or 4 bit quantization then each component is some integer in the interval [0,15], that is each component takes one of only 16 distinct values! Novelties introduced to scalar quantization In this part, we are going to describe in detail two specific novelties we have introduced: A first order correction to the dot product when it is computed using integer values. An optimization procedure for the free parameters used to compute the quantized vector. Just to pause on point 1 for a second. What we will show is that we can continue to compute the dot product directly using integer arithmetic . At the same time, we can compute an additive correction that allows us to improve its accuracy. So we can improve retrieval quality without losing the opportunity of using extremely optimized implementations of the integer dot product. This translates to a clear cut win in terms of retrieval quality as a function of performance. 1. Error correcting the scalar dot product Most embedding models use either cosine or dot product similarity. The good thing is if you normalize your vectors then cosine (and even Euclidean) is equivalent to dot product (up to order). Therefore, reducing the quantization error in the dot product covers the great majority of use cases. This will be our focus. The vector database use case is as follows. There is a large collection of floating point vectors from some black box embedding model. We want a quantization scheme which achieves the best possible recall when retrieving with query vectors which come from a similar distribution. Recall is the proportion of true nearest neighbors, those with maximum dot product computed with float vectors, which we retrieve computing similarity with quantized vectors. We assume we've been given the truncation interval [ a , b ] [a,b] [ a , b ] for now. All vector components are snapped into this interval in order to compute their quantized values exactly as we discussed before. In the next part we will discuss how to optimize this interval for the vector database use case. To movitate the following analysis, consider that for any given document vector, if we knew the query vector ahead of time , we could compute the quantization error in the dot product exactly and simply subtract it. Clearly, this is not realistic since, apart from anything else, the query vector is not fixed. However, maybe we can achieve a real improvement by assuming that the query vector is drawn from a distribution that is centered around the document vector. This is plausible since queries that match a document are likely to be in the vicinity of its embedding. In the following, we formalize this intuition and actually derive a correction term. We first study the error that scalar quantization introduces into the dot product. We then devise a correction based on the expected first order error in the vicinity of each indexed vector. To do this requires us to store one extra float per vector. Since realistic vector dimensions are large this results in minimal overhead. We will call an arbitrary vector in our database x \\mathbf{x} x and an arbitrary query vector y \\mathbf{y} y . 
Then $$\\mathbf{x}^t\\mathbf{y} = (\\mathbf{a}+\\mathbf{x}-\\mathbf{a})^t(\\mathbf{a}+\\mathbf{y}-\\mathbf{a}) = \\mathbf{a}^t\\mathbf{a}+\\mathbf{a}^t(\\mathbf{y}-\\mathbf{a})+\\mathbf{a}^t(\\mathbf{x}-\\mathbf{a})+(\\mathbf{y}-\\mathbf{a})^t(\\mathbf{x}-\\mathbf{a})$$ On the right hand side, the first term is a constant and the second two terms are functions of a single vector and can be precomputed. For the one involving the document, this is an extra float that can be stored with its vector representation. So far all our calculations can use floating point arithmetic. Everything interesting, however, is happening in the last term, which depends on the interaction between the query and the document. We just need one more bit of notation: define $\\alpha = \\frac{b-a}{2^b-1}$ and $\\star_q=\\frac{\\text{clamp}(\\star, \\mathbf{a}, \\mathbf{b})}{\\alpha}$, where we understand that the clamp function broadcasts over vector components. Let's rewrite the last term, still keeping everything in floating point, using a similar trick: $$(\\mathbf{y}-\\mathbf{a})^t(\\mathbf{x}-\\mathbf{a}) = (\\alpha \\mathbf{y}_q + \\mathbf{y} -\\mathbf{a} - \\alpha\\mathbf{y}_q)^t(\\alpha \\mathbf{x}_q + \\mathbf{x} -\\mathbf{a} - \\alpha\\mathbf{x}_q) = \\alpha^2\\mathbf{y}_q^t\\mathbf{x}_q - \\alpha\\mathbf{y}_q^t\\mathbf{\\epsilon}_x + \\alpha\\mathbf{x}_q^t\\mathbf{\\epsilon}_y + \\text{O}(\\|\\mathbf{\\epsilon}\\|^2)$$ Here, $\\mathbf{\\epsilon}_{\\star} = \\star - \\mathbf{a} - \\alpha\\star_q$ represents the quantization error. The first term is just the scaled quantized vector dot product and can be computed exactly. The last term is proportional in magnitude to the square of the quantization error, and we hope it will be somewhat small compared to the overall dot product. That leaves us with the terms that are linear in the quantization error. We can compute the quantization error vectors in the query and document, $\\mathbf{\\epsilon}_y$ and $\\mathbf{\\epsilon}_x$ respectively, ahead of time. However, we don't actually know the value of $\\mathbf{x}$ we will be comparing to a given $\\mathbf{y}$ and vice versa, so we don't know how to calculate the error in the dot product exactly. In such cases it is natural to try to minimize the error in expectation (in a sense we discuss below). If $\\mathbf{x}$ and $\\mathbf{y}$ are drawn at random from our corpus they are random variables, and so too are $\\mathbf{x}_q$ and $\\mathbf{y}_q$. 
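To make the decomposition above concrete, here is a hedged sketch of how a rank-equivalent corrected score could be assembled from per-document pieces. The variable names are illustrative, and the per-document correction term is the one this section derives; treat this as a sketch rather than the reference implementation.

```python
import numpy as np

def corrected_dot_product(yq: np.ndarray, xq: np.ndarray,
                          alpha: float,
                          x_precomputed: float, x_correction: float) -> float:
    """Assemble a rank-equivalent approximation of x^t y from stored pieces.

    x_precomputed = a^t a + a^t (x - a), stored per document.
    x_correction  = alpha * xq^t eps_x, the linear correction stored per document.
    The a^t (y - a) term is constant per query and the quadratic error term is
    dropped, so neither affects document ordering.
    """
    integer_dot = float(np.dot(yq, xq))   # can be computed in integer arithmetic
    return alpha * alpha * integer_dot + x_precomputed + x_correction
```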
For any distribution we average over for $\\mathbf{x}_q$, $\\mathbb{E}_x[\\alpha\\mathbf{x}_q^t \\mathbf{\\epsilon}_y]=\\alpha\\mathbb{E}_x[\\mathbf{x}_q]^t \\mathbf{\\epsilon}_y$, since $\\alpha$ and $\\mathbf{\\epsilon}_y$ are fixed for a query. This is a constant additive term to the score of each document, which means it does not change their order. This is important as it will not change the quality of retrieval and so we can drop it altogether. What about the $\\alpha\\mathbf{y}_q^t \\mathbf{\\epsilon}_x$ term? The naive thing is to assume that $\\mathbf{y}_q$ is a random sample from our corpus. However, this is not the best assumption. In practice, we know that the queries which actually match a document will come from some region in the vicinity of its embedding, as we illustrate in the figure below. Schematic query distribution that is expected to match a given document embedding (orange) vs all queries (blue plus orange). We can efficiently find nearest neighbors of each document $\\mathbf{x}$ in the database once we have a proximity graph. However, we can do something even simpler and assume that for relevant queries $\\mathbb{E}_y[\\mathbf{y}_q] \\approx \\mathbf{x}_q$. This yields a scalar correction $\\alpha\\mathbf{x}_q^t\\mathbf{\\epsilon}_x$ which only depends on the document embedding and can be precomputed, added on to the $\\mathbf{a}^t(\\mathbf{x}-\\mathbf{a})$ term, and stored with the vector. We show later how this affects the quality of retrieval. The anisotropic correction is inspired by this approach for reducing product quantization errors. Finally, we note that the main obstacle to improving this correction is that we don't have useful estimates of the joint distribution of the query and document embedding quantization errors. One approach that might enable this, at the cost of some extra memory and compute, is to use low rank approximations of these errors. We plan to study schemes like this since we believe they could unlock accurate general purpose 2 or even 1 bit scalar quantization. 2. Optimizing the truncation interval So far we have worked with some specified interval $[a,b]$ but haven't discussed how to compute it. In the context of quantization for model inference, people tend to use quantile points of the component distribution or its minimum and maximum. Here, we discuss a new method for computing this interval based on the idea that preserving the order of the dot product values is better suited to the vector database use case. First off, why is this not equivalent to minimizing the magnitude of the quantization errors? Suppose for a query $\\mathbf{y}$ the top-k matches are $\\mathbf{x}_i$ for $i \\in [k]$. Consider two possibilities: the quantization error is some constant $c$, or it is normally distributed with mean $0$ and standard deviation $\\frac{c}{10}$. In the second case the expected error is roughly 10 times smaller than the first. However, the first effect is a constant shift, which preserves order and has no impact on recall. 
Meanwhile, if $\\frac{1}{k}\\sum_{i=1}^k \\left|\\mathbf{y}^t(\\mathbf{x}_i-\\mathbf{x}_{i+1})\\right| < \\frac{c}{10}$ it is very likely the random error will reorder matches and so affect the quality of retrieval. Let's use the previous example to better motivate our approach. The figure below shows the various quantities at play for a sample query $\\mathbf{y}$ and two documents $\\mathbf{x}_i$ and $\\mathbf{x}_{i+1}$. The area of each blue shaded rectangle is equal to one of the floating point dot products and the area of each red shaded rectangle is equal to one of the quantization errors. Specifically, the dot products are $\\|\\mathbf{y}\\|\\|P_y\\mathbf{x}_i\\|$ and $\\|\\mathbf{y}\\|\\|P_y\\mathbf{x}_{i+1}\\|$, and the quantization errors are $\\|\\mathbf{y}\\|\\|P_y(\\mathbf{x}_i-\\mathbf{a}-\\mathbf{x}_{i,q})\\|$ and $\\|\\mathbf{y}\\|\\|P_y(\\mathbf{x}_{i+1}-\\mathbf{a}-\\mathbf{x}_{i+1,q})\\|$, where $P_y=\\frac{\\mathbf{y}\\mathbf{y}^t}{\\|\\mathbf{y}\\|^2}$ is the projection onto the query vector. In this example the errors preserve the document order. This follows because the right blue rectangle (representing the exact dot product) and the union of the right blue and red rectangles (representing the quantized dot product) are both larger than the left ones. It is visually clear that the more similar the left and right red rectangles, the less likely it is the documents will be reordered. Conversely, the more similar the left and right blue rectangles, the more likely it is that quantization will reorder them. Schematic of the dot product values and quantization error values for a query and two near neighbor documents. In this case, the document order is preserved by quantization. One way to think of the quantized dot product is that it models the floating point dot product. From the previous discussion we want to minimize the variance of this model's residual error, which should be as similar as possible for each document. However, there is a second consideration: the density of the floating point dot product values. If these values are close together it is much more likely that quantization will reorder them. It is quite possible for this density to change from one part of the embedding space to another, and higher density regions are more sensitive to quantization errors. A natural measure which captures both the quantization error variance and the density of the dot product values is the coefficient of determination of the quantized dot product with respect to the floating point dot product. A good interval $[a,b]$ will maximize this in expectation over a representative query distribution. We need a reasonable estimator for this quantity for the database as a whole that we can compute efficiently. We found the following recipe is both fast and yields an excellent choice for parameters $a$ and $b$: Sample 1000 random document vectors from the index. For each sample vector find its 10 nearest neighbors. 
Maximize the average coefficient of determination of the quantized dot product between the sampled vectors and their nearest neighbors with respect to the interval [ a , b ] [a,b] [ a , b ] . This optimization problem can be solved by any black box solver. For example, we used a variant of the adaptive LIPO algorithm in the following. Furthermore, we found that our optimization objective was well behaved (low Lipschitz constant ) for all data sets we tested. Proof of principle for int4 quantization Before deciding to implement this scheme for real we studied how it behaves with int4 quantization. Below we show results for two data sets that are fairly typical of passage embedding model distributions on real data. To generate these we use e5-small-v2 and Cohere's multilingual-22-12 models. These are both fairly state-of-the-art text embedding models. However, they have rather different characteristics. The e5-small-v2 model uses cosine similarity, its vectors have 384 dimensions and very low angular variation. The multilingual-22-12 model uses dot product similarity, its vectors have 768 dimensions and it encodes information in their length. They pose rather different challenges for our quantization scheme and improving both gives much more confidence it works generally. For e5-small-v2 we embedded around 500K passages and 10K queries sampled from the MS MARCO passage data set. For multilingual-22-12 we used around 1M passages and 1K distinct passages for queries sampled from the English Wikipedia data set. First of all, it is interesting to understand the accuracy of the int4 dot product values. The figure below shows the int4 dot product values compared to their float values for a random sample of 100 documents and their 10 nearest neighbors taken from the set we use to compute the optimal truncation interval for e5-small-v2. The orange “best fit” line is y = x − 0.017 y=x-0.017 y = x − 0.017 . Note that this underlines the fact that this procedure can pick a biased estimator if it reduces the residual variance: in this case the quantized dot product is systematically underestimating the true dot product. However, as we discussed before, any constant shift in the dot product is irrelevant for ranking. For the full 1k samples we achieve an R 2 R^2 R 2 of a little less than 0.995, i.e. the int4 quantized dot product is a very good model of the float dot product! Comparison of int4 dot product values to the corresponding float values for a random sample of 100 documents and their 10 nearest neighbors. While this is reassuring, what we really care about is the impact on retrieval quality. Since one can implement brute force nearest neighbor search in a few lines, it allows us to quickly test the impact of our design choices on retrieval. In particular, we are interested in understanding the expected proportion of true nearest neighbors we retrieve when computing similarities using int4 quantized vectors. Below we show the results for an ablation study of the dot product correction and interval optimization. In general, one can boost the accuracy of any quantization scheme by gathering more than the requested vector count and reranking them using their floating point values. However, this comes with a cost: it is significantly more expensive to search graphs for more matches and the floating point vectors must be loaded from disk or it defeats the purpose of compressing them. 
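For illustration, here is a hedged sketch of the interval-search recipe listed above: quantize with a candidate interval [a, b], score the sampled vectors against their true 10 nearest neighbors, and report the average coefficient of determination that a black-box optimizer would maximize. The helper names are hypothetical, and this is not the adaptive LIPO variant used in practice.

```python
import numpy as np

def r_squared(pred: np.ndarray, truth: np.ndarray) -> float:
    ss_res = np.sum((truth - pred) ** 2)
    ss_tot = np.sum((truth - truth.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def interval_objective(vectors: np.ndarray, neighbors: np.ndarray,
                       a: float, b: float, bits: int = 4) -> float:
    """Average R^2 of quantized vs. float dot products for sampled vectors and
    their true nearest neighbors (illustrative sketch)."""
    alpha = (b - a) / ((1 << bits) - 1)

    def quant(m: np.ndarray) -> np.ndarray:
        return np.rint((np.clip(m, a, b) - a) / alpha)

    scores = []
    for v, nns in zip(vectors, neighbors):     # neighbors: per-vector 10-NN matrix
        truth = nns @ v
        pred = (alpha ** 2) * (quant(nns) @ quant(v))
        scores.append(r_squared(pred, truth))
    return float(np.mean(scores))

# A grid or black-box search over candidate (a, b) pairs between the two
# confidence intervals described above would then pick the maximizer.
```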
One way of comparing alternatives is therefore to understand how many vectors must be reranked to achieve the same recall. The lower this number is, the better. In the figures below we show average recall curves as a function of the number of candidates we rerank for different combinations of the two improvements we have discussed: “Baseline” sets [ a , b ] [a,b] [ a , b ] to the 1 − 1 d + 1 1-\\frac{1}{d+1} 1 − d + 1 1 ​ central confidence and applies no correction to the dot product, “No correction” optimizes the interval [ a , b ] [a,b] [ a , b ] by maximizing R 2 R^2 R 2 but applies no correction to the dot product, “No optimization” sets [ a , b ] [a,b] [ a , b ] to the 1 − 1 d + 1 1-\\frac{1}{d+1} 1 − d + 1 1 ​ central confidence but applies the linear correction to the dot product, and “Our scheme” optimizes the interval [ a , b ] [a,b] [ a , b ] by maximizing R 2 R^2 R 2 and applies the linear correction to the dot product. Note that we used d d d to denote the vector dimension. Average recall@10 curves for e5-small-v2 embeddings as a function of the number of candidates reranked for different combinations of the two improvements we discussed. Average recall@10 curves for multilingual-22-12 embeddings as a function of the number of candidates reranked for different combinations of the two improvements we discussed. For e5-small-v2 embeddings we roughly halve the number of vectors we need to rerank to achieve 95% recall compared to the baseline. For multilingual-22-12 embeddings we reduce it by closer to a factor of three. Interestingly, the impact of the two improvements is different for the different data sets. For e5-small-v2 embeddings applying the linear correction has a significantly larger effect than optimizing the interval [ a , b ] [a,b] [ a , b ] whilst the converse is true for multilingual-22-12 embeddings. Another important observation is the gains are more significant if one wants to achieve very high recall: to achieve close to 99% recall one has to rerank at least 5 times as many vectors for both data sets in the baseline versus our improved quantization scheme. Conclusion We have discussed the theoretical and empirical motivation behind two novelties we introduced to achieve high quality int4 quantization, as well as some preliminary results that indicate it'll be an effective general purpose scheme for in memory vector storage for retrieval. This is all well and good, but how well does it work in a real vector database implementation? In the companion blog we discuss our implementation in Lucene and compare it to other storage options such as floating point and int7, which Lucene also provides. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. 
TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Introduction Scalar quantization recap Novelties introduced to scalar quantization 1. Error correcting the scalar dot product 2. Optimizing the truncation interval Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Scalar quantization optimized for vector databases - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vector-db-optimized-scalar-quantization","meta_description":"Optimizing scalar quantization for the vector database use case allows us to achieve significantly better performance for the same retrieval quality at high compression ratios."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Bringing maximum-inner-product into Lucene Explore how we brought maximum-inner-product into Lucene and the investigations undertaken to ensure its support. Lucene BT By: Benjamin Trent On September 1, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Currently Lucene restricts dot_product to be only used over normalized vectors. Normalization forces all vector magnitudes to equal one. While for many cases this is acceptable, it can cause relevancy issues for certain data sets. A prime example are embeddings built by Cohere . Their vectors use magnitudes to provide more relevant information. So, why not allow non-normalized vectors in dot-product and thus enable maximum-inner-product? What's the big deal? Negative values and Lucene optimizations Lucene requires non-negative scores, so that matching one more clause in a disjunctive query can only make the score greater, not lower. This is actually important for dynamic pruning optimizations such as block-max WAND , whose efficiency is largely defeated if some clauses may produce negative scores. How does this requirement affect non-normalized vectors? In the normalized case, all vectors are on a unit sphere. This allows handling negative scores to be simple scaling. 
Figure 1: Two opposite, two dimensional vectors in a 2d unit sphere (e.g. a unit circle). When calculating the dot-product here, the worst it can be is -1 = [1, 0] * [-1, 0]. Lucene accounts for this by adding 1 to the result. With vectors retaining their magnitude, the range of possible values is unknown. Figure 2: When calculating the dot-product for these vectors [2, 2] \\* [-5, -5] = -20 To allow Lucene to utilize blockMax WAND with non-normalized vectors, we must scale the scores. This is a fairly simple solution. Lucene will scale non-normalize vectors with a simple piecewise function: Now all negative scores are between 0-1, and all positives are scaled above 1. This still ensures that higher values mean better matches and removes negative scores. Simple enough, but this is not the final hurdle. The triangle problem Maximum-inner-product doesn't follow the same rules as of simple euclidean spaces . The simple assumed knowledge of the triangle inequality is abandoned. Unintuitively, a vector is no longer nearest to itself. This can be troubling. Lucene’s underlying index structure for vectors is Hierarchical Navigable Small World (HNSW). This being a graph based algorithm, it might rely on euclidean space assumptions. Or would exploring the graph be too slow in non-euclidean space? Some research has indicated that a transformation into euclidean space is required for fast search . Others have gone through the trouble of updating their vector storage enforcing transformations into euclidean space. This caused us to pause and dig deep into some data. The key question is this: does HNSW provide good recall and latency with maximum-inner-product search? While the original HNSW paper and other published research indicate that it does, we needed to do our due diligence. Experiments and results: Maximum-inner-product in Lucene The experiments we ran were simple. All of the experiments are over real data sets or slightly modified real data sets. This is vital for benchmarking as modern neural networks create vectors that adhere to specific characteristics ( see discussion in section 7.8 of this paper ). We measured latency (in milliseconds) vs. recall over non-normalized vectors. Comparing the numbers with the same measurements but with a euclidean space transformation. In each case, the vectors were indexed into Lucene’s HNSW implementation and we measured for 1000 iterations of queries. Three individual cases were considered for each dataset: data inserted ordered by magnitude (lesser to greater), data inserted in a random order, and data inserted in reverse order (greater to lesser). Here are some results from real datasets from Cohere: Figure 3: Here are results for the Cohere’s Multilingual model embedding wikipedia articles. Available on HuggingFace . The first 100k documents were indexed and tested. Figure 4: This is a mixture of Cohere’s English and Japanese embeddings over wikipedia. Both datasets are available on HuggingFace. We also tested against some synthetic datasets to ensure our rigor. We created a data set with e5-small-v2 and scaled the vector's magnitudes by different statistical distributions. For brevity, I will only show two distributions. Figure 5: Pareto distribution of magnitudes. A pareto distribution has a “fat tail” meaning there is a portion of the distribution with a much larger magnitude than others. Figure 6: Gamma distribution of magnitudes. This distribution can have high variance and makes it unique in our experiments. 
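Before turning to the results, it may help to see the score scaling described earlier written out. The sketch below is an illustrative Python version of a piecewise function with the required properties: negative dot products are mapped into (0, 1), non-negative ones are shifted above 1, and ordering is preserved throughout, so a higher dot product always yields a higher score.

def scaled_max_inner_product_score(dot: float) -> float:
    # Negative dot products land in (0, 1); 1 / (1 - dot) is increasing in dot,
    # so less negative values still score higher.
    if dot < 0:
        return 1.0 / (1.0 - dot)
    # Non-negative dot products are shifted above 1, preserving order.
    return dot + 1.0

For example, the dot product of -20 from Figure 2 would score 1 / 21 ≈ 0.048, while a dot product of 2 would score 3.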
In all our experiments, the only time where the transformation seemed warranted was the synthetic dataset created with the gamma distribution. Even then, the vectors must be inserted in reverse order, largest magnitudes first, to justify the transformation. These are exceptional cases. If you want to read about all the experiments, and about all the mistakes and improvements along the way, here is the Lucene Github issue with all the details (and mistakes along the way). Here’s one for open research and development! Conclusion This has been quite a journey requiring many investigations to make sure maximum-inner-product can be supported in Lucene. We believe the data speaks for itself. No significant transformations required or significant changes to Lucene. All this work will soon unlock maximum-inner-product support with Elasticsearch and allow models like the ones provided by Cohere to be first class citizens in the Elastic Stack. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Negative values and Lucene optimizations The triangle problem Experiments and results: Maximum-inner-product in Lucene Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Bringing maximum-inner-product into Lucene - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/lucene-bringing-maximum-inner-product-to-lucene","meta_description":"Explore how we brought maximum-inner-product into Lucene and the investigations undertaken to ensure its support."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch geospatial search with ES|QL Geospatial search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. Python How To CT By: Craig Taverner On August 12, 2024 Part of Series Elasticsearch geospatial search Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch has had powerful geospatial search and analytics capabilities for many years, but the API was quite different from what typical GIS users were used to. In the past year we've added the ES|QL query language , a piped query language as easy, or even easier, than SQL. It's particularly suited to the search, security, and observability use cases Elastic excels at. We're also adding support for geospatial search and analytics within ES|QL, making it far easier to use, especially for users coming from SQL or GIS communities. Elasticsearch 8.12 and 8.13 brought basic support for geospatial types to ES|QL. This was dramatically enhanced with the addition of geospatial search capabilities in 8.14. More importantly, this support was designed to conform closely to the Simple Feature Access standard from the Open Geospatial Consortium (OGC) used by other spatial databases like PostGIS, making it much easier to use for GIS experts familiar with these standards. In this blog, we'll show you how to use ES|QL to perform geospatial searches, and how it compares to the SQL and Query DSL equivalents. We'll also show you how to use ES|QL to perform spatial joins, and how to visualize the results in Kibana Maps. Note that all the features described here are in \"technical preview\", and we'd love to hear your feedback on how we can improve them. Searching for geospatial data Let's start with an example query: This performs a search for any city boundary polygons that intersect with a rectangular search polygon around the Sanya Phoenix International Airport (SYX). In a sample dataset of airports, cities and city boundaries, this search finds the intersecting polygon and returns the desired fields from the matching document: abbrev airport region city city_location SYX Sanya Phoenix Int'l 天涯区 Sanya POINT(109.5036 18.2533) That was easy! Now compare this to the classic Elasticsearch Query DSL for the same query: Both queries are reasonably clear in their intent, but the ES|QL query closely resembles SQL. The same query in PostGIS looks like this: Look back at the ES|QL example. So similar, right? We've found that existing users of the Elasticsearch API find ES|QL much easier to use. We now expect that existing SQL users, particularly Spatial SQL users, will find that ES|QL feels very familiar to what they are used to seeing. Why not SQL? What about Elasticsearch SQL? It has been around for a while and has some geospatial features. 
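(For reference, the kind of ES|QL search shown at the start of this post can be sketched in a few lines. The polygon coordinates around the SYX airport, the index name, and the connection details below are illustrative, and the query is submitted through the ES|QL query API via the Python client, whose helper name may vary between client versions.)

from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # placeholder connection details

esql = """
FROM airport_city_boundaries
| WHERE ST_INTERSECTS(city_boundary,
    TO_GEOSHAPE("POLYGON((109.4 18.2, 109.6 18.2, 109.6 18.3, 109.4 18.3, 109.4 18.2))"))
| KEEP abbrev, airport, region, city, city_location
"""

# POST /_query under the hood.
response = client.esql.query(query=esql)
print(response["columns"])
print(response["values"])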
However, Elasticsearch SQL was written as a wrapper on top of the original Query API, which meant only queries that could be transpiled down to the original API were supported. ES|QL does not have this limitation. Being a completely new stack allows for many optimizations that were not possible in SQL. Our benchmarks show ES|QL is very often faster than the Query API , particularly with aggregations! Differences to SQL Clearly, from the previous example, ES|QL is somewhat similar to SQL, but there are some important differences. For example, ES|QL is a piped query language, starting with a source command like FROM and then chaining all subsequent commands together with the pipe | character. This makes it very easy to understand how each command receives a table of data and performs some action on that table, such as filtering with WHERE , adding columns with EVAL , or performing aggregations with STATS . Rather than starting with SELECT to define the final output columns, there can be one or more KEEP commands, with the last one specifying the final output results. This structure simplifies reasoning about the query. Focusing in on the WHERE command in the above example, we can see it looks quite similar to the PostGIS example: ES|QL PostGIS Aside from the difference in string quotation characters, the biggest difference is in how we type-cast the string to a spatial type. In PostGIS, we use the ::geometry suffix, while in ES|QL, we use the ::geo_shape suffix. This is because ES|QL runs within Elasticsearch, and the type-casting operator :: can be used to convert a string to any of the supported ES|QL types , in this case, a geo_shape . Additionally, the geo_shape and geo_point types in Elasticsearch imply the spatial coordinate system known as WGS84, more commonly referred to using the SRID number 4326. In PostGIS, this needs to be explicit, hence the use of the SRID=4326; prefix to the WKT string. If that prefix is removed, the SRID will be set to 0, which is more like the Elasticsearch types cartesian_point and cartesian_shape , which are not tied to any specific coordinate system. Both ES|QL and PostGIS provide type conversion function syntax as well: ES|QL PostGIS OGC functions Elasticsearch 8.14 introduces the following four OGC spatial search functions: ES|QL PostGIS Description ST_INTERSECTS ST_Intersects Returns true if two geometries intersect, and false otherwise. ST_DISJOINT ST_Disjoint Returns true if two geometries do not intersect, and false otherwise. The inverse of ST_INTERSECTS. ST_CONTAINS ST_Contains Returns true if one geometry contains another, and false otherwise. ST_WITHIN ST_Within Returns true if one geometry is within another, and false otherwise. The inverse of ST_CONTAINS. These function behave similarly to their PostGIS counterparts, and are used in the same way. For example, ST_INTERSECTS returns true if two geometries intersect and false otherwise. If you follow the documentation links in the above table, you might notice that all the ES|QL examples are within a WHERE clause after a FROM clause, while all the PostGIS examples are using literal geometries. In fact, both platforms support using the functions in any part of the query where they make sense. The first example in the PostGIS documentation for ST_INTERSECTS is: The ES|QL equivalent of this would be: Note how we did not specify the SRID in the PostGIS example. 
This is because in PostGIS when using the geometry type, all calculations are done on a planar coordinate system, and so if both geometries have the same SRID, it does not matter what the SRID is. In Elasticsearch, this is also true for most functions, however, there are exceptions where geo_shape and geo_point use spherical calculations, as we'll see in the next blog about spatial distance search. ES|QL versatility So, we've seen examples above for using spatial functions in WHERE clauses, and in ROW commands. Where else would they make sense? One very useful place is in the EVAL command. This command allows you to evaluate an expression and return the result. For example, let's determine if the centroids of all airports grouped by their country names are within a boundary outlining the country: The results are expected, the centroid of UK airports are within the UK boundary, and not within the Iceland boundary, and vice versa: centroid count in_uk in_iceland within_uk within_iceland POINT (-21.946634463965893 64.13187285885215) 1 false true false true POINT (-2.597342072712148 54.33551226578214) 17 true false true false POINT (0.04453958108176276 23.74658354606057) 873 false false false false In fact, these functions can be used in any part of the query where their signature makes sense. They all take two arguments, which are either a literal spatial object or a field of a spatial type, and they all return a boolean value. One important consideration is that the coordinate reference system (CRS) of the geometries must match, or an error will be returned. This means you cannot mix geo_shape and cartesian_shape types in the same function call. You can, however, mix geo_point and geo_shape types, as the geo_point type is a special case of the geo_shape type, and both share the same coordinate reference system. The documentation for each of the functions defined above lists the supported type combinations. Additionally, either argument can be a spatial literal or a field, in either order. You can even specify two fields, two literals, a field and a literal, or a literal and a field. The only requirement is that the types are compatible. For example, this query compares two fields in the same index: The query basically asks if the city location is within the city boundary, which should generally be true, but there are always exceptions: cardinality count in_city few 29 false many 740 true A far more interesting question would be whether the airport location is within the boundary of the city that the airport serves. However, the airport location resides in a different index than the one containing the city boundaries. This requires a method to effectively query and correlate data from these two separate indexes. Spatial joins ES|QL does not support JOIN commands, but you can achieve a special case of a join using the ENRICH command , which behaves similarly to a 'left join' in SQL. This command operates akin to a 'left join' in SQL, allowing you to enrich results from one index with data from another index based on a spatial relationship between the two datasets. 
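In outline, such a spatial join has two parts: a geo_match enrich policy that points at the index holding the boundaries, and an ENRICH clause in the ES|QL query itself. The sketch below, using the Python client, is illustrative; the policy, index, and field names follow the airports example discussed next, and the exact client helper signatures may differ slightly between versions.

from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # placeholder connection details

# 1. Define and execute a geo_match enrich policy over the boundaries index.
client.enrich.put_policy(
    name="city_boundaries",
    geo_match={
        "indices": "airport_city_boundaries",
        "match_field": "city_boundary",
        "enrich_fields": ["city", "region", "city_boundary"],
    },
)
client.enrich.execute_policy(name="city_boundaries")

# 2. ENRICH joins each airport's city_location against a matching city_boundary.
esql = """
FROM airports
| ENRICH city_boundaries ON city_location WITH region, city_boundary
| STATS count = COUNT(*) BY region
| SORT count DESC
| LIMIT 5
"""
response = client.esql.query(query=esql)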
For example, let's enrich the results from a table of airports with additional information about the city they serve by finding the city boundary that contains the airport location, and then perform some statistics on the results: This returns the top 5 regions with the most airports, along with the centroid of all the airports that have matching regions, and the range in length of the WKT representation of the city boundaries within those regions: centroid count min_wkt max_wkt region POINT (-32.56093470960719 32.598117914802714) 90 207 207 null POINT (-73.94515332765877 40.70366442203522) 9 438 438 City of New York POINT (-83.10398317873478 42.300230911932886) 9 473 473 Detroit POINT (-156.3020245861262 20.176383580081165) 5 307 803 Hawaii POINT (-73.88902732171118 45.57078813901171) 4 837 837 Montréal So, what really happened here? Where did the supposed JOIN occur? The crux of the query lies in the ENRICH command: This command instructs Elasticsearch to enrich the results retrieved from the airports index, and perform an intersects join between the city_location field of the original index, and the city_boundary field of the airport_city_boundaries index, which we used in a few examples earlier. But some of this information is not clearly visible in this query. What we do see is the name of an enrich policy city_boundaries , and the missing information is encapsulated within that policy definition. Here we can see that it will perform a geo_match query ( intersects is the default), the field to match against is city_boundary , and the enrich_fields are the fields we want to add to the original document. One of those fields, the region was actually used as the grouping key for the STATS command, something we could not have done without this 'left join' capability. For more information on enrich policies, see the enrich documentation . While reading those documents, you will notice that they describe using the enrich indexes for enriching data at index time, by configuring ingest pipelines. This is not required for ES|QL, as the ENRICH command works at query time. It is sufficient to prepare the enrich index with the necessary data and enrich policy, and then use the ENRICH command in your ES|QL queries. You may also notice that the most commonly found region was null . What could this imply? Recall that I likened this command to a 'left join' in SQL, meaning if no matching city boundary is found for an airport, the airport is still returned but with null values for the fields from the airport_city_boundaries index. It turns out there were 89 airports that found no matching city_boundary , and one airport with a match where the region field was null . This lead to a count of 90 airports with no region in the results. Another interesting detail is the need for the MV_EXPAND command. This is necessary because the ENRICH command may return multiple results for each input row, and MV_EXPAND helps to separate these results into multiple rows, one for each outcome. This also clarifies why \"Hawaii\" shows different min_wkt and max_wkt results: there were multiple regions with the same name but different boundaries. Kibana Maps Kibana has added support for Spatial ES|QL in the Maps application. This means that you can now use ES|QL to search for geospatial data in Elasticsearch, and visualize the results on a map. There is a new layer option in the add layers menu, called \"ES|QL\". Like all of the geospatial features described so far, this is in \"technical preview\". 
Selecting this option allows you to add a layer to the map based on the results of an ES|QL query. For example, you could add a layer to the map that shows all the airports in the world. Or you could add a layer that shows the polygons from the airport_city_boundaries index, or even better, how about that complex ENRICH query above that generates statistics for how many airports are in each region? What's next You might have noticed in two of the examples above we squeezed in yet another spatial function ST_CENTROID_AGG . This is an aggregating function used in the STATS command, and the first of many spatial analytics features we plan to add to ES|QL. We'll blog about it when we've got more to show! Before that, we want to tell you more about a particularly exciting feature we've worked on: the ability to perform spatial distance searches, one of the most used spatial search features of Elasticsearch. Can you imagine what the syntax for distance searches might look like? Perhaps similar to an OGC function? Stay tuned for the next blog in this series to find out! Spoiler alert: Elasticsearch 8.15 has just been released, and spatial distance search with ES|QL is included! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Searching for geospatial data Why not SQL? Differences to SQL OGC functions ES|QL versatility Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch geospatial search with ES|QL - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-geospatial-search-part-one","meta_description":"Learn how to perform geospatial searches with Elasticsearch's ES|QL. 
Explore geospatial queries, spatial joins & visualize results in Kibana."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch heap size usage and JVM garbage collection Exploring Elasticsearch heap size usage and JVM garbage collection, including best practices and how to resolve issues when heap memory usage is too high or when JVM performance is not optimal. How To KB By: Kofi Bartlett On April 22, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. The heap size is the amount of RAM allocated to the Java Virtual Machine of an Elasticsearch node. As of version 7.11, Elasticsearch by default automatically sets the JVM heap size based on a node’s roles and total memory. Using the default sizing is recommended for most production environments. However, if you want to manually set your JVM heap size, as a general rule you should set -Xms and -Xmx to the SAME value, which should be 50% of your total available RAM subject to a maximum of (approximately) 31GB. A higher heap size will give your node more memory for indexing and search operations. However, your node also requires memory for caching, so using 50% maintains a healthy balance between the two. For this same reason in production you should avoid using other memory intensive processes on the same node as Elasticsearch. Typically, the heap usage will follow a saw tooth pattern, oscillating between around 30 and 70% of the maximum heap being used. This is because the JVM steadily increases heap usage percentage until the garbage collection process frees up memory again. High heap usage occurs when the garbage collection process cannot keep up. An indicator of high heap usage is when the garbage collection is incapable of reducing the heap usage to around 30%. In the image above, you can see a normal sawtooth of JVM heap. You will also see that there are two types of garbage collections, young and old GC. In a healthy JVM, garbage collection should ideally meet the following conditions: Young GC is processed quickly (within 50 ms). Young GC is not frequently executed (about 10 seconds). Old GC is processed quickly (within 1 second). Old GC is not frequently executed (once per 10 minutes or more). How to resolve when heap memory usage is too high or when JVM performance is not optimal There can be a variety of reasons why heap memory usage can increase: Oversharding Please see the document on oversharding here . Large aggregation sizes In order to avoid large aggregation sizes, keep the number of aggregation buckets (size) in your queries to a minimum. You can use slow query logging (slow logs) and implement it on a specific index using the following. Queries that take a long time to return results are likely to be the resource-intensive ones. Excessive bulk index size If you are sending large requests, then this can be a cause of high heap consumption. Try reducing the size of the bulk index requests. Mapping issues In particular, if you use “fielddata: true” then this can be a major user of your JVM heap. 
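Whichever of these causes applies, a useful first step is to confirm what the heap and garbage collector are actually doing. The sketch below pulls the relevant numbers from the nodes stats API with the Python client (connection details are placeholders) and prints heap usage plus young and old GC counts and times per node, which you can compare against the healthy ranges listed above.

from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # placeholder connection details

stats = client.nodes.stats(metric="jvm")
for node_id, node in stats["nodes"].items():
    heap = node["jvm"]["mem"]
    gc = node["jvm"]["gc"]["collectors"]
    print(
        node["name"],
        f"heap used: {heap['heap_used_percent']}%",
        f"young GC: {gc['young']['collection_count']} collections, {gc['young']['collection_time_in_millis']} ms",
        f"old GC: {gc['old']['collection_count']} collections, {gc['old']['collection_time_in_millis']} ms",
    )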
Heap size incorrectly set The heap size can be manually defined by: Setting the environment variable: Editing the jvm.options file in your Elasticsearch configuration directory: The environmental variable setting takes priority over the file setting. It is necessary to restart the node for the setting to be taken into account. JVM new ratio incorrectly set It is generally NOT necessary to set this, since Elasticsearch sets this value by default. This parameter defines the ratio of space available for “new generation” and “old generation” objects in the JVM. If you see that old GC is becoming very frequent, you can try specifically setting this value in jvm.options file in your Elasticsearch config directory. What are the best practices for managing heap size usage and JVM garbage collection in a large Elasticsearch cluster? The best practices for managing heap size usage and JVM garbage collection in a large Elasticsearch cluster are to ensure that the heap size is set to a maximum of 50% of the available RAM, and that the JVM garbage collection settings are optimized for the specific use case. It is important to monitor the heap size and garbage collection metrics to ensure that the cluster is running optimally. Specifically, it is important to monitor the JVM heap size, garbage collection time, and garbage collection pauses. Additionally, it is important to monitor the number of garbage collection cycles and the amount of time spent in garbage collection. By monitoring these metrics, it is possible to identify any potential issues with the heap size or garbage collection settings and take corrective action if necessary. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to How to resolve when heap memory usage is too high or when JVM performance is not optimal Oversharding Large aggregation sizes Excessive bulk index size Mapping issues Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch heap size usage and JVM garbage collection - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-heap-size-jvm-garbage-collection","meta_description":"Exploring Elasticsearch heap size usage and JVM garbage collection, including best practices and how to resolve issues when heap memory usage is too high or when JVM performance is not optimal."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Building multilingual RAG with Elastic and Mistral Building a multilingual RAG application using Elastic and Mixtral 8x22B model Integrations Generative AI Vector Database Python How To GL By: Gustavo Llermaly On August 2, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Mixtral 8x22B is the most performant open model, and one of its most powerful features is fluency in many languages; including English, Spanish, French, Italian, and German. Imagine a multinational company with support tickets and solutions in different languages and wants to take advantage of that knowledge across divisions. Currently, knowledge is limited to the language the agent speaks. Let's fix that! In this article, I’m going to show you how to test Mixtral’s language capabilities, by creating a multilingual RAG system. You can follow the notebook to reproduce this article's example here Steps Creating embeddings endpoint Creating mappings Indexing data Asking questions Creating embeddings endpoint Our support tickets for this example will come in English, Spanish, and German. The Mistral embeddings model is not multilingual, but we can generate multilingual embeddings using the e5 model, so we can index text on different languages and manage it as a single source, giving us a much richer context. To create e5 multilingual embeddings you can use Kibana: Or the _inference API: Creating Mappings For the mappings we will use semantic_text mapping type, which is one of my favorite features. It handles the process of chunking the data, generating embeddings, and querying embeddings for you! We call the text field super_body because with a single mapping type it will handle chunks and embeddings. Indexing data We will index a couple of support tickets with problems and solutions in two languages, and then ask a question about problems within many documents in a third. The following documents will be added to the index: 1. English Support Ticket: Calendar Sync Issue Support Ticket #EN1234 Subject : Calendar sync not working with Google Calendar Description : I'm having trouble syncing my project deadlines with Google Calendar. 
Whenever I try to sync, I get an error message saying \"Unable to connect to external calendar service.\" Resolution : The issue was resolved by following these steps: Go to Settings > Integrations Disconnect the Google Calendar integration Clear browser cache and cookies Reconnect the Google Calendar integration Authorize the app again in Google's security settings The sync should now work correctly. If problems persist, ensure that third-party cookies are enabled in your browser settings. 2. German Support Ticket: File Upload Problem Support-Ticket #DE5678 Betreff : Datei-Upload funktioniert nicht Beschreibung : Ich kann keine Dateien mehr in meine Projekte hochladen. Jedes Mal, wenn ich es versuche, bleibt der Ladebalken bei 99% stehen und dann erscheint eine Fehlermeldung. Lösung : Das Problem wurde durch folgende Schritte gelöst: Überprüfen Sie die Dateigröße. Die maximale Uploadgröße beträgt 100 MB. Deaktivieren Sie vorübergehend den Virenschutz oder die Firewall. Versuchen Sie, die Datei im Inkognito-Modus hochzuladen. Wenn das nicht funktioniert, leeren Sie den Browser-Cache und die Cookies. Als letzten Ausweg, versuchen Sie einen anderen Browser zu verwenden. In den meisten Fällen lag das Problem an zu großen Dateien oder an Interferenzen durch Sicherheitssoftware. Nach Anwendung dieser Schritte sollte der Upload funktionieren. 3. Marketing Campaign Ideas (noise) Q3 Marketing Campaign Ideas Social media contest: \"Share Your Productivity Hack\" Users share tips using our software, best entry wins a premium subscription. Webinar series: \"Mastering Project Management\" Invite industry experts to share insights using our tool. Email campaign: \"Unlock Hidden Features\" Series of emails highlighting lesser-known but powerful features. Partner with a productivity podcast for sponsored content. Create a \"Project Management Memes\" social media account for lighter, shareable content. 4. Mitarbeiter des Monats (noise) Mitarbeiter des Monats: Juli 2023 Wir freuen uns, bekannt zu geben, dass Sarah Schmidt zur Mitarbeiterin des Monats Juli gewählt wurde! Sarah hat außergewöhnliche Leistungen in folgenden Bereichen gezeigt: Kundenbetreuung: Sarah hat durchschnittlich 95% positive Bewertungen erhalten. Teamarbeit: Sie hat maßgeblich zur Verbesserung unseres internen Wissensmanagementsystems beigetragen. Innovation: Sarah hat eine neue Methode zur Priorisierung von Support-Tickets vorgeschlagen, die unsere Reaktionszeiten um 20% verbessert hat. Bitte gratulieren Sie Sarah zu dieser wohlverdienten Anerkennung! This is how a document will look like inside Elasticsearch: Asking questions Now, we are going to ask a question in Spanish: Hola, estoy teniendo problemas para ocupar su aplicación, estoy teniendo problemas para sincronizar mi calendario, y encima al intentar subir un archivo me da error. The expectation is retrieving documents #1 and #2, then sending them to the LLM as additional context, and finally, getting an answer in Spanish. Retrieving documents To retrieve the relevant documents, we can use this nice and short query that will run a search on the embeddings, and return the support tickets most relevant to the question. Notes about the parameters set: size: 2 Because we know we want the top 2 documents. excludes For clarity in the response. Documents are short so each one will be one chunk long. Answering the question Now we can call the Mistral completion API using the Python library to answer the question. The answer is in perfect Spanish and on point! 
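For reference, the retrieval step can be sketched as follows: a semantic query against the semantic_text field, limited to the top two documents. The index name below is illustrative, and the notebook additionally excludes the embedding internals from _source for readability; the retrieved tickets are then passed to the Mistral completion API as context alongside the question.

from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # placeholder connection details

question = (
    "Hola, estoy teniendo problemas para ocupar su aplicación, estoy teniendo "
    "problemas para sincronizar mi calendario, y encima al intentar subir un "
    "archivo me da error."
)

response = client.search(
    index="support-tickets",   # illustrative index name
    size=2,                    # we only want the top 2 tickets
    query={"semantic": {"field": "super_body", "query": question}},
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"])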
Show answer Hola, me alegra que te hayas comunicado con nosotros. Parece que hay dos problemas distintos. En cuanto a la sincronización del calendario, puedes seguir estos pasos para resolver el problema: Ve a Configuración > Integraciones Desconecta la integración del Calendario de Google Borra la caché y las cookies del navegador Vuelve a conectar la integración del Calendario de Google Autoriza de nuevo la aplicación en la configuración de seguridad de Google Si sigues teniendo problemas, asegúrate de que las cookies de terceros están habilitadas en la configuración de tu navegador. En cuanto al problema de subir un archivo, hay varias cosas que puedes probar: Comprueba el tamaño del archivo. El tamaño máximo de carga es de 100 MB. Desactiva temporalmente el antivirus o el cortafuegos. Intenta cargar el archivo en modo incógnito. Si eso no funciona, borra la caché y las cookies del navegador. Como último recurso, prueba a usar un navegador diferente. En la mayoría de los casos, el problema se debe a archivos demasiado grandes o a interferencias causadas por software de seguridad. Al seguir estos pasos, deberías poder cargar el archivo correctamente. ¡Espero que esto te ayude a resolver tus problemas! Si tienes alguna otra pregunta, no dudes en preguntar. Conclusion Mixtral 8x22B is a powerful model that enables us to leverage data sources in different languages, being able to answer, understand, and translate in many languages. This ability– together with multilingual embeddings– allows you to have multilingual support both in the data retrieval and the answer generation stages, removing language barriers entirely. If you are interested on reproducing the examples of this article, you can find the Python Notebook with the requests here Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Creating embeddings endpoint Creating Mappings Indexing data Asking questions Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Building multilingual RAG with Elastic and Mistral - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/building-multilingual-rag-with-elastic-and-mistral","meta_description":"Learn how to build a multilingual RAG application using Mixtral 8x22B model and Elastic."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Dataset translation with LangChain, Python & Vector Database for multilingual insights Learn how to translate a dataset from one language to another and use Elastic's vector database capabilities to gain more insights. Generative AI Vector Database Python How To JG By: Jessica Garson On September 10, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Translating a dataset from one language to another can be a powerful tool. You can gain insights into a dataset you previously might not have been able to, such as detecting new patterns or trends. Using LangChain , you can take a dataset and translate it into the language of your choice. After your dataset has been translated, you can use Elastic’s vector database to gain insight. This blog post will walk you through how to load data into a DataFrame using Pandas , translate the data from one language to another using LangChain , load the translated data into Elasticsearch, and use Elastic’s vector database capabilities to learn more about your dataset. The full code for this example can be found on the Search Labs GitHub repository . Setting up your environment for dataset translation Configure an environment variable for your OpenAI API Key First, you will want to configure an environment variable for your OpenAI API Key, which you can find on the API keys page in OpenAI's developer portal . You will need this API Key to work with LangChain. You can find more information on getting started in the LangChain quick start guide . Mac/Unix: Windows: Set up Elasticsearch This demo uses Elasticsearch version 8.15, but you can use any version of Elasticsearch that is higher than 8.0. If you are new, check out our Quick Start on Elasticsearch and the documentation on the integration between LangChain and Elasticsearch. Python version The version of Python that is used is Python 3.12.1 but you can use any version of Python higher than 3.9. Install the required packages The packages you will be working with are as follows: Jupyter Notebooks to work with the dataset interactively. nest_asyncio for asynchronous execution for processing your dataset. pandas for data manipulation and cleaning of the dataset used. To integrate natural language processing capabilities, you will use the LangChain library. To work with Elasticsearch, you will use Elasticsearch Python Client to connect to Elasticsearch. 
The langchain-elasticsearch package allows for an extra level of interaction between LangChain and Elasticsearch. You will need to install the TikToken package, which is used under the hood to break text into manageable pieces for efficient further processing. The datasets package will allow you to easily work with a dataset from Hugging Face. You can run the following command in your terminal to install the required packages for this blog post. Dataset The dataset used is a collection of news articles in Spanish , known as the DACSA corpus. Below is sample of what the dataset looks like: You will need to authenticate with Hugging Face to use this dataset. You first will need to create a token . The huggingface-cli is installed when you install the datasets package in the first step. After doing so, you can log in from the command line as follows: If you don't have a token already you will be prompted to create one from the command line interface. Loading a Jupyter notebook You will want to load a Jupyter Notebook to work with your data interactively. To do so, you can run the following command in your terminal. In the right-hand corner, you can select where it says “New” to create a new Jupyter Notebook. Translate dataset column from Spanish to English The code in this section will first load data from a dataset into a Pandas DataFrame and create a subset of the dataset that contains only 25 records. Once your dataset is ready, you can set up a role to allow your model to act as a translator and create an event loop that will translate a column of your dataset from Spanish to English. The subset that is being used is only 25 records to avoid hitting OpenAI’s rate limits. You may need to use batch loading if you are using a larger dataset. Import packages In your Jupyter Notebook, you will first want to import the following packages, including asyncio, which allows you to use async functions, and openai to work with models from OpenAI. Additionally, you will want to import the following packages, which also include getpass to keep your secrets secure and functools, which will help create an event loop to translate your dataset. In this code sample, you will create an event loop, which will allow you to translate many rows of a dataset at once using nest_asyncio . Event loops are a core construct of asyncio , they run within a thread and will execute all tasks inside of a thread. Before you can create an event loop you first need to run the following line of code. Loading in your dataset You can create a variable called ds, which loads in the ELiRF/dacsa dataset in Spanish. Later in this blog post, you will translate this dataset. This dataset contains articles in Catalan and Spanish, but for this blog post, you will only use the records in Spanish. The output will show the different datasets available, what columns each has, and how many rows. Now that the data is loaded, you can translate it into a Pandas DataFrame to make it easier to work with. Since this dataset contains almost 2000 rows, you can create a sample of the dataset to make it smaller for the purposes of this blog post. You can now view the first 5 rows of the dataset to ensure everything has been properly loaded. The output should look something like this: Translating dataset from one language to another You will want to create an async function to create an event loop, allowing you to translate the data seamlessly. 
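A condensed sketch of that translation helper is shown below. It is illustrative rather than a copy of the notebook (the prompt wording, column name, and gpt-4o model choice are assumptions), but it captures the pattern: an async OpenAI client, one coroutine per row, and asyncio.gather to fan the requests out inside the notebook's event loop.

import asyncio

import nest_asyncio
import pandas as pd
from openai import AsyncOpenAI

nest_asyncio.apply()     # allow running an event loop inside the notebook
client = AsyncOpenAI()   # reads OPENAI_API_KEY from the environment

async def translate(text: str) -> str:
    # One chat completion per row: a system role framing the model as a
    # translator, and a user message carrying the Spanish text.
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are a translator. Translate the user's text from Spanish to English."},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

async def translate_column(df: pd.DataFrame, column: str) -> list[str]:
    return await asyncio.gather(*(translate(text) for text in df[column]))

# Example: df["translated_summary"] = asyncio.run(translate_column(df, "summary"))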
Since you will be using GPT-4o, you will want to set a role to tell your model to act like a translator and another to give directions to translate your data from Spanish to English. You will translate the data from two specified columns of the dataset and add new columns with the translated data back to the original dataset. Finally, you can run the event loop and translate the specified column of your dataset. This dataset will now have columns entitled “translated summary” and “translated article” that contains the translation of the summaries and articles loaded. To confirm your data has been translated you can run the head.() method again. You will now see a new column called translated_summary containing the translation of the summary and another column entitled translated_article containing the translations of the articles from Spanish to English. Loading the translated articles into a vector database and searching A vector database allows you to find similar data quickly. It stores vector embeddings, a type of vector data representation that converts words, sentences, and other data into numbers that capture their meaning and relationships. In this section, you will learn how to load data into an Elasticsearch vector database, and perform searches on your newly translated dataset. Authenticate to Elasticsearch Now, you can use the Elasticsearch Python client to establish a secure connection to Elasticsearch. You will want to pass in your Elasticsearch host and port, and API key. Create an index Before you can load your data into a vector database, you must create an index. You will first create a variable to name your index. From there, check to see if an index exists. If one already does exist, it will delete your index and allow you to create a new index without error. Adding embeddings Embeddings leverage a machine learning model to translate text into numbers, allowing you to perform vector searches. You must also set up your index to be used as a vector database. At this point, you will want to set the embedding variable to OpenAI Embeddings . You will also want to specify the model used as text-embedding-3-large . Loading data At this point, you will want to load your translated data into a Python list, allowing you to load the data into the vector database. You can use the LangChain library to turn characters into text and, from there, load the data into a vector database. I chose this method because its ability to handle long documents by splitting text into smaller chunks helps manage memory and processing power efficiently, and the fact that you control how text is split. Performing searches You can now ask questions about the data, such as, \"What happened in Spain?\" You will now be able to get results from your dataset similar to your question. The output you get back should look something like this: You can find the complete output for this query here . Since you are using kNN by default, if you change the value of k, which is the number of global nearest neighbors to retrieve, you will return more values. The output should look similar to the following: You can check out the complete output for this query if needed. You can also adjust the num_candidates field to the number of approximate nearest neighbor candidates on each shard. You can check out our blog post on the subject for more information. If you are looking to tune this a bit more, you may want to check out our documentation on tuning an approximate KNN search. 
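Pulling the retrieval pieces of this section together, a minimal sketch looks like the following. The splitting parameters, index name, and connection details are placeholders rather than the notebook's exact values, the import path for the text splitter can vary between LangChain versions, and df is the DataFrame holding the translated articles.

from langchain_elasticsearch import ElasticsearchStore
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import CharacterTextSplitter

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")

# Split the translated articles into chunks and index them as vectors.
splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
docs = splitter.create_documents(df["translated_article"].tolist())

store = ElasticsearchStore.from_documents(
    docs,
    embedding=embeddings,
    es_url="http://localhost:9200",   # placeholder connection details
    es_api_key="<api-key>",
    index_name="translated-articles",
)

# kNN search over the embeddings; raise k to retrieve more neighbours.
for doc in store.similarity_search("What happened in Spain?", k=5):
    print(doc.page_content[:200])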
Conclusion This is just the start of how you can utilize Elastic’s vector database capabilities. To learn more about what’s available be sure to check out our resource on the subject . By leveraging LangChain, and Elastic’s vector database capabilities, you can draw insights from a dataset that may contain a language you are not familiar with. In this article, we were able to ask questions regarding specific locations mentioned in the text and receive responses in the translated English text. To dig in deeper on vector database capabilities you may want to check out our tutorial . Additionally you can find another tutorial on working with multilingual datasets on Search Labs as well as this post which walks you through how to build multilingual RAG with Elastic and Mistral. The full code for this example can be found on the Search Labs GitHub repository. Let us know if you built anything based on this blog or if you have questions on our forums and the community Slack channel. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to Setting up your environment for dataset translation Configure an environment variable for your OpenAI API Key Set up Elasticsearch Python version Install the required packages Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Dataset translation with LangChain, Python & Vector Database for multilingual insights - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/dataset-translation-langchain-python-elastic","meta_description":"Learn how to translate datasets from one language to another using LangChain & Python. Then, use Elastic’s vector database to uncover multilingual insights."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to migrate data between different versions of Elasticsearch & between clusters Exploring methods for transferring data between Elasticsearch versions and clusters. How To KB By: Kofi Bartlett On April 14, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. When you want to upgrade an Elasticsearch cluster, it is sometimes easier to create a new, separate cluster and transfer data from the old cluster to the new one. This affords users the advantage of being able to test all of their data and configurations on the new cluster with all of their applications without any risk of downtime or data loss. The disadvantages of that approach are that it requires some duplication of hardware and could create difficulties when trying to smoothly transfer and synchronize all of the data. It may also be necessary to carry out a similar procedure if you need to migrate applications from one data center to another. In this article, we will discuss and detail three ways to transfer data between Elasticsearch clusters. How to migrate data between Elasticsearch clusters? There are 3 ways to transfer data between Elasticsearch clusters: Reindexing from a remote cluster Transferring data using snapshots Transferring data using Logstash Using snapshots is usually the quickest and most reliable way to transfer data. However, bear in mind that you can only restore a snapshot onto a cluster of an equal or higher version and never with a difference of over one major version. That means you can restore a 6.x snapshot onto a 7.x cluster but not an 8.x cluster. If you need to increase by more than one major version, you will need to reindex or use Logstash. Now, let’s look in detail at each of the three options for transferring data between Elasticsearch clusters. 1. Reindexing data from a remote cluster Before starting to reindex, remember that you will need to set up appropriate mappings for all of the indices on the new cluster. To do that, you must either create the indices directly with the appropriate mappings or use index templates. Reindexing from remote — configuration required In order to reindex from remote, you should add the configuration below to the elasticseearch.yml file for the cluster that is receiving the data, which, in Linux systems, is usually located here: /etc/elasticsearch/elasticsearch.yml. The configuration to add is as follows: If you are using SSL, you should add the CA certificate to each node and include the following in the command for each node in elasticsearch.yml: Alternatively, you can add the line below to all Elasticsearch nodes in order to disable SSL verification. 
However, that approach is less recommended since it is not as secure as the previous option: You will need to make these modifications on every node and carry out a rolling restart. For more information on how to do that, please see our guide . Reindexing command After you have defined the remote host in the elasticsearch.yml file and added the SSL certificates if necessary, you can start reindexing data with the command below: While doing that, you may face timeout errors, so it may be useful to establish generous values for timeouts rather than relying on defaults. Now, let’s take a look at some other common errors that you may encounter when reindexing from remote. Common errors when reindexing from remote 1. Reindexing not whitelisted If you encounter this error, it shows that you did not define the remote host IP address or node name DNS in Elasticsearch as described above or forgot to restart Elasticsearch services. To fix that for the Elasticsearch cluster, you need to add the remote host to all Elasticsearch nodes and restart Elasticsearch services. 2. SSL handshake exception This error means that you forgot to add the reindex.ssl.certificate_authorities to elasticsearch.yml as described above. To add it: 2. Transferring data using snapshots Remember, as mentioned above, you can only restore a snapshot onto a cluster of an equal or higher version and never with a difference of over one major version If you need to increase by more than one major version, you will need to reindex or use Logstash. The following steps are required to transfer data via snapshots: Step 1. Adding the repository plugin to the first Elasticsearch cluster – In order to transfer data between clusters via snapshots, you need to ensure that the repository is accessible from both the new and the old clusters. Cloud storage repositories such as AWS, Google, and Azure are generally ideal for this. To take snapshots, please see our guide and follow the steps it describes. Step 2. Restart Elasticsearch service (rolling restart). Step 3. Create a repository for the first Elasticsearch cluster. Step 4- Add the repository plugin to the second Elasticsearch cluster. Step 5- Add repository as read only to second Elasticsearch cluster – You will need to add a repository by repeating the same steps that you took to create the first Elasticsearch cluster. Important note: When connecting the second Elasticsearch cluster to the same AWS S3 repository, you should define the repository as a read-only repository: That is important because you want to prevent the risk of mixing Elasticsearch versions inside the same snapshot repository. Step 6- Restoring data to the second Elasticsearch cluster – After taking the above steps, you can restore data and transfer it to the new cluster. Please follow the steps described in this article to restore data to the new cluster. 3. Transferring data using Logstash Before starting to transfer the data with logstash, remember that you will need to set up appropriate mappings for all of the indices on the new cluster. To do that, you will need to either create the indices directly or use index templates. To transfer data between two Elasticsearch clusters, you can set up a temporary Logstash server and use it to transfer your data between two clusters. For small clusters, a 2GB ram instance should be sufficient. For larger clusters, you can use four-core CPUs with 8GB RAM. For guidance on installing Logstash, please see here . 
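To make the reindex-from-remote call described earlier in this section more concrete, here is a minimal sketch using the Elasticsearch Python client against the destination cluster; the hosts, credentials, and index names are placeholders, and it assumes the remote host has already been whitelisted in elasticsearch.yml as explained above.

```python
from elasticsearch import Elasticsearch

# Connect to the *destination* cluster, which pulls data from the old one.
es = Elasticsearch("https://new-cluster.example.com:9200", api_key="YOUR_API_KEY")

resp = es.options(request_timeout=300).reindex(  # generous client-side timeout, as suggested above
    source={
        "remote": {
            "host": "https://old-cluster.example.com:9200",  # must be listed in reindex.remote.whitelist
            "username": "elastic",
            "password": "changeme",
        },
        "index": "my-index",
    },
    dest={"index": "my-index"},
    wait_for_completion=False,  # returns a task ID so long migrations don't hit client timeouts
)
print(resp)  # contains a task ID you can poll with the tasks API
```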
Logstash configuration for transferring data from one cluster to another A basic configuration to copy a single index from cluster A to cluster B is: For secured elasticsearch, you can use the configuration below: Index metadata The above commands will write to a single named index. If you want to transfer multiple indices and preserve the index names, then you will need to add the following line to the Logstash output: Also if you want to preserve the original ID of the document, then you will need to add: Bear in mind that setting the document ID will make the data transfer significantly slower, so only preserve the original ID if you need to. Synchronization of updates All of the methods described above will take a relatively long period of time, and you might find that data in the original cluster has been updated while waiting for the process to complete. There are various strategies to enable the synchronization of any updates that may have occurred during the data transfer process, and you should give some thought to these issues before starting that process. In particular, you need to think about: What method do you have to identify any data that has been updated/added since the start of the data transfer process (e.g., a “last_update_time” field in the data)? What method can you use to transfer the last piece of data? Is there a risk of records being duplicated? Usually, there is, unless the method you are using sets the document ID during reindexing to a known value). The different methods to enable the synchronization of updates are described below. 1. Use of queueing systems Some ingestion/updating systems use queues that enable you to “replay” data modifications received in the last x days. That may provide a means to synchronize any changes carried out. 2. Reindex from remote Repeat the reindexing process for all items where “last_update_time” > x days ago. You can do this by adding a “query” parameter to the reindex request. 3. Logstash In the Logstash input, you can add a query to filter all items where “last_update_time” > x days ago. However, this process will cause duplicates in non-time-series data unless you have set the document_id. 4. Snapshots It is not possible to restore only part of an index, so you would have to use one of the other data transfer methods described above (or a script) to update any changes that have taken place since the data transfer process was carried out. However, snapshot restore is a much quicker process than reindexing/Logstash, so it may be possible to suspend updates for a brief period of time while snapshots are transferred to avoid the problem altogether. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to 1. Reindexing data from a remote cluster Reindexing from remote — configuration required Reindexing command Common errors when reindexing from remote 1. Reindexing not whitelisted Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to migrate data between different versions of Elasticsearch & between clusters - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-migrate-data-versions-clusters","meta_description":"Exploring methods for transferring data between Elasticsearch versions and clusters."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blogs Developer insights and practical how-to articles from our experts to inspire and empower your search experience Articles Series Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
TP By: Tom Potoma Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Blogs - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog","meta_description":"Blog articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. Integrations Generative AI JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta On May 20, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Spring AI is now generally available, with its first stable release 1.0 ready for you to download on Maven Central . Let’s use it right away to build a complete AI application, using your favorite LLM and our favorite vector database . What’s Spring AI? Spring AI 1.0 , a comprehensive solution for AI engineering in Java, is now available after a significant development period influenced by rapid advancements in the AI field. The release includes numerous essential new features for AI engineers. Java and Spring are in a prime spot to jump on this whole AI wave. Tons of companies are running their stuff on Spring Boot, which makes it super easy to plug AI into what they're already doing. You can basically link up your business logic and data right to those AI models without too much hassle. Spring AI provides support for various AI models and technologies , such as: Image models : generate images given text prompts. Transcription models : take audio sources and convert them to text. 
Embedding models: convert arbitrary data into vectors , which are data types optimized for semantic similarity search. Chat models: these should be familiar! You’ve no doubt even had a brief conversation with one somewhere. Chat models are where most of the fanfare seems to be in the AI space, and rightfully so, they're awesome! You can get them to help you correct a document or write a poem. (Just don’t ask them to tell a joke… yet.) They’re awesome, but they do have some issues. Spring AI solutions to AI challenges (The picture shown is used with permission from the Spring AI team lead Dr. Mark Pollack) Let's go through some of these problems and their solutions in Spring AI. Problem Solution Consistency Chat Models are open-minded and prone to distraction You can give them a system prompt to govern their overall shape and structure Memory AI models don’t have memory, so they can’t correlate one message from a given user to another You can give them a memory system to store the relevant parts of the conversation Isolation AI models live in isolated little sandboxes, but they can do really amazing things if you give them access to tools - functions that they can invoke when they deem it necessary Spring AI supports tool calling which lets you tell the AI model about tools in its environment, which it can then ask you to invoke. This multi-turn interaction is all handled transparently for you Private data AI models are smart, but they’re not omniscient! They don't know what's in your proprietary databases - nor we think would you want them to! You need to inform their responses by stuffing the prompts - basically using the all mighty string concatenation operator to put text in the request before the model looks at the question being asked. Background information, if you like. How do you decide what should be sent and what shouldn’t? Use a vector store to select only the relevant data and send it in onward. This is called retrieval augmented generation, or RAG Hallucination AI chat models like to, well, chat! And sometimes they do so so confidently that they can make stuff up You need to use evaluation - using one model to validate the output of another - to confirm reasonable results And, of course, no AI application is an island. Today modern AI systems and services work best when integrated with other systems and services. Model Context Protocol (MCP) makes it possible to connect your AI applications with other MCP-based services, regardless of what language they’re written in. You can assemble all of this in agentic workflows that drive towards a larger goal. The best part? You can do all this while building on the familiar idioms and abstractions any Spring Boot developer will have come to expect: convenient starter dependencies for basically everything are available on the Spring Initializr . Spring AI provides convenient Spring Boot autoconfigurations that give you the convention-over-configuration setup you’ve come to know and expect. And Spring AI supports observability with Spring Boot’s Actuator and the Micrometer project. It plays well with GraalVM and virtual threads, too, allowing you to build super fast and efficient AI applications that scale. Why Elasticsearch Elasticsearch is a full text search engine, you probably know that. So why are we using it for this project? Well, it’s also a vector store! And quite a good one at that, where data lives next to the full text. 
Other notable advantages: Super easy to set up Opensource Horizontally scalable Most of your organization’s free form data probably already lives in an Elasticsearch cluster Feature complete search engine capability Fully integrated in Spring AI ! Taking everything into consideration, Elasticsearch checks all the boxes for an excellent vector store, so let's set it up and start building our application! Getting started with Elasticsearch We’re going to need both Elasticsearch and Kibana, the UI console you’ll use to interact with the data hosted in the database. You can try everything on your local machine thanks to the goodness of Docker images and the Elastic.co home page . Go there, scroll down to find the curl command, run it and pipe it right into your shell: This will simply pull and configure Docker images for Elasticsearch and Kibana, and after a few minutes you’ll have them up and running on your local machine, complete with connection credentials. You’ve also got two different urls you can use to interact with your Elasticsearch instance. Do as the prompt says and point your browser to http://localhost:5601 . Note the username elastic and password printed on the console, too: you’ll need those to log in (in the example output above they’re respectively elastic and w1GB15uQ ). Pulling the app together Go to the Spring Initializr page and generate a new Spring AI project with the following dependencies: Elasticsearch Vector Store Spring Boot Actuator GraalVM OpenAI Web Make sure to choose the latest-and-greatest version of Java (ideally Java 24 - as of this writing - or later) and the build tool of your choice. We’re using Apache Maven in this example. Click Generate and then unzip the project and import it into your IDE of choice. (We’re using IntelliJ IDEA.) First things first: let’s specify your connection details for your Spring Boot application. In application.properties, write the following: We’ll also Spring AI’s vector store capability to initialize whatever’s needed on the Elasticsearch side in terms of data structures, so specify: We’re going to use OpenAI in this demo, specifically the Embedding Model and Chat Model (feel free to use the service you prefer, as long as Spring AI supports it ). The Embedding Model is needed to create embeddings of the data before we stash it into Elasticsearch. For OpenAI to work, we need to specify the API key : You can define it as an environment variable like SPRING_AI_OPENAI_API_KEY to avoid stashing the credential in your source code. We’re going to upload files, so be sure to customize how much data can be uploaded to the servlet container: We’re almost there! Before we dive into writing the code, let’s get a preview of how this is going to work. On our machine, we downloaded the following file (a list of rules for a board game), renamed it to test.pdf and put it in ~/Downloads/test.pdf . The file will be sent to the /rag/ingest endpoint (replace the path accordingly to your local setup): This might take a few seconds… Behind the scenes, the data’s being sent to OpenAI, which is creating embeddings of the data; that data is then being written to Elasticsearch, both the vectors and the original text. That data, along with all the embeddings therein, is where the magic happens. We can then query Elasticsearch using the VectorStore interface. The full flow looks like this: The HTTP client uploads your PDF of choice to the Spring application. 
Spring AI takes care of the text extraction from our PDF and chunks each page into 800 character chunks. OpenAI generates the vector representation for each chunk. Both chunked text and the embedding are then stored in Elasticsearch. Last, we’ll issue a query: And we’ll get a relevant answer: Nice! How does this all work? The HTTP client submits the question to the Spring application. Spring AI gets the vector representation of the question from OpenAI. With that embedding it searches for similar documents in the stored Elasticsearch chunks and retrieves the most similar documents. Spring AI then sends the question and retrieved context to OpenAI for generating an LLM answer. Finally, it returns the generated answer and a reference to the retrieved context. Let’s dive into the Java code to see how it really works. First of all, the Main class: it’s a stock standard main class for any ol’ Spring Boot application. Nothing to see there. Moving on… Up next, a basic HTTP controller: The controller is simply calling a service we’ve built to handle ingesting files and writing them to the Elasticsearch vector store, and then facilitating queries against that same vector store. Let’s look at the service: This code handles all the ingest: given a Spring Framework Resource , which is a container around bytes, we read the PDF data (presumed to be a .PDF file - make sure that you validate as much before accepting arbitrary inputs!) using Spring AI’s PagePdfDocumentReader and then tokenize it using Spring AI’s TokenTextSplitter , finally adding the resulting List s to the VectorStore implementation, ElasticsearchVectorStore . You can confirm as much using Kibana: after sending a file to the /rag/ingest endpoint, open up your browser to localhost:5601 and in the side menu on the left navigate to Dev Tools . There you can issue queries to interact with the data in the Elasticsearch instance. Issue a query like this: Now for the fun stuff: how do we get that data back out again in response to user queries? Here’s a first cut at an implementation of the query, in a method called directRag . The code’s fairly straightforward, but let’s break it down into multiple steps: Use the VectorStore to perform a similarity search. Given all the results, get the underlying Spring AI Document s and extract their text, concatenating them all into one result. Send the results from the VectorStore to the model, along with a prompt instructing the model what to do with them and the question from the user. Wait for the response and return it. This is RAG - retrieval augmented generation. It’s the idea that we’re using data from a vector store to inform the processing and analysis done by the model. Now that you know how to do it, let’s hope you never have to! Not like this anyway: Spring AI’s Advisors are here to simplify this process even more. Advisors allows you to pre- and post-process a request to a given model, other than providing an abstraction layer between your application and the vector store. Add the following dependency to your build: Add another method called advisedRag(String question) to the class: All the RAG-pattern logic is encapsulated in the QuestionAnswerAdvisor . Everything else is just as any request to a ChatModel would be! Nice! Conclusion In this demo, we used Docker images and did everything on our local machine, but the goal here is to build production-worthy AI systems and services. There are several things you could do to make that a reality. 
First of all, you can add Spring Boot Actuator to monitor the consumption of tokens. Tokens are a proxy for the complexity (and sometimes the dollars-and-cents) cost of a given request to the model. You’ve already got the Spring Boot Actuator on the classpath, so just specify the following properties to show all the metrics (captured by the magnificent Micrometer.io project): Restart your application. Make a query, and then go to: http://localhost:8080/actuator/metrics . Search for “ token ” and you’ll see information about the tokens being used by the application. Make sure you keep an eye on this. You can of course use Micrometer’s integration for Elasticsearch to push those metrics and have Elasticsearch act as your time series database of choice, too! You should then consider that every time we make a request to a datastore like Elasticsearch, or to OpenAI, or to other network services, we’re doing IO and - often - that IO blocks the threads on which it executes. Java 21 and later ship with non-blocking virtual threads that dramatically improve scalability. Enable it with: And, finally, you’ll want to host your application and your data in a place where it can thrive and scale. We're sure you’ve probably already thought about where to run your application, but where will you host your data? May we recommend the Elastic Cloud ? It’s secure, private, scalable, and full of features. Our favorite part? If you want, you can get the serverless edition where Elastic wears the pager, not you! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Jump to What’s Spring AI? Spring AI solutions to AI challenges Why Elasticsearch Getting started with Elasticsearch Pulling the app together Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Spring AI and Elasticsearch as your vector database - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/spring-ai-elasticsearch-application","meta_description":"Building a complete AI application using Spring AI and Elasticsearch.\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. Developer Experience Inside Elastic DT By: Drew Tate On May 22, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It’s easy for us developers to take good autocomplete for granted. It just works—until you try building it yourself. This post is about a recent rearchitecture we performed to support continued evolution in ES|QL. A little about ES|QL In case you haven’t heard, ES|QL is Elastic’s new query language. It is super powerful and we see it as the future of how AI agents, applications, and humans will talk to Elastic. So, we provide an ES|QL editing experience in several places in Kibana including the Discover and Dashboard applications. ES|QL in Discover To understand the rearchitecture, it’s key to understand a few language components. An ES|QL query consists of a series of commands chained together to perform a pipeline of operations. Here, we are joining the data from one index to another index: In the example above, FROM , LOOKUP JOIN , and SORT are the commands. Commands can have major subcomponents (call them subcommands), generally identified by a second keyword before the next pipe character (for example, METADATA in the example above). Like commands, subcommands have their own semantic rules governing what comes after the keyword. ES|QL also has functions which look like you’d expect. See AVG in the example below: Autocomplete is an important feature for enabling users to learn ES|QL. Autocomplete 1.0 Our autocomplete engine was originally built with a few defining characteristics. Declarative — Used static declarations to describe commands Generic — Relied heavily on generic logic meant to apply to most/all language contexts Reified subcommands — Treated subcommands as first-class abstractions with their own logic Within the top-level suggestion routine, our code analyzed the query, detecting the general area of the user’s cursor. It then branched into one of several subroutines, corresponding to language subcomponents. The semantics of both commands and subcommands were described declaratively using a “command signature.” This defined a pattern of things that could be used after the command name. 
It might say “accept any number of boolean expressions,” or “accept a string field and then a numeric literal.” If the first analysis identified the cursor as being within a command or subcommand, the corresponding branch would then try to match the (sub)command signature with the query and figure out what to suggest in a generic way. The cracks start to show At first, this architecture worked. Early on, commands in ES|QL were relatively uniform. They looked basically like: But, as time went on, they started to get more bespoke. A couple of issues showed up and grew with every new command. Code complexity —the autocomplete code became large, complicated, and difficult to follow. It wasn’t clear which parts of the logic applied to which commands. Lack of orthogonality —a change in the behavior in one area of the language often had side-effects in other parts of the language. For example, adding a comma suggestion to the field list in KEEP , accidentally created a comma suggestion after the field in DISSECT — which is invalid. The problem was that new syntax and behaviors led our “generic” code to need more and more command-specific branches, and our command definitions to need more and more “generic” settings (that really only applied to a single command). Gradually, the idea that we could describe the nuances of each command’s structure and behavior with a declarative interface started to look a bit idealistic. Timing the investment When is it time to invest in a refactor? The answer is very contextual. You have to weigh the upsides against the cost. Truth be told, you can generally keep paying the price of inefficiencies for quite awhile— and it can make sense. One way to stave off a refactor is by treating the symptoms. We did this for months. We treated our code complexity with verbose comments. We treated our lack of orthogonality with better test coverage and careful manual testing. But there comes a point where the cost of patching outweighs the cost of change. Ours came with the introduction of a fabulous new ES|QL feature, filtering by aggregation. The WHERE command has existed since the early days, but this new feature added the ability to use WHERE as a sub command in STATS . This may look like a small change, but it broke the architecture’s careful delineation between commands and subcommands. Now, we had a command that could also be a subcommand. With this fundamental abstraction break added to all the existing inefficiencies, we decided it was time to invest. Autocomplete 2.0 ES|QL isn’t a generic language, it is a query language. So we decided it was time to accept that commands are bespoke by design (in accordance with grand query language tradition). The new architecture needed to be flexible and adaptive and it needed to be clear what code belonged to which command. This meant a system that was: Imperative — Instead of declaring what was acceptable after the command name and separately interpreting the declaration, we write the logic to check the correctness of the command directly. Command-specific — Each command gets its own logic. There is no generic routine that is supposed to work for all the commands. In Autocomplete 1.0, the up-front triage did a lot of work. Now, it just decides whether or not the cursor is already within a command. If within a command, it delegates straight to the command-specific suggest method. The bulk of the work now happens within the command’s logic, which is given complete control over suggestions within that command. 
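The Kibana implementation is TypeScript and its internals aren't reproduced here, but a rough Python sketch of the dispatch pattern just described might look like the following; every name in it is hypothetical and only meant to illustrate "triage once, then delegate to command-specific logic".

```python
# Hypothetical sketch: each command owns its own suggestion logic.
def suggest_for_from(query: str, cursor: int) -> list[str]:
    # FROM-specific rules: suggest index names, then METADATA, etc.
    return ["my-index", "METADATA"]

def suggest_for_stats(query: str, cursor: int) -> list[str]:
    # STATS-specific rules: aggregation functions, BY, and now WHERE as a subcommand.
    return ["AVG(", "BY", "WHERE"]

COMMAND_SUGGESTERS = {
    "FROM": suggest_for_from,
    "STATS": suggest_for_stats,
    # ...one entry per command; shared helpers are called from inside these functions.
}

def find_enclosing_command(query: str, cursor: int) -> str | None:
    """Toy triage: the most recent command keyword appearing before the cursor."""
    prefix = query[:cursor].upper()
    positions = {name: prefix.rfind(name) for name in COMMAND_SUGGESTERS}
    best = max(positions, key=positions.get)
    return best if positions[best] >= 0 else None

def suggest(query: str, cursor: int) -> list[str]:
    command = find_enclosing_command(query, cursor)
    if command is None:
        return list(COMMAND_SUGGESTERS)  # not inside a command yet: offer command names
    return COMMAND_SUGGESTERS[command](query, cursor)  # delegate to command-specific logic

query = "FROM logs | STATS "
print(suggest(query, len(query)))  # delegates to the STATS suggester
```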
This doesn’t mean that commands don’t share logic. They often delegate suggestion creation and even some triage steps to reusable subroutines (for example, if the cursor is within an ES|QL function). But, they retain the flexibility to customize the behavior in any way. Giving each command its own suggestion method improves isolation and reduces side effects, while making it obvious what code applies to which command. It’s still about the user There is no question that this refactor has resulted in a better developer experience. Everyone who interacted with both systems can attest that this is a breath of fresh air. But, at the end of the day, we made this investment in service of our users. First of all, some ES|QL features couldn’t be reasonably supported without it. Our users expect quality suggestions when they are writing ES|QL. Now, we can deliver in more contexts. The old system made it easy to introduce regressions. Now, we expect fewer of these. One of our team’s biggest roles is adding support for upcoming commands. Now, we can do this much faster. The work isn’t over, but we’ve created a system that supports change instead of resisting it. With this investment, we’ve laid a solid foundation to keep the language and the editor evolving into the future, side by side. Report an issue Related content Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Jump to A little about ES|QL Autocomplete 1.0 The cracks start to show Timing the investment Autocomplete 2.0 Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"How we rebuilt autocomplete for ES|QL - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-autocomplete-rebuilt","meta_description":"How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it."} +{"text":"Tutorials Examples Integrations Blogs Start free trial AutoOps Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe AutoOps January 2, 2025 Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. VC By: Valentin Crettaz AutoOps December 18, 2024 Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. MD By: Musab Dogan AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"AutoOps - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/autoops","meta_description":"AutoOps articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. Integrations How To TP By: Tom Potoma On May 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Red Hat validated pattern frameworks use GitOps for seamless provisioning of all operators and applications on Red Hat OpenShift. The Elasticsearch vector database is now officially supported by The ‘AI Generation with LLM and RAG’ Validated Pattern . 
This allows developers to jumpstart their app development using Elastic's vector database for retrieval-augmented generation (RAG) applications on OpenShift, combining the benefits of Red Hat's container platform with Elastic's vector search capabilities. Getting started with Elastic in the Validated Pattern Let's walk through setting up the pattern with Elasticsearch as your vector database: Prerequisites Podman installed on your local system An OpenShift cluster running in AWS Your OpenShift pull secret OpenShift CLI ( oc ) installed An installation configuration file Step 1: Fork the repository Create a fork of the rag-llm-gitops repository. Step 2: Clone the forked repository Clone your forked repository and go to the root directory of the repository. Step 3: Configure and deploy Create a local copy of the secret values file: Configure the pattern to use Elasticsearch by editing the values-global.yaml file: IF NECESSARY: Configure AWS settings (if your cluster is in an unsupported region ): Add GPU nodes to your cluster: Install the pattern: The installation process automatically deploys: Pattern operator components HashiCorp Vault for secrets management Elasticsearch operator and cluster RAG application UI and backend Step 4: Verify deployment After installation completes, check that all components are running. In the OpenShift web console, go to the Workloads > Pods menu. Select the rag-llm project from the drop-down. The following pods should be up and running: Alternatively, you can check via the CLI: You should see pods including: elastic-operator - The Elasticsearch operator es-vectordb-es-default-0 - The Elasticsearch cluster ui-multiprovider-rag-redis - The RAG application UI (despite the name, it uses the configured database type, which in our case is Elastic) Step 5: Try out the application Navigate to the UI in your browser to start generating content with your RAG application backed by Elasticsearch. From any page of your OpenShift console, click on the Application Menu and select the application: Then: Select your configured LLM provider, or configure your own When configuring with OpenAI, the application appends the appropriate endpoint. So, in the ‘URL’ field, provide ‘https://api.openai.com/v1’ rather than ‘https://api.openai.com/v1/chat/completions’ Enter the ‘Product’ as ‘RedHat OpenShift AI’ Click “Generate” Watch as the Proposal is created for you in real-time So what just happened? When you deploy the pattern with Elasticsearch, here's what happens behind the scenes: The Elasticsearch operator is deployed to manage Elasticsearch resources An Elasticsearch cluster is provisioned with vector search capabilities Sample data is processed and stored as vector embeddings in Elasticsearch The RAG application is configured to connect to Elasticsearch for retrieval When you generate content, the application queries Elasticsearch to find relevant context for the LLM What's next? This initial integration showcases just the beginning of what's possible when you combine Elasticsearch vector search with OpenShift AI. 
Elastic brings rich information retrieval capabilities that make it ideal for production RAG applications, and we are considering the following for future enhancement: Advanced semantic understanding - Utilize Elastic's ELSER model for more accurate retrieval without fine-tuning Intelligent data processing using Elastic's native text chunking and preprocessing capabilities Hybrid search superiority - Combine vector embeddings with traditional keyword search and BM25 ranking for the most relevant results Production-ready monitoring - Leverage Elastic's comprehensive observability stack to monitor RAG application performance and gain insights into LLM usage patterns We welcome feedback and contributions as we continue to bring powerful vector search capabilities to OpenShift AI applications! If you are at Red Hat Summit 2025, stop by Booth #1552 to learn more about Elastic! Resources: https://validatedpatterns.io/patterns/rag-llm-gitops/ https://validatedpatterns.io/patterns/rag-llm-gitops/deploying-different-db/ Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Getting started with Elastic in the Validated Pattern Prerequisites Step 1: Fork the repository Step 2: Clone the forked repository Step 3: Configure and deploy Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/red-hat-openshift-validated-pattern-elasticsearch","meta_description":"The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
This blog walks you through how to get started."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. Developer Experience Javascript How To JR By: Jeffrey Rengifo On May 19, 2025 Part of Series Elasticsearch in JavaScript the proper way Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This is the second part of our Elasticsearch in JavaScript series. In the first part, we learned how to set up our environment correctly, configure the Node.js client, index data and search. In this second part, we will learn how to implement production best practices and run the Elasticsearch Node.js client in Serverless environments. We will review: Production best practices Error handling Testing Serverless environments Running the client on Elastic Serverless Running the client on function-as-a-service environment You can check the source code with the examples here . Production best practices Error handling A useful feature of the Elasticsearch client in Node.js is that it exposes objects for the possible errors in Elasticsearch so you can validate and handle them in different ways. To see them all , run this: Let’s go back to the search example and handle some of the possible errors: ResponseError in particular, will occur when the answer is 4xx or 5xx , meaning the request is incorrect or the server is not available. We can test this type of error by generating wrong queries, like trying to do a term query on a text-type field: Default error: Customized error: We can also capture and handle each type of error in a certain way. For example, we can add retry logic in a TimeoutError . Testing Tests are key in guaranteeing the app's stability. To test the code in a way that is isolated from Elasticsearch, we can use the library elasticsearch-js-mock when creating our cluster. This library allows us to instantiate a client that is very similar to the real one but that will answer to our configuration by only replacing the client’s HTTP layer with a mock one while keeping the rest the same as the original. We’ll install the mocks library and AVA for automated tests. npm install @elastic/elasticsearch-mock npm install --save-dev ava We’ll configure the package.json file to run the tests. Make sure it looks this way: Let’s now create a test.js file and install our mock client: Now, add a mock for semantic search: We can now create a test for our code, making sure that the Elasticsearch part will always return the same results: Let’s run the tests. npm run test Done! From now on, we can test our app focusing 100 % on the code and not on external factors. Serverless environments Running the client on Elastic Serverless We covered running Elasticsearch on Cloud or on-prem; however, the Node.js client also supports connections to Elastic Cloud Serverless . Elastic Cloud Serverless allows you to create a project where you don’t need to worry about infrastructure since Elastic handles that internally, and you only need to worry about the data you want to index and how long you want to have access to it. 
From a usage perspective, Serverless decouples compute from storage, providing autoscaling features for both search and indexing . This allows you to only grow the resources you actually need. The client makes the following adaptations to connect to Serverless: Turns off sniffing and ignores any sniffing-related options Ignores all nodes passed in config except the first one, and ignores any node filtering and selecting options Enables compression and `TLSv1_2_method` (same as when configured for Elastic Cloud) Adds an `elastic-api-version` HTTP header to all requests Uses `CloudConnectionPool` by default instead of `WeightedConnectionPool` Turns off vendored `content-type` and `accept` headers in favor of standard MIME types To connect your serverless project, you need to use the parameter serverMode: serverless. Running the client on function-as-a-service environment In the example, we used a Node.js server, but you can also connect using a function-as-a-service environment with functions like AWS lambda, GCP Run, etc. Another example is to connect to services like Vercel, which is also serverless. You can check this complete example of how to do this, but the most relevant part of the search endpoint looks like this: This endpoint lives in the folder /api and is run from the server’s side so that the client only has control over the “text” parameter that corresponds to the search term. The implication of using function-as-a-service is that, unlike a server running 24/7, functions only bring up the machine that runs the function, and once it is finished, the machine goes into rest mode to consume fewer resources. This configuration can be convenient if the application does not get too many requests; otherwise, the costs can be high. You also need to consider the lifecycle of functions and the run times (which could only be seconds in some cases). Conclusion In this article, we learned how to handle errors, which is crucial in production environments. We also covered testing our application while mocking the Elasticsearch service, which provides reliable tests regardless of the cluster’s state and lets us focus on our code. Finally, we demonstrated how to spin up a fully serverless stack by provisioning both Elastic Cloud Serverless and a Vercel application. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. 
KB By: Kofi Bartlett Jump to Production best practices Error handling Testing Serverless environments Running the client on Elastic Serverless Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch in JavaScript the proper way, part II - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/how-to-use-elasticsearch-in-javascript-part-ii","meta_description":"Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. Search Relevance DW By: Daniel Wrigley On May 20, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In today’s digital age, search engines are the backbone of how we access information. Whether it’s a web search engine, an e-commerce site, an internal enterprise search tool, or a Retrieval Augmented Generation (RAG) system, the quality of search results directly impacts user satisfaction and engagement. But what ensures that search results meet user expectations? Enter the judgment list , a tool to evaluate and refine search result quality. At OpenSource Connections , our experts regularly help clients create and use judgment lists to improve their user search experience. In this post, we’ll explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. Why do you need a judgment list? Judgment lists play a crucial role in the continuous cycle of search result quality improvement. They provide a reliable benchmark for evaluating search relevance by offering a curated set of assessments on whether search results truly meet user needs. Without high-quality judgment lists, search teams would struggle to interpret feedback from users and automated signals, making it difficult to validate hypotheses about improving search. For example, if a team hypothesizes that hybrid search will increase relevance and expects a 2% increase in click-through rate (CTR), they need judgment lists to compare before-and-after performance meaningfully. These lists help ground experimentation results in objective measures, ensuring that changes positively impact business outcomes before they are rolled out broadly. 
By maintaining robust judgment lists, search teams can iterate with confidence, refining the search experience in a structured, data-driven way. A judgment list is a curated set of search queries paired with relevance ratings for their corresponding results, also known as a test collection. Metrics computed using this list act as a benchmark for measuring how well a search engine performs. Here’s why it’s indispensable: Evaluating search algorithms: It helps determine whether a search algorithm is returning the most relevant results for a given query. Measuring improvements or regressions: When you make changes to your search engine, a judgment list can quantify the impact of those changes on result quality. Providing insights into user satisfaction: By simulating expected outcomes, a judgment list aligns system performance with user needs. Helping product development: By making product requirements explicit, a judgment list supports search engineers implementing them. For example, if a user searches for “best smartphones under $500,” your judgment list can indicate whether the results not only list relevant products but also cater to the query intent of affordability and quality. Judgment lists are used in offline testing. Offline testing enables rapid, cost-effective iterations before committing to time-consuming live experiments like A/B testing. Ideally, combining both online and offline testing maximizes experimentation efficiency and ensures robust search improvements. What is a judgment? At its core, a judgment is a rating of how relevant a search result is for a specific query. Judgments can be categorized into two main types: binary judgments and graded judgments. Binary judgments Results are labeled as either relevant (1) or not relevant (0). Example: A product page returned for the query “wireless headphones” either matches the query intent or it doesn’t. Use case: Binary judgments are simple and useful for queries with clear-cut answers. Graded judgments Results are assigned a relevance score on a scale (e.g., 0 to 3), with each value representing a different level of relevance: 0: Definitely irrelevant 1: Probably irrelevant 2: Probably relevant 3: Definitely relevant Example: A search result for “best laptops for gaming” might score: 3 for a page listing laptops specifically designed for gaming, 2 for a page featuring laptops that could be suitable for gaming, 1 for gaming-related accessories, and 0 for items unrelated to gaming laptops. Scales can also be categorical rather than numeric, for example : Exact Substitute Complement Irrelevant Use case: Graded judgments are ideal for queries requiring a nuanced evaluation of relevance, beyond a simple binary determination of “relevant” or “not relevant.” This approach accommodates scenarios where multiple factors influence relevance. Some evaluation metrics explicitly require more than a binary judgment. We use graded judgments when we want to model specific information-seeking behaviors and expectations in our evaluation metric. For example, the gain based metrics, Discounted Cumulative Gain (DCG), normalized Discounted Cumulative Gain (nDCG), and Expected Reciprocal Rank (ERR) , model a user whose degree of satisfaction with a result can be greater or lesser, while still being relevant. This is useful for those who are researching and gathering information to make a decision. 
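To make the gain-based idea concrete, here is a small illustrative sketch (not from the article) of DCG and nDCG computed over graded judgments on a 0–3 scale, using one common formulation of the gain:

```javascript
// Discounted Cumulative Gain over graded judgments, with gain = 2^grade - 1
// discounted by log2(rank + 1). Grades are listed in ranked order for one query.
function dcg(grades) {
  return grades.reduce(
    (sum, grade, i) => sum + (Math.pow(2, grade) - 1) / Math.log2(i + 2),
    0
  );
}

function ndcg(grades) {
  const ideal = dcg([...grades].sort((a, b) => b - a)); // best possible ordering
  return ideal === 0 ? 0 : dcg(grades) / ideal;
}

console.log(ndcg([3, 2, 0, 1])); // ≈ 0.99 — a mostly well-ordered result list
```

The point is that a binary label could not distinguish the "definitely relevant" hit at rank 1 from the "probably relevant" one at rank 2; the graded scale is what gives the metric something to discount.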
Example of a judgment list Let’s consider an example of a judgment list for an e-commerce search engine: Query Result URL Relevance wireless headphones /products/wireless-headphones-123 3 wireless headphones /products/noise-cancelling-456 3 best laptops for gaming /products/gaming-laptops-789 3 best laptops for gaming /products/ultrabook-321 2 In this list: The query “wireless headphones” evaluates the relevance of two product pages, with scores indicating how well the result satisfies the user’s intent. A score of 3 represents high relevance, a very good match, while lower scores suggest the result is less ideal. This structured approach allows search teams to objectively assess and refine their search algorithms. Different kinds of judgments To create a judgment list, you need to evaluate the relevance of search results, and this evaluation can come from different sources. Each type has its strengths and limitations: 1. Explicit judgments These are made by human evaluators who assess search results based on predefined guidelines. Typically, Subject Matter Experts (SMEs) are preferred as human evaluators for their knowledge. Explicit judgments offer high accuracy and nuanced insights but also pose unique challenges. Explicit judgments are very good in capturing the actual relevance of a document for a given query. Strengths: High accuracy, nuanced understanding of intent, and the ability to interpret complex queries. Limitations: Time-consuming, costly for large datasets, and prone to certain challenges. Challenges: Variation: Different judges might assess the same result differently, introducing inconsistency. Position bias: Results higher up in the ranking are often judged as more relevant regardless of actual quality. Expertise: Not all judges possess the same level of subject matter or technical expertise, leading to potential inaccuracies. Interpretation: User intent or the information need behind a query can be ambiguous or difficult to interpret. Multitasking: Judges often handle multiple tasks simultaneously, which may reduce focus. Fatigue: Judging can be mentally taxing, impacting judgment quality over time. Actual vs. perceived relevance: Some results may appear relevant at first glance (e.g., by a misleading product image) but fail closer scrutiny. Scaling: As the dataset grows, efficiently gathering enough judgments becomes a logistical challenge. Best practices: To overcome these challenges, follow these guidelines: Define information needs and tasks clearly to reduce variation in how judges assign grades. Train judges thoroughly and provide detailed guidance. Avoid judging results in a list view to minimize position bias. Correlate judgments from different groups (e.g., subject matter experts versus general judges) to identify discrepancies. Use crowdsourcing or specialized agencies to scale the evaluation process efficiently. 2. Implicit judgments Implicit judgments are inferred from user behavior data such as click-through rates, dwell time, and bounce rates. While they offer significant advantages, they also present unique challenges. In addition to relevance, implicit judgments capture search result quality aspects that match user taste or preference (for example, cost, and delivery time) as well as factors that fulfill the user in a certain way or the attractiveness of a product to a user (for example, sustainability features of the product). 
Strengths: Scalable and based on real-world usage, making it possible to gather massive amounts of data without manual intervention. Limitations: Susceptible to biases and other challenges that affect the reliability of the judgments. Challenges: Clicks are noisy: Users might click on results due to missing or unclear information on the search results page, not because the result is truly relevant. Biases: Position bias: Users are more likely to click on higher-ranked results, regardless of their actual relevance. Presentation bias: Users cannot click on what is not shown, resulting in missing interactions for potentially relevant results. Conceptual biases: For example, in a grid view result presentation users tend to interact more often with results at the grid margins. Sparsity issues: Metrics like CTR can be skewed in scenarios with limited data (e.g., CTR = 1.0 if there’s only 1 click out of 1 view). No natural extension points: Basic models like CTR lack built-in mechanisms for handling nuanced user behavior or feedback. Best practices: To mitigate these challenges and maximize the value of implicit judgments: Avoid over-reliance on position bias-prone metrics: Combine implicit signals with other data points to create a more holistic evaluation. Correlate implicit judgments with explicit feedback: Compare user behavior data with manually graded relevance scores to identify alignment and discrepancies. Train your models thoughtfully: Ensure they account for biases and limitations inherent in user behavior data by using a model that incorporates countermeasures for biases and provides options to combine different signals (for example clicks and purchases) . 3. AI-generated judgments AI-generated judgments leverage large language models (LLMs) like OpenAI’s GPT-4o to judge query-document pairs. These judgments are gaining traction due to their scalability and cost-effectiveness. LLMs as judges capture the actual relevance of a document for a given query well. Strengths: Cost-efficient, scalable, and consistent across large datasets, enabling quick evaluations of vast numbers of results. Limitations: AI-generated judgments may lack context-specific understanding, introduce biases from training data, and fail to handle edge cases effectively. Challenges: Training data bias: The AI model’s outputs are only as good as the data it’s trained on, potentially inheriting or amplifying biases. Context-specific nuances: AI may struggle with subjective or ambiguous queries that require human-like understanding. Interpretability: Understanding why a model assigns a specific judgment can be difficult, reducing trust in the system. Scalability trade-offs: While AI can scale easily, ensuring quality across all evaluations requires significant computational resources and potentially fine-tuning. Cost: While LLM judgments scale well, they are not free. Monitor your expenses closely. Best practices: To address these challenges and make the most of AI-generated judgments: Incorporate human oversight: Periodically compare AI-generated judgments with explicit human evaluations to catch errors and edge cases and use this information to improve your prompt. Enhance interpretability: Use explainable AI techniques to improve understanding and trust in the LLM’s decisions. Make the LLM explain its decision as part of your prompt. Optimize computational resources: Invest in infrastructure that balances scalability with cost-effectiveness. 
Combine AI with other judgment types: Use AI-generated judgments alongside explicit and/or implicit judgments to create a holistic evaluation framework. Prompt engineering: Invest time in your prompt. Even small changes can make a huge difference in judgment quality. Different factors of search quality Different kinds of judgments incorporate different aspects or factors of search quality. We can divide search result quality factors into three groups: Search relevance: This measures how well a document matches the information need expressed in the query. For instance: Binary judgments: Does the document fulfill the query (relevant or not)? Graded judgments: How well does the document fulfill the query on a nuanced scale? Explicit judgments and AI-generated judgments work well to capture search relevance. Relevance factors: These address whether the document aligns with specific user preferences. Examples include: Price: Is the result affordable or within a specified range? Brand: Does it belong to a brand the user prefers? Availability: Is the item in stock or ready for immediate use Implicit judgments capture relevance factors well. Fulfillment aspects: These go beyond relevance and preferences to consider how the document resonates with broader user values or goals. Examples include: Sustainability: Does the product or service promote environmental responsibility? Ethical practices: Is the company or provider known for fair trade or ethical standards? Fulfillment aspects are the most difficult to measure and quantify. Understanding your users is key and implicit feedback is the best way to move in that direction. Be aware of biases in implicit feedback and apply techniques to counter these as well as possible, for example when modeling the judgments based on implicit feedback . By addressing these factors systematically, search systems can ensure a holistic approach to evaluating and enhancing result quality. Where do judgment lists fit in the search quality improvement cycle? Search quality improvement is an iterative process that involves evaluating and refining search algorithms to better meet user needs. Judgment lists play a central role in offline experimentation (the smaller, left cycle in the image below), where search results are tested against predefined relevance scores without involving live users. This allows teams to benchmark performance, identify weaknesses, and make adjustments before deploying changes. This makes offline experimentation a fast and low-risk way of exploring potential improvements before trialing them in an online experiment. Online experimentation (the larger, right cycle) uses live user interactions, such as A/B testing, to gather real-world feedback on system updates. While offline experimentation with judgment lists ensures foundational quality, online experimentation captures dynamic, real-world nuances and user preferences. Both approaches complement each other, forming a comprehensive framework for search quality improvement. Source: Peter Fries. Search Quality - A business-friendly perspective . Tools to create judgment lists At its core, creating judgment lists is a labeling task where ultimately we are seeking to add a relevance label to a query-document pair. Some of the services that exist are: Quepid : An open source solution that supports the whole offline experimentation lifecycle from creating query sets to measuring search result quality with judgment lists created in Quepid. 
Label Studio : A data labeling platform that is predominantly used for generating training data or validating AI models. Amazon SageMaker Ground Truth : A cloud service offering data labeling to apply human feedback across the machine learning lifecycle. Prodigy : A complete data development experience with an annotation capability to label data. Looking ahead: Creating judgment lists with Quepid This post is the first in a series on search quality evaluation. In our next post, we will dive into the step-by-step process of creating explicit judgments using a specific tool called Quepid . Quepid simplifies the process of building, managing, and refining judgment lists, enabling teams to collaboratively improve search quality. Stay tuned for practical insights and tips on leveraging this tool to enhance the quality of your search results. Conclusion A judgment list is a cornerstone of search quality evaluation, providing a reliable benchmark for measuring performance and guiding improvements. By leveraging explicit, implicit, and AI-generated judgments, organizations can address the multifaceted nature of search quality—from relevance and accuracy to personalization and diversity. Combining these approaches ensures a comprehensive and robust evaluation strategy. Investing in a well-rounded strategy for search quality not only enhances user satisfaction but also positions your search system as a trusted and reliable tool. Whether you’re managing a search engine or fine-tuning an internal search feature, a thoughtful approach to judgments and search quality factors is essential for success. Partner with Open Source Connections to transform your search capabilities and empower your team to continuously evolve them. Our proven track record spans the globe, with clients consistently achieving dramatic improvements in search quality, team capability, and business performance. Contact us today to learn more. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Why do you need a judgment list? What is a judgment? Binary judgments Graded judgments Example of a judgment list Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Cracking the code on search quality: The role of judgment lists - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/judgment-lists","meta_description":"Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Vector Database Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Search Relevance Vector Database +1 March 20, 2025 Scaling late interaction models in Elasticsearch - part 2 This article explores techniques for making late interaction vectors ready for large-scale production workloads, such as reducing disk space usage and improving computation efficiency. PS BT By: Peter Straßer and Benjamin Trent Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Search Relevance Vector Database +1 March 18, 2025 Searching complex documents with ColPali - part 1 The article introduces the ColPali model, a late-interaction model that simplifies the process of searching complex documents with images and tables, and discusses its implementation in Elasticsearch. PS BT By: Peter Straßer and Benjamin Trent Vector Database March 13, 2025 Semantic Text: Simpler, better, leaner, stronger Our latest semantic_text iteration brings a host of improvements. 
In addition to streamlining representation in _source, benefits include reduced verbosity, more efficient disk utilization, and better integration with other Elasticsearch features. You can now use highlighting to retrieve the chunks most relevant to your query. And perhaps best of all, it is now a generally available (GA) feature! MP By: Mike Pellegrini Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Vector Database - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/vector-database","meta_description":"Vector Database articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Integrations Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. 
JR By: Jeffrey Rengifo Integrations April 9, 2025 Elasticsearch vector database for native grounding in Google Cloud’s Vertex AI Platform Elasticsearch is now publicly available as the first third-party native grounding engine for Google Cloud’s Vertex AI platform and Google’s Gemini models. It enables joint users to build fully customizable GenAI experiences grounded in enterprise data, powered by the best-of-breed Search AI capabilities from Elasticsearch. VA By: Valerio Arvizzigno Integrations How To April 8, 2025 Using CrewAI with Elasticsearch Learn how to create an Elasticsearch agent with CrewAI for your agent team and perform market research. JR By: Jeffrey Rengifo Integrations How To April 4, 2025 Getting Started with the Elastic Chatbot RAG app using Vertex AI running on Google Kubernetes Engine Learn how to configure the Elastic Chatbot RAG app using Vertex AI and run it on Google Kubernetes Engine (GKE). JS By: Jonathan Simon 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Integrations - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/integrations","meta_description":"Integrations articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Ingestion Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Integrations Ingestion +1 February 3, 2025 Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. 
JR TM By: Jeffrey Rengifo and Tomás Murúa Integrations Ingestion +1 January 24, 2025 Indexing OneLake data into Elasticsearch - Part II Second part of a two-part article to index and search OneLake data into Elastic using a Custom connector. GL JR By: Gustavo Llermaly and Jeffrey Rengifo Integrations Ingestion +1 January 23, 2025 Indexing OneLake data into Elasticsearch - Part 1 Learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. GL By: Gustavo Llermaly Integrations Ingestion +1 January 16, 2025 Elastic Jira connector tutorial part II: Optimization tips After connecting Jira to Elasticsearch, we'll now review best practices to escalate this deployment. GL By: Gustavo Llermaly Integrations Ingestion +1 January 15, 2025 Elastic Jira connector tutorial part I Reviewing a use case for the Elastic Jira connector. We'll be indexing our Jira content into Elasticsearch to create a unified data source and do search with Document Level Security. GL By: Gustavo Llermaly 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ingestion - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/ingestion","meta_description":"Ingestion articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Javascript Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Python Javascript +2 October 30, 2024 Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. MG By: Miguel Grinberg Javascript Python +1 June 4, 2024 Automatically updating your Elasticsearch index using Node.js and an Azure Function App Learn how to update your Elasticsearch index automatically using Node.js and an Azure Function App. Follow these steps to ensure your index stays current. 
JG By: Jessica Garson ES|QL Javascript +1 June 3, 2024 ES|QL queries to TypeScript types with the Elasticsearch JavaScript client Explore how to use the Elasticsearch JavaScript client and TypeScript support to craft ES|QL queries and handle their results as native JavaScript objects. JM By: Josh Mock Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Javascript - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/javascript-programming","meta_description":"Javascript articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial ML Research Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil ML Research December 5, 2024 Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros ML Research November 25, 2024 Introducing Elastic Rerank: Elastic's new semantic re-ranker model Learn about how Elastic's new re-ranker model was trained and how it performs. 
TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Vector Database Lucene +1 November 18, 2024 Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. BT By: Benjamin Trent ML Research Search Relevance October 29, 2024 What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ML Research - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/ml-research","meta_description":"ML Research articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Developer Experience Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. 
TP By: Tyler Perkins Developer Experience March 3, 2025 Fast Kibana Dashboards From 8.13 to 8.17, the wait time for data to appear on a dashboard has improved by up to 40%. These improvements are validated both in our synthetic benchmarking environment and from metrics collected in real user’s cloud environments. TN By: Thomas Neirynck Developer Experience January 22, 2025 Engineering a new Kibana dashboard layout to support collapsible sections & more Building collapsible dashboard sections in Kibana required overhauling an embeddable system and creating a custom layout engine. These updates improve state management, hierarchy, and performance while setting the stage for new advanced dashboard features. TS HM NR By: Teresa Alvarez Soler , Hannah Mudge and Nathaniel Reese Python Javascript +2 October 30, 2024 Export your Kibana Dev Console requests to Python and JavaScript Code The Kibana Dev Console now offers the option to export requests to Python and JavaScript code that is ready to be integrated into your application. MG By: Miguel Grinberg 1 2 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Developer Experience - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/developer-experience","meta_description":"Developer Experience articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial How To Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 14, 2025 Elasticsearch Index Number_of_Replicas Explaining how to configure the number_of_replicas, its implications and best practices. 
KB By: Kofi Bartlett How To May 12, 2025 Excluding Elasticsearch fields from indexing Explaining how to configure Elasticsearch to exclude fields, why you might want to do this, and best practices to follow. KB By: Kofi Bartlett How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 7, 2025 Joining two indices in Elasticsearch Explaining how to use the terms query and the enrich processor for joining two indices in Elasticsearch. KB By: Kofi Bartlett How To May 5, 2025 Understanding Elasticsearch scoring and the Explain API Diving into the scoring mechanism of Elasticsearch and exploring the Explain API. KB By: Kofi Bartlett 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How To - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/how-to","meta_description":"How To articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Agent Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Agent How To March 28, 2025 Connect Agents to Elasticsearch with Model Context Protocol Let’s use Model Context Protocol server to chat with your data in Elasticsearch. JB JM By: Jedr Blaszyk and Joe McElroy Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Generative AI Agent +1 September 20, 2024 LangChain and Elasticsearch: Building LangGraph retrieval agent template Elasticsearch and LangChain collaborate on a new retrieval agent template for LangGraph for agentic apps JM AT SC By: Joe McElroy , Aditya Tripathi and Serena Chou Generative AI Vector Database +2 September 2, 2024 A tutorial on building local agent using LangGraph, LLaMA3 and Elasticsearch vector store from scratch This article will provide a detailed tutorial on implementing a local, reliable agent using LangGraph, combining concepts from Adaptive RAG, Corrective RAG, and Self-RAG papers, and integrating Langchain, Elasticsearch Vector Store, Tavily AI for web search, and LLaMA3 via Ollama. PR By: Pratik Rana Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Agent - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/rag-agent","meta_description":"Agent articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Search tier autoscaling in Elasticsearch Serverless Explore search tier autoscaling in Elasticsearch Serverless. Learn how autoscaling works, how the search load is calculated and more. Elastic Cloud Serverless MP JV By: Matteo Piergiovanni and John Verwolf On August 8, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. One of the key aspects of our new serverless offerings is allowing users to deploy and use Elastic without the need to manage the underlying project nodes. To achieve this, we developed search tier autoscaling, a strategy to dynamically choose node size and count based on a multitude of parameters that we will delve into in this blog. This innovation ensures that you no longer need to worry about under-provisioning or over-provisioning your resources. Whether you are dealing with fluctuating traffic patterns, unexpected data spikes, or gradual growth, search tier autoscaling seamlessly adapts the allocated hardware to the search tier dynamically based on search activity. Autoscaling is performed on a per project basis and is completely transparent to the end user. Introduction Elastic serverless is a fully-managed product from Elastic that enables you to deploy and use Elastic products without the need to manage the underlying Elastic infrastructure, but instead focussing on extracting the most out of your data. One of the challenges of self-managed infrastructure is dealing with the ever-evolving needs a customer faces. In the dynamic world of data management, flexibility and adaptability are crucial and traditional scaling methods often fall short and require manual adjustments that can be both time-consuming and imprecise. With search tier autoscaling, our serverless offering automatically adjusts resources to match the demand of your workload in real-time. The autoscaling described in this post is specific to the Elasticsearch project type within Elastic's serverless offering. Observability and security may have different autoscaling mechanisms tailored to their unique requirements. Another important piece of information needed before diving into the details of autoscaling is how we manage our data to achieve a robust and scalable infrastructure. We use S3 as the primary source of truth, providing reliable and scalable storage. To enhance performance and reduce latency, search nodes use a local cache to quickly access frequently requested data without repeatedly retrieving it from S3. This combination of S3 storage and caching by search nodes forms an efficient system, ensuring that both durable storage and fast data access fit our user’s demands effectively. 
Search tier autoscaling inputs To demonstrate how autoscaling works, we’ll dive into the various metrics that are used in order to make scaling decisions. When starting a new serverless Elasticsearch project, a user can choose two parameters that will influence how autoscaling behaves: Boost Window : defines a specific time range within which search data is considered boosted. Boosted Data : data that falls within the boost window is classified as boosted data. All time-based documents with a @timestamp within the boost window range and all non-time-based documents will fall in the boosted data category. This time-based classification allows the system to prioritize this data when allocating resources. Non-Boosted Data : data outside the boost window is considered non-boosted. This older data is still accessible but is allocated fewer resources compared to boosted data. Search Power : a range that controls the number of Virtual Compute Units (VCUs) allocated to boosted data in the project. Search power can be set to: Cost Efficient : limits the available cache size for boosted data, prioritizing cost efficiency over performance. Well suited for customers wanting to store very large amounts of data at a low cost. Balanced : ensures enough cache for all boosted data for faster searches. Performance : provides more resources to respond quicker to a higher volume and more complex queries. The boost window will determine the amount of boosted and non-boosted data for a project. We define boosted data for a project as the amount of data within the boost window. The total size of boosted data, together with the lower end of the selected search power range, will determine the base hardware configuration for a project. This method is favored over scaling to zero (or near zero) because it helps maintain acceptable latency for subsequent requests. This is achieved by retaining our cache and ensuring CPUs are immediately available to process incoming requests. This approach avoids the delays associated with provisioning hardware from the cloud provider and ensures the system is ready to handle incoming requests promptly. Note that the base configuration can increase over time by ingesting more data or decrease if time series data falls out of the boost window. This is the first piece of autoscaling, where we provide a base hardware configuration that can adapt to a user’s boosted data over time. Load based autoscaling Autoscaling based on interactive data is only one piece of the puzzle. It does not account for the load placed on the Search Nodes by incoming search traffic. To this effect, we have introduced a new metric called search load. Search load is a measure of the amount of physical resources required to handle the current search traffic. Search load accounts for the resource usage that the search traffic places on the nodes at a given time, and thus allows for dynamic autoscaling in response. What is search load? Search load is a measure of the amount of physical resources required to handle the current search traffic. We report this as a measure of the number of processors required per node. However, there is some nuance here. When scaling, we move up and down between hardware configurations that have set values of CPU, memory, and disk. These values are scaled together according to given ratios. For example, to obtain more CPU, we would scale to a node with a hardware configuration that also includes more memory and more disk. Search load indirectly accounts for these resources.
It does so by using the time that search threads take within a given measurement interval. If the threads block while waiting for resources (IO), this also contributes to the threads’ execution time. If all the threads are 100% utilized in addition to queuing, this indicates the need to scale up. Conversely, if there is no queuing and the search thread pool is less than 100% utilized, this indicates that it is possible to scale down. How is search load calculated? Search load is composed of two factors: Thread Pool Load : number of processor cores needed to handle the search traffic that is being processed. Queue Load : number of processor cores needed to handle the queued search requests within an acceptable timeframe. To describe how the search load is calculated, we will walk through each aspect step-by-step to explain the underlying principles. We will start by describing the Thread Pool Load . First, we monitor the total execution time of the threads responsible for handling search requests within a sampling interval, called totalThreadExecutionTime . The length of this sampling interval is multiplied by the processor cores to determine the maximum availableTime . To obtain the threadUtilization percent, we divide the total thread execution time by this availableTime . For example, a 4 core machine with a 1s sampling interval would have 4 seconds of available time (4 cores * 1s). If the total task execution time is 2s, then this results in 50% thread pool utilization (2s / 4s = 0.5). We then multiply the threadUtilization percent by the numProcessors to determine the processorsUsed , which measures the number of processor cores used. We record this value via an exponential weighted moving average (a moving average that favors recent additions) to smooth out small bursts of activity. This results in the value used for threadPoolLoad . Next, we will describe how the Queue Load is determined. Central to the calculation, there is a configuration maxTimeToClearQueue that sets the maximum acceptable timeframe that a search request may be queued. We need to know how many tasks a given thread can execute within this timeframe, so we divide the maxTimeToClearQueue by the exponential weighted moving average of the search execution time. Next, we divide the searchQueueSize by this value to determine how many threads are needed to clear the queue within the configured time frame. To convert this to the number of processors required, we multiply this by the ratio of processorsPerThread . This results in the value used for the queueLoad . The search load for a given node is then the sum of both the threadPoolLoad and the queueLoad . Search load reporting Each Search Node regularly publishes load readings to the Master Node. This will occur either after a set interval, or if a large delta in the load is detected. The Master Node keeps track of this state separately for each Search Node, and performs bookkeeping in response to various lifecycle events. When Search Nodes are added/removed, the Master Node adds or removes their respective load entries. The Master Node also reports a quality rating for each entry: Exact , Minimum , or Missing . Exact means the metric was reported recently, while Missing is assigned when a search load has not yet been reported by a new node. Search load quality is considered Minimum when the Master Node has not received an update from the search load within a configured time period, e.g. if a node becomes temporarily unavailable. 
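The arithmetic described above can be sketched as follows. This is only an illustration of the formulas in the text, not Elasticsearch's implementation, and every concrete number is made up.

```javascript
// Thread pool load: share of available CPU time consumed by search threads.
const numProcessors = 4;
const samplingIntervalSec = 1;
const totalThreadExecutionTimeSec = 2;                        // measured during the interval
const availableTimeSec = numProcessors * samplingIntervalSec; // 4s of CPU time available
const threadUtilization = totalThreadExecutionTimeSec / availableTimeSec; // 0.5
const threadPoolLoad = threadUtilization * numProcessors;     // 2 cores "in use"
// (the real metric then smooths this with an exponentially weighted moving average)

// Queue load: cores needed to clear the queued searches within an acceptable timeframe.
const maxTimeToClearQueueSec = 5;        // illustrative setting
const avgSearchExecutionTimeSec = 0.25;  // EWMA of search execution time
const tasksPerThread = maxTimeToClearQueueSec / avgSearchExecutionTimeSec; // 20
const searchQueueSize = 100;
const threadsNeeded = searchQueueSize / tasksPerThread;       // 5 threads
const processorsPerThread = 1;                                // illustrative ratio
const queueLoad = threadsNeeded * processorsPerThread;        // 5 cores

const searchLoad = threadPoolLoad + queueLoad;                // 7 cores for this node
console.log({ threadPoolLoad, queueLoad, searchLoad });
```

In this made-up snapshot the node is only half busy, but the backlog alone would need five cores to drain within the acceptable window, so the reported load (7) is what would drive a scale-up decision.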
The quality is also reported as Minimum when a Search Node’s load value accounts for work that is not considered indicative of future work, such as downloading files that will be subsequently cached. Quality is used to inform scaling decisions. We disallow scaling down when the quality of any node is inexact. However, we allow scaling up regardless of the quality rating. The autoscaler The autoscaler is a component of Elastic serverless designed to optimize performance and cost by adjusting the size and number of nodes in a project based on real-time metrics. It monitors metrics from Elasticsearch, determines an ideal hardware configuration, and applies the configuration to the managed Kubernetes infrastructure. With an understanding of the inputs and calculations involved in search tier metrics, we can now explore how the autoscaler leverages this data to dynamically adjust the project node size and count for optimal performance and cost efficiency. The autoscaler monitors the search tier metrics every 5 seconds. When new metrics arrive for total interactive and non-interactive data size, together with the search power range, the autoscaler will then determine the range of possible hardware configurations. These configurations range from a minimum to a maximum, defined by the search power range. The autoscaler then uses the search load reported by Elasticsearch to select a “desired” hardware configuration within the available range that has at least the number of processor cores to account for the measured search load. This desired configuration serves as an input to a stabilization phase where the autoscaler decides if the chosen scale direction can be applied immediately; if not, it is discarded. There is a 15-minute stabilization window for scaling down, meaning 15 minutes of continuous scaling down events are required for a scale down to occur. There is no stabilization period for scaling up. Scaling events are non-blocking; therefore, we can continue to make scaling decisions while subsequent operations are still ongoing. The only limit to this is defined by the stabilization window described above. The configuration is then checked against the maximum number of replicas for an index in Elasticsearch to ensure there are enough search nodes to accommodate all the configured replicas. Finally, the configuration is applied to the managed Kubernetes infrastructure, which provisions the project size accordingly. Conclusion Search tier autoscaling revolutionizes the management of Elasticsearch serverless projects. By leveraging detailed metrics, the autoscaler ensures that projects are always optimally sized. With serverless, users can focus on their business needs without the worry of managing infrastructure or being caught unprepared when their workload changes. This approach not only enhances performance during high-demand periods, but also reduces costs during times of low activity, all while being completely transparent to the end user. As a result, users can focus more on their core activities without the constant worry of manually tuning their projects to meet evolving demands. This innovation marks a significant step forward in making Elasticsearch both powerful and user-friendly in the realm of serverless computing. Try it out! 
","title":"Search tier autoscaling in Elasticsearch Serverless - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-serverless-tier-autoscaling","meta_description":"Explore search tier autoscaling in Elasticsearch Serverless. Learn how autoscaling works, how the search load is calculated and more."} +{"text":"Improving information retrieval in the Elastic Stack: Hybrid retrieval In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores. Generative AI QH TV By: Quentin Herreros and Thomas Veasey On July 20, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch has native integrations to industry leading Gen AI tools and providers.
Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In our last blog post , we introduced Elastic Learned Sparse Encoder, a model trained for effective zero-shot text retrieval. Elasticsearch ® also has great lexical retrieval capabilities and rich tools for combining the results of different queries. In this blog, we introduce the concept of hybrid retrieval and explore two concrete implementations available in Elasticsearch. In particular, we explore how to improve the performance of Elastic Learned Sparse Encoder by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores. We also discuss experiments we undertook to explore some general research questions. These include how best to parameterize Reciprocal Rank Fusion and how to calibrate Weighted Sum of Scores. Hybrid retrieval Despite modern training pipelines producing retriever models with good performance in zero-shot scenarios, it is known that lexical retrievers (such as BM25) and semantic retrievers (like Elastic Learned Sparse Encoder) are somewhat complementary. Specifically, it will improve relevance to combine the results of retrieval methods, if one assumes that the more matches occur between the relevant documents they retrieve than between the irrelevant documents they retrieve. This hypothesis is plausible for methods using very different mechanisms for retrieval because there are many more irrelevant than relevant documents for most queries and corpuses. If methods retrieve relevant and irrelevant documents independently and uniformly at random, this imbalance means it is much more probable for relevant documents to match than irrelevant ones. We performed some overlap measurements to check this hypothesis between Elastic Learned Sparse Encoder, BM25, and various dense retrievers as shown in Table 1. This provides some rationale for using so-called hybrid search. In the following, we investigate two explicit implementations of hybrid search. Reciprocal Rank Fusion Reciprocal Rank Fusion was proposed in this paper. It is easy to use, being fully unsupervised and not even requiring score calibration. It works by ranking a document d with both BM25 and a model, and calculating its score based on the ranking positions for both methods. Documents are sorted by descending score. The score is defined as follows: The method uses a constant k to adjust the importance of lowly ranked documents. It is applied to the top N document set retrieved by each method. If a document is missing from this set for either method, that term is set to zero. The paper that introduces Reciprocal Rank Fusion suggests a value of 60 for k and doesn’t discuss how many documents N to retrieve. Clearly, ranking quality can be affected by increasing N while recall@N is increasing for either method. Qualitatively, the larger k the more important lowly ranked documents are to the final order. However, it is not a priori clear what would be optimal values of k and N for modern lexical semantic hybrid retrieval. Furthermore, we wanted to understand how sensitive the results are to the choice of these parameters and if the optimum generalizes between data sets and models. This is important to have confidence in the method in a zero-shot setting. 
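The score formula referenced above did not survive extraction; in its standard form it is score(d) = Σ over methods of 1 / (k + rank(d)), with a missing document contributing zero for that method. A minimal sketch of that fusion step:

```javascript
// Reciprocal Rank Fusion over the top-N results of several retrieval methods.
// rankings: an array of arrays of document ids, each sorted by descending relevance.
function rrf(rankings, k = 60, topN = 1000) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.slice(0, topN).forEach((docId, i) => {
      const rank = i + 1;
      // Documents missing from a method's top-N simply contribute 0 for that method.
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()].sort((a, b) => b[1] - a[1]); // documents by descending fused score
}

// Example: fuse a BM25 ranking with a semantic ranking
console.log(rrf([["d1", "d2", "d3"], ["d2", "d4", "d1"]]));
```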
To explore these questions, we performed a grid search to maximize the weighted average NDCG@10 for a subset of the BEIR benchmark for a variety of models. We used Elasticsearch for retrieval in this experiment representing each document by a single text field and vector. The BM25 search was performed using a match query and dense retrieval using exact vector search with a script_score query. Referring to Table 2, we see that for roberta-base-ance-firstp optimal values for k and N are 20 and 1000, respectively. We emphasize that for the majority of individual data sets, the same combination of parameters was optimal . We did the same grid search for distilbert-base-v3 and minilm-l12-v3 with the same conclusion for each model. It is also worth noting that the difference between the best and worst parameter combinations is only about 5%; so the penalty for mis-setting these parameters is relatively small. We also wanted to see if we could improve the performance of Elastic Learned Sparse Encoder in a zero-shot setting using Reciprocal Rank Fusion. The results on the BEIR benchmark are given in Table 3. Reciprocal Rank Fusion increases average NDCG@10 by 1.4% over Elastic Learned Sparse Encoder alone and 18% over BM25 alone. Also, importantly the result is either better or similar to BM25 alone for all test data sets. The improved ranking is achieved without the need for model tuning, training data sets, or specific calibration. The only drawback is that currently the query latency is increased as the two queries are performed sequentially in Elasticsearch. This is mitigated by the fact that BM25 retrieval is typically faster than semantic retrieval. Our findings suggest that Reciprocal Rank Fusion can be safely used as an effective “plug and play” strategy. Furthermore, it is worth reviewing the quality of results one obtains with BM25, Elastic Learned Sparse Encoder and their rank fusion on your own data. If one were to select the best performing approach on each individual data set in the BEIR suite, the increase in average NDCG@10 is, respectively, 3% and 20% over Elastic Learned Sparse Encoder and BM25 alone. As part of this work, we also performed some simple query classification to distinguish keyword and natural question searches. This was to try to understand the mechanisms that lead to a given method performing best. So far, we don’t have a clear explanation for this and plan to explore this further. However, we did find that hybrid search performs strongly when both methods have similar overall accuracy. Finally, Reciprocal Rank Fusion can be used with more than two methods or could be used to combine rankings from different fields. So far, we haven’t explored this direction. Weighted Sum of Scores Another way to do hybrid retrieval supported by Elasticsearch is to combine BM25 scores and model scores using a linear function. This approach was studied in this paper , which showed it to be more effective than Reciprocal Rank Fusion when well calibrated. We explored hybrid search via a convex linear combination of scores defined as follows: where α is the model score weight and is between 0 and 1. Ideal calibration of linear combination is not straightforward, as it requires annotations similar to those used for fine-tuning a model. Given a set of queries and associated relevant documents, we can use any optimization method to find the optimal combination for retrieving those documents. 
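As a reference point for the experimental setup described earlier in this section, the two retrieval legs can be expressed roughly as follows with the Elasticsearch JS client. This assumes a client instance (`client`); the index name and the `text`/`embedding` field names are placeholders, and the `script_score`/`cosineSimilarity` form is the documented way to run exact (brute-force) vector search.

```javascript
// BM25 leg: a plain match query on the text field
const bm25 = await client.search({
  index: "beir-dataset",                         // placeholder index name
  query: { match: { text: userQuery } },
  size: 1000,
});

// Dense leg: exact vector search via script_score + cosineSimilarity
const dense = await client.search({
  index: "beir-dataset",
  query: {
    script_score: {
      query: { match_all: {} },
      script: {
        source: "cosineSimilarity(params.query_vector, 'embedding') + 1.0",
        params: { query_vector: queryVector },   // the query embedding produced by the model
      },
    },
  },
  size: 1000,
});
```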
In our experiments, we used BEIR data sets and Bayesian optimization to find the optimal combination, optimizing for NDCG@10. In theory, the ratio of score scales can be incorporated into the value learned for α. However, in the following experiments, we normalized BM25 scores and Elastic Learned Sparse Encoder scores per data set using min-max normalization , calculating the minimum and maximum from the top 1,000 scores for some representative queries on each data set. The hope was that with normalized scores the optimal value of transfers. We didn’t find evidence for this, but it is much more consistent and so normalization does likely improve the robustness of the calibration. Obtaining annotations is expensive, so it is useful to know how much data to gather to be confident of beating Reciprocal Rank Fusion (RRF). Figure 1 shows the NDCG@10 for a linear combination of BM25 and Elastic Learned Sparse Encoder scores as a function of the number of annotated queries for the ArguAna data set. For reference, the BM25, Elastic Learned Sparse Encoder and RRF NDCG@10 are also shown. This sort of curve is typical across data sets. In our experiments, we found that it was possible to outperform RRF with approximately 40 annotated queries, although the exact threshold varied slightly from one data set to another. We also observed that the optimal weight varies significantly both across different data sets (see Figure 2) and also for different retrieval models. This is the case even after normalizing scores. One might expect this because the optimal combination will depend on how well the individual methods perform on a given data set. To explore the possibility of a zero-shot parameterisation, we experimented with choosing a single weight α for all data sets in our benchmark set. Although we used the same supervised approach to do this, this time choosing the weight to optimize average NDCG@10 for the full suite of data sets, we feel that there is enough variation between data sets that our findings may be representative of zero-shot performance. In summary, this approach yields better average NDCG@10 than RRF. However, we also found the results were less consistent than RRF and we stress that the optimal weight is model specific . For this reason, we feel less confident the approach transfers to new settings even when calibrated for a specific model. In our view, linear combination is not a “plug and play” approach. Instead, we believe it is important to carefully evaluate the performance of the combination on your own data set to determine the optimal settings. However, as we will see below, if it is well calibrated it yields very good results. Normalization is essential for comparing scores between different data sets and models, as scores can vary a lot without it. It is not always easy to do, especially for Okapi BM25, where the range of scores is unknown until queries are made. Dense model scores are easier to normalize, as their vectors can be normalized. However, it is worth noting that some dense models are trained without normalization and may perform better with dot products. Elastic Learned Sparse Encoder is trained to replicate cross-encoder score margins. We typically see it produce scores in the range 0 to 20, although this is not guaranteed. In general, a query history and their top N document scores can be used to approximate the distribution and normalize any scoring function with minimum and maximum estimates. 
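A sketch of that normalization and of the convex combination defined above, score = α · model + (1 − α) · BM25, where the min and max are estimated from the top-N scores of a sample of representative queries. The variable names and sample values here are illustrative.

```javascript
// Build a min-max normalizer from a sample of top-N scores for representative queries.
function minMaxNormalizer(sampleScores) {
  const min = Math.min(...sampleScores);
  const max = Math.max(...sampleScores);
  return (score) => (score - min) / (max - min || 1); // guard against a degenerate range
}

// Convex linear combination of the normalized scores, alpha in [0, 1].
function hybridScore(bm25Score, modelScore, alpha, normBm25, normModel) {
  return alpha * normModel(modelScore) + (1 - alpha) * normBm25(bm25Score);
}

// Example usage with hypothetical score samples
const normBm25 = minMaxNormalizer([3.2, 7.9, 12.4, 18.1]);
const normModel = minMaxNormalizer([0.8, 4.5, 9.3, 17.6]);
console.log(hybridScore(10.0, 12.0, 0.7, normBm25, normModel));
```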
We note that the non-linear normalization could lead to improved linear combination, for example if there are score outliers, although we didn’t test this. As for Reciprocal Rank Fusion, we wanted to understand the accuracy of a linear combination of BM25 and Elastic Learned Sparse Encoder — this time, though, in the best possible scenario. In this scenario, we optimize one weight α per data set to obtain the ideal NDCG@10 using linear combination. We used 300 queries to calibrate — we found this was sufficient to estimate the optimal weight for all data sets. In production, this scenario is realistically difficult to achieve because it needs both accurate min-max normalization and a representative annotated data set to adjust the weight. This would also need to be refreshed if the documents and queries drift significantly. Nonetheless, bounding the best case performance is still useful to have a sense of whether the effort might be worthwhile. The results are displayed in Table 4. This approach gives a 6% improvement in average NDCG@10 over Elastic Learned Sparse Encoder alone and 24% improvement over BM25 alone. Conclusion We showed it is possible to combine different retrieval approaches to improve their performance and in particular lexical and semantic retrieval complement one another. One approach we explored was Reciprocal Rank Fusion. This is a simple method that often yields good results without requiring any annotations nor prior knowledge of the score distribution. Furthermore, we found its performance characteristics were remarkably stable across models and data sets, so we feel confident that the results we observed will generalize to other data sets. Another approach is Weighted Sum of Scores, which is more difficult to set up, but in our experiments yielded very good ranking with the right setup. To use this approach, scores should be normalized, which for BM25 requires score distributions for typical queries, furthermore some annotated data should be used for training the method weights. In our final planned blog in this series, we will introduce the work we have been doing around inference and index performance as we move toward GA for the text_expansion feature. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3 : Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. 
","title":"Improving information retrieval in the Elastic Stack: Hybrid retrieval - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/improving-information-retrieval-elastic-stack-hybrid","meta_description":"In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores."} +{"text":"Stateless: Data safety in a stateless world We discuss the data durability guarantees in stateless including how we fence new writes and deletes with a safety check which prevents stale nodes from acknowledging new writes or deletes Elastic Cloud Serverless HA By: Henning Andersen On September 6, 2024 In recent blog posts, we announced the stateless architecture that underpins our Elastic Cloud Serverless offering. By offloading durability guarantees and replication to an object store (e.g., Amazon S3), we gain many advantages and simplifications. Historically, Elasticsearch has relied upon local disk persistence for data safety and handling stale or isolated nodes. In this blog, we will discuss the data durability guarantees in stateless including how we fence new writes and deletes with a safety check which prevents stale nodes from unsafely acknowledging these operations. In the following blog post, we will cover the basics of the durability promise and how Elasticsearch uses an operation log (translog) to be able to quickly and safely acknowledge writes to clients. Next we will dive into the problem, introduce concepts that help us, and finally explain the additional safety check that makes us able to confidently acknowledge writes to clients.
Durability promise and translog When clients write data to Elasticsearch, for instance using the _bulk API, Elasticsearch will provide an HTTP response code for the request. Elasticsearch will only provide a successful HTTP response code (200/201) when data has been safely stored. We use an operation log (called translog) where requests are appended and stored before acknowledging the write. The translog allows us to replay operations that have not been successfully persisted to the underlying Lucene index (for instance if a node crashed after we acknowledged the write to the client). For more information on the translog and Lucene indices, see this section in our recent blog post on thin indexing shards , where we explain how we now store Lucene indices and the translog in the object store. Not knowing is the worst - the problem(s) The master allocates a shard to an indexing node, that then owns indexing incoming data into that shard. However, we must account for scenarios where this node falls out of communication with the master and/or rest of the cluster. In such cases, the master will (after timeouts) assume the node is no longer operational and reassign affected shards to other nodes. The prior assignment would now be considered stale. A stale node may still be operational attempting to index and persist data it receives. In this scenario, with potentially two owners of a shard trying to acknowledging writes but out of communication with each other, we have two problems to solve: Avoiding file overwrites in the object store Ensuring that acknowledged writes are not lost Primary terms for stateless- an increasing number to the rescue Elasticsearch has for many years utilized something we call primary terms. Whenever a primary shard is assigned to a node, it is given a primary term for the allocation. If a primary shard fails or goes from unassigned to assigned, the master will increment the primary term before reassigning the primary shard. This gives a strict order of primary shard assignments and ownership, higher primary terms were assigned after lower primary terms. For stateless, we utilize primary terms in the path of index files we write to the object store to ensure that the first problem described above cannot happen. If a shard fails and is reassigned, we know it will have a higher primary term. A shard will only write files in the primary term specific path, thus there is no chance of an older shard assignment and a newer shard assignment writing the same files. They simply write to different paths. The primary term is also used to ultimately provide the durability guarantee, more on that later. Notice that primary shard relocations do not increment the primary term, instead the two nodes involved in a primary shard relocation hand off the ownership through an explicit protocol. Coordination term and node-left generation in stateless The coordination subsystem in Elasticsearch is a strongly consistent mechanism used for cluster level coordination, including cluster membership and cluster metadata (all known as cluster state). In stateless, this system also builds on top of the object store, uploading new cluster state versions. Like in stateful, it maintains an increasing number for elections called “term” (we’ll call it coordination term here to disambiguate it from the primary term described in the previous section). 
Whenever a node decides to start a new election, it will do so in a new coordination term, higher than any previous terms seen (more details on how this works in stateful in the blog post here ). In stateless, the election happens through an object store file we call the lease file. This file contains the coordination term and the node that claims that term is the elected master for the term. This file will help the safety check we are interested in here. If the coordination term is still the same, we know the elected master did not change. Just the coordination term is not enough though, since this does not necessarily change if a node leaves the cluster. In order to detect that a data node has not left the cluster, we also add the node-left generation to the lease file. This is an increasing number, incremented every time a node leaves the cluster. It resets from zero when the term changes (but we can disregard that for the story here). The lease file is written to the object store as part of persisting a new cluster state. This write happens before any actions (like shard recovery) are otherwise taken based on the new cluster state. Object store read after write semantics in stateless We use the object store to store all data in stateless and the visibility guarantees of the object store are therefore important to consider. Ultimately, the safety check builds on top of those guarantees. Following are the main object store visibility guarantees that we rely on: Read-after-write: after a successful write, any read will return the new contents. List-after-write: after a successful write, any listing matching the new file will return the file. These were not a given years ago, but are available today across AWS S3, GCP and Azure blob storage. Stateless: The safety check Having the necessary building blocks described above, we can now move on to the actual safety check and safety argumentation. While the translog guarantees durability of writes, we need to ensure that the node is still the assigned indexing node prior to acknowledging the write. The source of truth for that is in cluster state and the data node therefore needs to establish that it has a new enough cluster state in order to determine whether it is safe to acknowledge the write. We are only interested in non-graceful events like node crashes, network partitions and similar. Graceful events like shard relocations are handled through explicit hand-offs that guarantee their correctness (we'll not dive into this in this blog post). Let us consider an ungraceful event, for instance where the master node detects that a data node that holds a shard is no longer responding and it thus ejects the node from the cluster. We'll examine the safety check in this context and see how it avoids that a stale node potentially incorrectly acknowledges a write to client. The safety check adds one additional check before responding to the client: Read the lease file from the object store. If the coordination term or node-left generation has advanced past the values in the node's local cluster state, it cannot rely on the cluster state until it receives an updated version with a higher or equal coordination term and node-left generation. With a new enough cluster state, it can be used to check whether the primary term of the shard has changed. If it has changed, the write will fail. The happy path will incur no waiting here, since the term and node-left generation changes very infrequently relative to a normal write request frequency. 
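Putting the steps above together, the check that runs before acknowledging a write can be sketched as follows. This is pseudocode mirroring the prose; the helper names (`readLeaseFile`, `waitForClusterState`, `primaryTermOf`) are illustrative and are not Elasticsearch internals.

```javascript
// Sketch of the pre-acknowledgement safety check described above (illustrative only).
async function canAcknowledgeWrite(shard, localClusterState) {
  // 1. The translog file has already been successfully uploaded before this check runs.
  const lease = await readLeaseFile(); // { coordinationTerm, nodeLeftGeneration } from the object store

  // 2. If the lease file has moved past our local cluster state, wait for a new-enough state.
  let state = localClusterState;
  if (lease.coordinationTerm > state.coordinationTerm ||
      lease.nodeLeftGeneration > state.nodeLeftGeneration) {
    state = await waitForClusterState(lease.coordinationTerm, lease.nodeLeftGeneration);
  }

  // 3. With a new-enough cluster state, verify this node still owns the shard.
  //    A changed primary term means the shard was reassigned, so the write must fail.
  return primaryTermOf(state, shard) === shard.primaryTerm;
}
```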
The overhead of this check is thus small. Notice that the ordering is important: the translog file is successfully uploaded before the safety check. We’ll see why shortly. The ungraceful node-left event leads to an increment of the node-left generation in the lease file. Afterwards, a new node may be assigned the shard and start recovering data (this may be just one cluster state update, but the ordering of the lease file write and a node starting recovery is the only important part here and is guaranteed). The newly assigned node will then read the shard data and recover the data contained in translog. We see that we have the following ordering of events: Original data node writes translog before reading lease file Master writes lease file with incremented node-left generation before new data node starts recovering and thus before reading the translog Object store guarantees read-after-write on the lease file and translog files. There are two main situations to consider: The original data node wrote the translog file and read a lease file indicating it is still in the cluster and owner of the shard (primary term did not change). We then know that the master did not successfully update the lease file prior to the data node reading it. Therefore, the write to the translog by the original data node happens before the read of the translog by the new node assignment, guaranteeing that the operations will be available to the new node for recovery. The original data node wrote the translog file, but after possibly waiting for a new cluster state based on the information in the lease file, it is no longer the owner of the shard (making it fail the write request). We do not respond successfully to the write request, thus do not promise durability. The translog data might be available to the new node assignment during recovery, but that is fine. It is ok for a failed request to actually have persisted data durably. We thus see that any write that Elasticsearch has successfully responded to will be available for any future owners of the same shard, concluding our safety argumentation. Similarly, we can argue that a master failover case is safe. Here the coordination term rather than the node-left generation will change. We will not go through that here. This same safety check is used in a number of other critical situations: During index file deletion. When Lucene merges segments, old segments can be deleted. We add a safety check here to protect against deleting files that a newer node assignment needs. During translog file deletion. Translogs can be deleted when the index data in the object store contains all the operations. Again, we add a safety check here to protect against deleting translog files that a newer node assignment needs. Conclusion Congratulations, you made it to the end, hopefully you enjoyed the deep dive here. We described a novel mechanism for ensuring that Elasticsearch durably and safely persists writes to an object store, also in the presence of any kind of disruption causing Elasticsearch to otherwise have two nodes owning indexing into the same shard. We care deeply about such aspects and if you do too, perhaps take a look at our open job offerings . Shout out to David Turner, Francisco Fernández Castaño and Tim Brooks who did most of the real work here. 
","title":"Stateless: Data safety in a stateless world - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/data-safety-stateless-elasticsearch","meta_description":"Explore the data durability and safety guarantees in Elasticsearch stateless, including how we fence new writes and deletes with a safety check."} +{"text":"Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. Developer Experience Javascript How To JR By: Jeffrey Rengifo On May 15, 2025 Part of Series Elasticsearch in JavaScript the proper way Elasticsearch is packed with new features to help you build the best search solutions for your use case.
Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This is the first article of a series that covers how to use Elasticsearch with JavaScript. In this series, you’ll learn the basics of how to use Elasticsearch in a JavaScript environment and review the most relevant features and best practices to create a search app. By the end, you’ll know everything you need to run Elasticsearch using JavaScript. In this first part, we will review: Environment Frontend, backend, or serverless? Connecting the client Indexing documents Elasticsearch client Semantic mappings Bulk helper Searching data Lexical Query Semantic Query Hybrid Query You can check the source code with the examples here . What is the Elasticsearch Node.js client? The Elasticsearch Node.js client is a JavaScript library that puts the HTTP REST calls from the Elasticsearch API into JavaScript. This makes it easier to handle and to have helpers that simplify tasks like indexing documents in batches. Environment Frontend, backend, or serverless? To create our search app using the JavaScript client, we need at least two components: an Elasticsearch cluster and a JavaScript runtime to run the client. The JavaScript client supports all Elasticsearch solutions (Cloud, on-prem, and Serverless), and there are no major differences among them since the client handles all variations internally, so you don’t need to worry about which one to use. The JavaScript runtime, however, must be run from the server and not directly from the browser. This is because when calling Elasticsearch from the browser, the user can get sensitive information like the cluster API key, host, or the query itself. Elasticsearch recommends never exposing the cluster directly to the internet and using an intermediate layer that abstracts all this information so that the user can only see parameters. You can read more about this topic here . We suggest using a schema like this: In this case, the client only sends the search terms and an authentication key for your server while your server is in total control over the query and communication with Elasticsearch. Connecting the client Start by creating an API key following these steps . Following the previous example, we’ll create a simple Express server, and we’ll connect to it using a client from a Node.JS server. We’ll initialize the project with NPM and install the Elasticsearch client and Express. The latter is a library to bring up servers in Node.js. Using Express, we can interact with our backend via HTTP. Let’s initialize the project: npm init -y Install dependencies: npm install @elastic/elasticsearch express split2 dotenv Let me break it down for you: @elastic/elasticsearch : It is the official Node.js client express : It will allow us to spin a lightweight nodejs server to expose Elasticsearch split2 : Splits lines of text into a stream. Useful to process our ndjson files one line at a time dotenv : Allow us to manage environment variables using a .env file Create a .env file at the root of the project and add the following lines: This way, we can import those variables using the dotenv package. Create a server.js file: This code sets up a basic Express.js server that listens on port 3000 and connects to an Elasticsearch cluster using an API key for authentication. It includes a /ping endpoint that when accessed via a GET request, queries the Elasticsearch cluster for basic information using the .info() method of the Elasticsearch client. 
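The original `.env` and `server.js` snippets did not survive extraction. A minimal sketch consistent with the description above might look like the following; the environment variable names are assumptions, and the code is an illustration rather than the article's original listing.

```javascript
// .env (values are placeholders)
// ELASTICSEARCH_ENDPOINT=https://your-deployment.es.region.cloud.es.io
// ELASTICSEARCH_API_KEY=your-api-key

// server.js — minimal sketch of the server described above
require("dotenv").config();
const express = require("express");
const { Client } = require("@elastic/elasticsearch");

const app = express();
app.use(express.json()); // parse JSON request bodies

const client = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});

// Health-check endpoint: returns basic cluster info via client.info()
app.get("/ping", async (req, res) => {
  try {
    const info = await client.info();
    res.json(info);
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});

app.listen(3000, () => console.log("Server listening on http://localhost:3000"));
```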
If the query is successful, it returns the cluster info in JSON format; otherwise, it returns an error message. The server also uses body-parser middleware to handle JSON request bodies. Run the file to bring up the server: node server.js The answer should look like this: And now, let’s consult the endpoint /ping to check the status of our Elasticsearch cluster. Indexing documents Once connected, we can index documents using mappings like semantic_text for semantic search and text for full-text queries. With these two field types, we can also do hybrid search . We’ll create a new load.js file to generate the mappings and upload the documents. Elasticsearch client We first need to instantiate and authenticate the client: Semantic mappings We’ll create an index with data about a veterinary hospital. We’ll store the information from the owner, the pet, and the details of the visit. The data on which we want to run full-text search, such as names and descriptions, will be stored as text. The data from categories, like the animal’s species or breed, will be stored as keywords. Additionally, we’ll copy the values of all fields into a semantic_text field to be able to run semantic search against that information too. Bulk helper Another advantage of the client is that we can use the bulk helper to index in batches. The bulk helper allows us to easily handle things like concurrence, retries, and what to do with each document that goes through the function and that succeeds or fails. An attractive feature of this helper is that you can work with streams. This function allows you to send a file line by line instead of storing the complete file in the memory and sending it to Elasticsearch in one go. To upload the data to Elasticsearch, create a file called data.ndjson in the project’s root and add the information below (alternatively, you can download the file with the dataset from here ): We use split2 to stream the file lines while the bulk helper sends them to Elasticsearch. The code above reads a .ndjson file line by line and bulk indexes each JSON object into a specified Elasticsearch index using the helpers.bulk method. It streams the file using createReadStream and split2 , sets up indexing metadata for each document, and logs any documents that fail to process. Once complete, it logs the number of successfully indexed items. Alternatively to the indexData function, you can upload the file directly via UI using Kibana, and use the upload data files UI. We run the file to upload the documents to our Elasticsearch cluster. node load.js Searching data Going back to our server.js file, we’ll create different endpoints to perform lexical, semantic, or hybrid search. In a nutshell, these types of searches are not mutually exclusive, but will depend on the kind of question you need to answer. Query type Use case Example question Lexical query The words or word roots in the question are likely to show up in the index documents. Token similarity between question and documents. I’m looking for a blue sport t-shirt. Semantic query The words in the question are not likely to show up in the documents. Conceptual similarity between question and documents. I’m looking for clothing for cold weather. Hybrid search The question contains lexical and/or semantic components. Token and semantic similarity between question and documents. I’m looking for an S size dress for a beach wedding. 
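Before wiring up the search endpoints, here is a sketch of the `load.js` bulk-indexing step described earlier, streaming `data.ndjson` through `split2` into the client's bulk helper. The index name is a placeholder and the snippet is an illustration of the approach, not the article's original code.

```javascript
// load.js — sketch of the bulk indexing described above
const fs = require("fs");
const split2 = require("split2");
const { Client } = require("@elastic/elasticsearch");
require("dotenv").config();

const client = new Client({
  node: process.env.ELASTICSEARCH_ENDPOINT,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY },
});

async function indexData() {
  // Stream data.ndjson line by line; the bulk helper batches, retries, and reports failures.
  const result = await client.helpers.bulk({
    datasource: fs.createReadStream("./data.ndjson").pipe(split2(JSON.parse)),
    onDocument: () => ({ index: { _index: "vet-visits" } }), // placeholder index name
    onDrop: (doc) => console.error("Failed to index:", doc),  // log documents that fail to process
  });
  console.log(`Indexed ${result.successful} documents`);
}

indexData().catch(console.error);
```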
The lexical parts of the question are likely to be part of titles and descriptions, or category names, while the semantic parts are concepts related to those fields. Blue will probably be a category name or part of a description, and beach wedding is not likely to be, but can be semantically related to linen clothing. Lexical query (/search/lexic?q=) Lexical search, also called full-text search, means searching based on token similarity; that is, after an analysis, the documents that include the tokens in the search will be returned. You can check our lexical search hands-on tutorial here . We test with: nail trimming Answer: Semantic query (/search/semantic?q=) Semantic search, unlike lexical search, finds results that are similar to the meaning of the search terms through vector search. You can check our semantic search hands-on tutorial here . We test with: Who got a pedicure? Answer: Hybrid query (/search/hybrid?q=) Hybrid search allows us to combine semantic and lexical search, thus getting the best of both worlds: you get the precision of searching by token, together with the meaning proximity of semantic search. We test with “ Who got a pedicure or dental treatment?\" Response: Conclusion In this first part of our series, we explained how to set up our environment and create a server with different search endpoints to query the Elasticsearch documents following the client/server best practices. Check out part two of our series, in which you’ll learn production best practices and how to run the Elasticsearch Node.js client in Serverless environments. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is the Elasticsearch Node.js client? Environment Frontend, backend, or serverless? Connecting the client Indexing documents Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch in JavaScript the proper way, part I - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/how-to-use-elasticsearch-in-javascript-part-i","meta_description":"Explaining how to create a production-ready Elasticsearch backend in JavaScript."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. Developer Experience ET LP By: Elastic Team and Logan Pashby On May 6, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. “When I first typed ‘drone’ into the search and saw results for ‘unmanned aerial vehicles’ without synonyms, I was like, ‘Wow, this thing really gets it.’ That’s when it clicked—it genuinely felt like magic.” — Logan Pashby, Principal Engineer, Cypris.ai Relevance at scale: Cypris’ search story Cypris is a platform that helps R&D and innovation teams navigate a massive dataset of patents and research papers of over 500 million documents. Their mission is to make it easier to track innovation, find prior art, and understand the organizations driving new technologies. But there was a problem. To get relevant results, users had to write complex boolean queries—which was fine for expert users, but a barrier for many others. Cypris needed a way to make search more intuitive and accessible. The answer was semantic search powered by vector similarity. However, they discovered that scaling semantic search over a large corpus turned out to be a tough engineering problem. Handling 500 million high dimensional vectors wasn’t just a matter of pushing them into a system and hitting “search.” “When we first indexed all 500 million vectors, we were looking at 30- to 60-second query times in the worst case.” It would require a series of carefully considered trade-offs between model complexity, hardware resources, and indexing strategy. Logan Pashby is a Principal Engineer at Cypris, where he focuses on the platform's innovation intelligence features. With expertise in topics such as deep learning, distributed systems, and full-stack development, Logan solves complex data challenges and develops efficient search solutions for R&D and IP teams. Choosing the right model Cypris’ first attempt at vector search used 750-dimensional embeddings for every document, but they quickly realized scaling such large embeddings across 500 million documents would be unmanageable. By using the memory approximation formula without quantization, the estimated bytes of RAM required would be around 1500 GB, making it clear that they needed to adjust their strategy. “We assumed, and we hoped, that the larger the dimension of the vector, the more information we could encode. A richer embedding space should mean better search relevance.” They considered using sparse vectors like Elastic’s ELSER which avoids the fixed-dimension limitations of dense embeddings by representing documents as weighted lists of tokens instead. 
However, at the time, ELSER’s CPU-only inference seemed too slow for Cypris’s dataset. Dense vectors, on the other hand, let them leverage off-cluster GPU acceleration, which improved throughput by 10x to 50x when generating embeddings. Cypris’ setup included an external GPU based service to compute vectors which were then indexed into Elasticsearch. The team ultimately decided on lower-dimensional dense vectors that struck a balance: they were compact enough to make indexing and search feasible, yet rich enough to maintain relevance in results. Making it work with production scale data Challenges - disk space Once Cypris had vectors ready to be indexed, they faced the next hurdle: efficiently storing and searching over them in Elasticsearch . The first step was reducing disk space. “At the end of the day, vectors are just arrays of floats.... But when you have 500 million of them, the storage requirements add up quickly.” By default, vectors in Elasticsearch are stored multiple times: first in the _source field (the original JSON document), then in doc_values (columnar storage optimized for retrieval), and finally within the HNSW graph itself. Given that each 750-dimensional float32 vector takes about 3KB, storing 500 million vectors quickly becomes problematic, potentially exceeding 1.5 terabytes per storage layer. One practical optimization Cypris used was excluding vectors from the source document in Elasticsearch. This helped reduce overhead, but it turned out disk space wasn’t the biggest challenge. The bigger challenge was memory management. Did You Know? Elasticsearch allows you to optimize disk space by excluding vectors from the source document. This can significantly reduce storage costs, especially when dealing with large datasets. However, be aware that excluding vectors from the source will impact reindexing performance. For more details, check out the Elasticsearch documentation on source filtering . Challenges - RAM explosion Known nearest neighbor (kNN) search in Elasticsearch relies on HNSW graphs, which perform best when fully loaded into RAM. With 500 million high-dimensional vectors, there were significant memory demands on the system. “Trying to fit all of those vectors in memory at query time was not an easy thing to do,” Logan adds. Cypris had to juggle multiple memory requirements: the vectors and their HNSW graphs needed to reside in off-heap memory for fast search performance, while the JVM heap had to remain available for other operations. On top of that, they still needed to support traditional keyword search, and the associated Elasticsearch inverted index would need to stay in memory as well. Managing memory with dimensionality reduction, quantization, and segments Cypris explored multiple approaches to better manage memory and storage, here were three that worked well: Lower-dimensional vectors : The Cypris team swapped to using a smaller model that reduced vector sizes, thereby lowering resource requirements. BBQ (Better Binary Quantization) : Cypris was considering int8 quantization, but when Elastic released BBQ, Cypris adopted it quickly. “We tested it out and it didn’t have a huge hit to relevance and was significantly cheaper. So we implemented it right away”, says Logan . BBQ immediately reduced the size of their vector indexes by around 20% ! Did You Know? Elasticsearch’s Binary Quantized Vectors (BBQ) can reduce the size of vector indexes by ~20%, with minimal impact on search relevance. 
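As a back-of-the-envelope check of the numbers above, and to show what excluding a vector field from `_source` looks like: the arithmetic below uses the figures from the text, while the index name (`patents`) and field name (`embedding`) are illustrative, assuming an `@elastic/elasticsearch` client instance (`client`).

```javascript
// Rough memory estimate for raw float32 vectors (ignores HNSW graph overhead and quantization)
const numVectors = 500_000_000;
const dims = 750;
const bytes = numVectors * dims * 4;            // 4 bytes per float32 dimension
console.log(`${(bytes / 1e12).toFixed(1)} TB`); // ≈ 1.5 TB, i.e. roughly the 1500 GB cited above

// Excluding the dense_vector field from _source so the raw floats are not stored a second time.
// As noted above, this reduces disk usage but makes reindexing harder, since the original
// vectors are no longer available in _source.
async function createIndex() {
  await client.indices.create({
    index: "patents", // illustrative index name
    mappings: {
      _source: { excludes: ["embedding"] },
      properties: {
        embedding: { type: "dense_vector", dims: 750 },
        title: { type: "text" },
      },
    },
  });
}
```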
BBQ reduces both disk usage—by shrinking index size—and memory usage, since smaller vectors take up less space in RAM during searches. It’s especially helpful when scaling KNN search with HNSW graphs, where keeping everything in memory is critical for performance. Explore how BBQ can optimize your search infrastructure in the Elasticsearch documentation on vector search. Segment and shard tuning: Cypris also optimized how Elasticsearch segments and shards were managed. HNSW graphs are built per segment, so searching dense vectors means querying across all segments in a shard. As Logan explains: “HNSW graphs are independent within each segment and each dense vector field search involves finding the nearest neighbors in every segment, making the total cost dependent on the number of segments.” Fewer segments generally mean faster searches—but aggressively merging them can slow down indexing. Since Cypris ingests new documents daily, they regularly force-merge segments to keep them slightly below the default 5GB threshold, preserving automatic merging and tombstone garbage collection. To balance search speed with indexing throughput, force-merging occurs during low-traffic periods, and shard sizes are maintained within a healthy range (below 50GB) to optimize performance without sacrificing ingestion speed. More vectors, faster searches, happy users With these optimizations, Cypris brought query times down from 30–60 seconds to 5–10 seconds . They are also seeing 60–70% of their user queries shift from the previous boolean search experience to the new semantic search interface. But the team is not stopping here! The goal is to achieve sub-second queries to support fast, iterative search and get most of their users to shift to semantic search. Cypris’ product handles 500M docs (or about 7TB+ data), providing real-time AI search and retrieval, and supports 30% quarterly company growth. The product significantly accelerated search use cases, cutting report generation from weeks down to minutes. What did the Cypris team learn? … and what’s next? 500 million vectors don’t scale themselves Handling 500 million vectors isn’t just a storage problem or a search problem—it’s both. Cypris had to balance search relevance, hardware resources, and indexing performance at every step. Did you know Elasticsearch's _search API includes a profile feature that allows you to analyze the execution time of search queries. This can help identify bottlenecks and optimize query performance. By enabling profiling, you can gain insights into how different components of your query are processed. Learn more about using the profile feature in the Elasticsearch search profiling documentation . With search, there’s always a trade-off BBQ was a major win, but it didn’t eliminate the need to rethink sharding, memory allocation, and indexing strategy. Reducing the number of segments improved search speed, but made indexing slower. Excluding vectors from the source reduced disk space but complicated reindexing, as Elasticsearch doesn’t retain the original vector data needed to efficiently recreate the index. Every optimization came with a cost that had to be carefully weighed. Prioritize your users, not the model Cypris didn’t chase the largest models or highest dimension vectors. They focused on what made sense for their users, and working backwards. “Figure out what relevance means for your data,” Logan advises. 
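Two of the levers discussed above, sketched with the JS client. The index and field names are illustrative, the `max_num_segments` target is a made-up example, and the `bbq_hnsw` index option requires a recent Elasticsearch version, so treat this as an assumption to verify against your deployment's documentation rather than a drop-in configuration.

```javascript
// Dense vector field backed by BBQ-quantized HNSW
async function createQuantizedIndex() {
  await client.indices.create({
    index: "patents-bbq", // illustrative
    mappings: {
      properties: {
        embedding: {
          type: "dense_vector",
          dims: 384,                           // a lower-dimensional model, as discussed above
          index: true,
          similarity: "cosine",
          index_options: { type: "bbq_hnsw" }, // BBQ quantization; verify availability in your version
        },
      },
    },
  });
}

// Periodic force merge during low-traffic hours to keep segment counts low
async function mergeSegments() {
  await client.indices.forcemerge({
    index: "patents-bbq",
    max_num_segments: 10, // illustrative target; Cypris keeps segments just under the 5GB threshold
  });
}
```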
“And work backward from there.” Cypris is now expanding to other datasets, which could double the number of documents they have to index in elastic. They need to move quickly to stay competitive, “We’re a small team,” Logan says. “So everything we do has to scale—and it has to work.” To learn more, visit cypris.ai Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins Jump to Relevance at scale: Cypris’ search story Choosing the right model Making it work with production scale data Challenges - disk space Challenges - RAM explosion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/building-hybrid-search-at-cypris","meta_description":"Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Connect Agents to Elasticsearch with Model Context Protocol Let’s use Model Context Protocol server to chat with your data in Elasticsearch. Agent How To JB JM By: Jedr Blaszyk and Joe McElroy On March 28, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
What if interacting with your data was as effortless as chatting with a colleague? Imagine simply asking, \"Show me all orders over $500 from last month\" or \"Which products received the most 5-star reviews?\" and getting instant, accurate answers, no querying required. Model Context Protocol (MCP) makes this possible. It seamlessly connects conversational AI with your databases and external APIs, transforming complex requests into natural conversations. While modern LLMs are great at understanding language, their true potential is unlocked when integrated with real-world systems. MCP bridges the gap between them, making data interaction more intuitive and efficient. In this post, we’ll explore: MCP architecture – How it works under the hood Benefits of an MCP server connected to Elasticsearch Building an Elasticsearch-powered MCP server Exciting times ahead! MCP's integration with your Elastic stack transforms how you interact with information, making complex queries as intuitive as everyday conversation. Model Context Protocol Model Context Protocol (MCP), developed by Anthropic, is an open standard that connects AI models to external data sources through secure, bidirectional channels. It solves a major AI limitation: real-time access to external systems while preserving conversation context. MCP architecture Model Context Protocol architecture consists of two key components: MCP Clients – AI assistants and chatbots that request information or execute tasks on behalf of users. MCP Servers – Data repositories, search engines, and APIs that retrieve relevant information or perform requested actions (e.g., calling external APIs). MCP Servers expose two primary capabilities to Clients: Resources - Structured data, documents, and content that can be retrieved and used as context for LLM interactions. This allows AI assistants to access relevant information from databases, search indexes, or other sources. Tools - Executable functions that enable LLMs to interact with external systems, perform computations, or take real-world actions. These tools extend AI capabilities beyond text generation, allowing assistants to trigger workflows, call APIs, or manipulate data dynamically. Prompts - Reusable prompt templates and workflows to standardize and share common LLM interactions. Sampling - Request LLM completions through the client to enable sophisticated agentic behaviors while maintaining security and privacy. MCP server + Elasticsearch Traditional Retrieval-Augmented Generation (RAG) systems retrieve documents based on user queries, but MCP takes it a step further: it enables AI agents to dynamically construct and execute tasks in real time. This allows users to ask natural language questions like: \"Show me all orders over $500 from last month.\" \"Which products received the most 5-star reviews?\" And get instant, precise answers, without writing a single query. MCP achieves this through: Dynamic tool selection – Agents intelligently choose the right tools exposed via MCP servers based on user intent. “Smarter” LLMs are generally better at selecting the right tools with the appropriate arguments based on context. Bidirectional communication – Agents and data sources exchange information fluidly, refining queries as needed (e.g. lookup index mapping first, only then construct the ES query). Multi-tool orchestration – Workflows can leverage tools from multiple MCP servers simultaneously. Persistent context – Agents remember previous interactions, maintaining continuity across conversations. 
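The blog's MCP server itself is written in TypeScript, but the "mapping first, query second" loop it enables boils down to a handful of Elasticsearch calls. Below is a hedged Python sketch of that sequence; the `orders` index and its field names are assumptions used purely for illustration.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# 1. Discover the available indices.
indices = es.cat.indices(format="json")

# 2. Inspect the mapping of the chosen index.
mapping = es.indices.get_mapping(index="orders")

# 3. Only then build and run a Query DSL request that fits the mapping,
#    e.g. "orders over $500 from last month" (field names assumed).
results = es.search(
    index="orders",
    query={
        "bool": {
            "filter": [
                {"range": {"total": {"gt": 500}}},
                {"range": {"order_date": {"gte": "now-1M/M", "lt": "now/M"}}},
            ]
        }
    },
)
```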
An MCP server connected to Elasticsearch unlocks a powerful real-time retrieval architecture. AI agents can explore, query, and analyze Elasticsearch data on demand. Your data can be searchable through a simple chat interface. Beyond just retrieving data, MCP enables action. It integrates with other tools to trigger workflows, automate processes, and feed insights into analytics systems. By separating search from execution, MCP keeps AI-powered applications flexible, up-to-date, and seamlessly integrated into agentic workflows. Hands on: MCP server to chat with your Elasticsearch data To interact with Elasticsearch via an MCP server, we need at least functions to: Retrieve indices Obtain mappings Perform searches using Elasticsearch’s Query DSL Our server is written in TypeScript, and we will be using the official MCP TypeScript SDK . For setup, we recommend installing the Claude Desktop App (the free version is sufficient) since it includes a built-in MCP Client. Our MCP server essentially exposes the official JavaScript Elasticsearch client through MCP tools. Let’s start by defining the Elasticsearch client and MCP server: We will use following MCP server tools that can interact with Elasticsearch: List Indices ( list_indices ): This tool retrieves all available Elasticsearch indices, providing details such as index name, health status, and document count. Get Mappings ( get_mappings ): This tool fetches the field mappings for a specified Elasticsearch index, helping users understand the structure and data types of stored documents. Search ( search ): This tool executes an Elasticsearch search using a provided Query DSL. It automatically enables highlights for text fields, making it easier to identify relevant search results. The full Elasticsearch MCP server implementation is available in the elastic/mcp-server-elasticsearch repo. Chat with your index Let's explore how to set up the Elasticsearch MCP server so you can ask natural language questions about your data, such as \"Find all orders over $500 from last month.\" Configure your Claude Desktop App Open the Claude Desktop App Navigate to Settings > Developer > MCP Servers Click \"Edit Config\" and add this configuration to your claude_desktop_config.json : Note: This setup utilizes the @elastic/mcp-server-elasticsearch npm package published by Elastic. If you want to develop locally, you can find more details on spinning up the Elasticsearch MCP server here . Populate your Elasticseach index You can use our example data to populate the \"orders\" index for this demo This will allow you to try queries like \"Find all orders over $500 from last month\" Start using it Open a new conversation in the Claude Desktop App The MCP server will connect automatically Start asking questions about your Elasticsearch data! Check out this demo to see how easy it is to query your Elasticsearch data using natural language: How does it work? When asked 'Find all orders over $500 from last month,' the LLM recognizes the intent of searching the Elasticsearch index with specified constraints. To perform an effective search, the agent figures to: Figure out the index name: orders Understand the mappings of orders index Build the Query DSL compatible with index mappings and finally execute the search request This interaction can be represented as: Conclusion Model Context Protocol enhances how you interact with Elasticsearch data, enabling natural language conversations instead of complex queries. 
By bridging AI capabilities with your data, MCP creates a more intuitive and efficient workflow that maintains context throughout your interactions. The Elasticsearch MCP server is available as a public npm package ( @elastic/mcp-server-elasticsearch ), making integration straightforward for developers. With minimal setup, your team can start exploring data, triggering workflows, and gaining insights through simple conversations. Ready to experience it for yourself? Try out the Elasticsearch MCP server today and start chatting with your data. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Model Context Protocol MCP architecture MCP server + Elasticsearch Hands on: MCP server to chat with your Elasticsearch data Chat with your index Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Connect Agents to Elasticsearch with Model Context Protocol - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/model-context-protocol-elasticsearch","meta_description":"Learn about the Model Context Protocol (MCP), its benefits with Elasticsearch, and how to use an Elasticsearch MCP server to chat with your data."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. Integrations Python How To JR By: Jeffrey Rengifo On April 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . 
To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, you will learn how to leverage LlamaIndex Workflows with Elasticsearch to quickly build a self-filtering search application using LLM. LlamaIndex Workflows propose a different approach to the issue of splitting tasks into different agents by introducing a steps-and-events architecture . This simplifies the design compared to similar methodologies based on DAG (Directed Acyclic Graph) like LangGraph. If you want to read more about agents in general, I recommend you read this article. Image Source: https://www.llamaindex.ai/blog/introducing-workflows-beta-a-new-way-to-create-complex-ai-applications-with-llamaindex One of the main LlamaIndex features is the capacity to easily create loops during execution. Loops can help us with autocorrect tasks since we can repeat a step until we get the expected result or reach a given number of retries. To test this feature, we’ll build a flow to generate Elasticsearch queries based on the user’s question using a LLM with an autocorrect mechanism in case the generated query is not valid. If after a given amount of attempts the LLM cannot generate a valid query, we’ll change the model and keep trying until timeout. To optimize resources, we can use the first query with a faster and cheaper model, and if the generation still fails, we can use a more expensive one. Understanding steps and events A step is an action that needs to be run via a code function. It receives an event together with a context, which can be shared by all steps. There are two types of base events: StartEvent , which is a flow-initiating event, and StopEven, to stop the event’s execution. A Workflow is a class that contains all the steps and interactions and puts them all together. We’ll create a Workflow to receive the user’s request, expose mappings and possible fields to filter, generate the query, and then make a loop to fix an invalid query. A query could be invalid for Elasticsearch because it does not provide valid JSON or because it has syntax errors. To show you how this works, we’ll use a practical case of searching for hotel rooms with a workflow to extract values to create queries based on the user’s search. The complete example is available in this Notebook . Steps Install dependencies and import packages Prepare data Llama-index workflows Execute workflow tasks 1. Install dependencies and import packages We’ll use mistral-saba-24b and llama3-70b Groq models, so besides elasticsearch and llama-index , we’ll need the llama-index-llms-groq package to handle the interaction with the LLMs. Groq is an inference service that allows us to use different open available models from providers like Meta, Mistral, and OpenAI. In this example, we’ll use its free layer . You can get the API KEY that we’ll use later here . Let’s proceed to install the required dependencies: Elasticsearch, the LlamaIndex core library, and the LlamaIndex Groq LLM’s package. We start by importing some dependencies to handle environment variables (os), and managing JSON. After that, we import the Elasticsearch client with the bulk helper to index using the bulk API. We finish by importing the Groq class from LlamaIndex to interact with the model, and the components to create our workflow. 2. Prepare data Setup keys We set the environment variables needed for Groq and Elasticsearch. The getpass library allows us to enter them via a prompt without echoing them. 
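The notebook's own setup cells are not reproduced in this text, so the following is a hedged sketch of what the install, imports, and key prompts could look like; the exact environment-variable names and workflow imports may differ from the original notebook.

```python
# %pip install elasticsearch llama-index llama-index-llms-groq

import os
import json
from getpass import getpass

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk
from llama_index.llms.groq import Groq
from llama_index.core.workflow import (
    Context,
    Event,
    StartEvent,
    StopEvent,
    Workflow,
    step,
)

# Prompt for credentials without echoing them to the terminal.
os.environ["GROQ_API_KEY"] = getpass("Groq API key: ")
os.environ["ELASTIC_ENDPOINT"] = getpass("Elasticsearch endpoint: ")
os.environ["ELASTIC_API_KEY"] = getpass("Elasticsearch API key: ")
```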
Elasticsearch client The Elasticsearch client handles the connection with Elasticsearch and allows us to interact with Elasticsearch using the Python library. Ingesting data to Elasticsearch We are going to create an index with hotel rooms as an example: Mappings We’ll use text-type fields for the properties where we want to run full-text queries; “keyword” for those where we want to apply filters or sorting, and “byte/integer” for numbers. Ingesting documents to Elasticsearch Let’s ingest some hotel rooms and amenities so users can ask questions that we can turn into Elasticsearch queries against the documents. We parse the JSON documents into a bulk Elasticsearch request. 3. LlamaIndex Workflows We need to create a class with the functions required to send Elasticsearch mapping to the LLM, run the query, and handle errors. Workflow prompts The EXTRACTION_PROMPT will provide the user’s question and index the mappings to the LLM so it can return an Elasticsearch query. Then, the REFLECTION_PROMPT will help the LLM make corrections in case of errors by providing the output from the EXTRACTION_PROMPT , plus the error caused by the query. Workflow events We created classes to handle extraction and query validation events: Workflow Now, let’s put everything together. We first need to set the maximum number of attempts to change the model to 3. Then, we will do an extraction using the model configured in the workflow. We validate if the event is StartEvent ; if so, we capture the model and question (passage). Afterward, we run the validation step, that is, trying to run the extracted query in Elasticsearch. If there are no errors, we generate a StopEvent and stop the flow. Otherwise, we issue a ValidationErrorEvent and repeat step 1, providing the error to try to correct it and return to the validation step. If there is no valid query after 3 attempts, we change the model and repeat the process until we reach the timeout parameter of 60s running time. 4. Execute workflow tasks We will make the following search: Rooms with smart TV, wifi, jacuzzi and price per night less than 300 . We’ll start using the mistral-saba-24b model and switch to llama3-70b-8192 , if needed, following our flow. Results (Formatted for readability) === EXTRACT STEP === MODEL: mistral-saba-24b OUTPUT: Step extract produced event ExtractionDone Running step validate === VALIDATE STEP === Max retries for model mistral-saba-24b reached, changing model Elasticsearch results: Step validate produced event ValidationErrorEvent Running step extract === EXTRACT STEP === MODEL: llama3-70b-8192 OUTPUT: Step extract produced event ExtractionDone Running step validate === VALIDATE STEP === Elasticsearch results: Step validate produced event StopEvent In the example above, the query failed because the mistral-saba-24b model returned it in markdown format, adding ```json at the beginning and ``` at the end. In contrast, the llama3-70b-8192 model directly returned the query using the JSON format. Based on our needs, we can capture, validate, and test different errors or build fallback mechanisms after a number of attempts. Conclusion The LlamaIndex workflows offer an interesting alternative to develop agentic flows using events and steps. With only a few lines of code, we managed to create a system that is able to autocorrect with interchangeable models. How could we improve this flow? 
Along with the mappings, we can send to the LLM possible exact values for the filters, reducing the number of no result queries because of misspelled filters. To do so, we can run a terms aggregation on the features and show the results to the LLM. Adding code corrections to common issues—like the Markdown issue we had—to improve the success rate. Adding a way to handle valid queries that yield no results. For example, remove one of the filters and try again to make suggestions to the user. A LLM could be helpful in choosing which filters to remove based on the context. Adding more context to the prompt, like user preferences or previous searches, so that we can provide customized suggestions together with the Elasticsearch results. Would you like to try one of these? Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Understanding steps and events Steps 1. Install dependencies and import packages 2. Prepare data Setup keys Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Using LlamaIndex Workflows with Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/llamaindex-workflows-with-elasticsearch","meta_description":"Learn how to create an Elasticsearch-based step for your LlamaIndex workflow."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Engineering a new Kibana dashboard layout to support collapsible sections & more Building collapsible dashboard sections in Kibana required overhauling an embeddable system and creating a custom layout engine. These updates improve state management, hierarchy, and performance while setting the stage for new advanced dashboard features. Developer Experience TS HM NR By: Teresa Alvarez Soler , Hannah Mudge and Nathaniel Reese On January 22, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We are developing collapsible sections to hide and show panels in your Kibana dashboards to help organize content and improve performance. It’s a classic software development tale: sometimes to go forward you have to go…down? Read about how building an in-demand feature that seemed straightforward can sometimes lead you to bigger simplifications than you ever intended! 😅 Kibana dashboard collapsible sections A little bit of background: Dashboards in Kibana can contain many visualizations that Site Reliability Engineers (SREs) use to keep systems running or Security Analysts use in their investigations. These dashboards can be lengthy and slow to load. Users want to better organize dashboard content to avoid performance pitfalls and make them easier to scan. Today, the best way to accomplish this is to split dashboard content into multiple dashboards and then link them using the dashboard links panel to facilitate navigation. This unfortunately doesn’t let you see things side-by-side and makes updates and dashboard maintenance require a lot of effort from the dashboard author. To solve this need, we are developing collapsible sections to hide and show panels in your Kibana dashboards –these sections help organize content and don’t load content that is collapsed to improve performance. These new sections will allow you to group dashboard panels and data visualizations that are thematically related making it easier to find the information you are looking for. Most importantly, you can easily hide and expand these sections allowing you to load only the data that you need. This will help you create side-by-side comparisons for your charts and streamline dashboard performance. Planning the engineering approach At the onset when looking at what our customers wanted the feature seemed like a business-as-usual-sized engineering effort: Dashboards contain panels (more about those in a moment) and they are to be organized into sections and the product requirements ask that we only render them when a section is open. There’s also a drag and drop system to lay out a dashboard and it needs to account for these sections and handle a variety of moving-things-between-sections sort of use cases. Seems well in hand as an enhancement to existing code, right? Well unfortunately after a short proof of concept, we found the answer is no. It’s not that simple. Kibana uses an “embeddable” framework and this framework lacks the qualities needed to not render certain embedded objects on a dashboard. 
Let's take a look at why… What is an “embeddable”? Even though \"embeddable\" does not appear in the navigation menu alongside \"Discover\" and \"Dashboard\", you interact with embeddables throughout Kibana. The histogram in Discover, each panel in a Dashboard, a panel’s context menu, a Lens chart in Observability, or a Map in Security - all made possible with embeddables. Embeddables are React components that provide an API to deeply integrate with Kibana. This API allows them to be persisted and restored by any page, gives them access to the current search context, allows them to define editing UI, and is extensible so engineers can define how components interact with one another. They live in a registry, which separates their behaviours from where the code is written. Because of this, many engineers can work on different embeddables at the same time without getting in each other’s way. The need for a new embeddable system The legacy embeddable system we were working on at the time dates back to 2018. embeddable functionality is exposed through a custom user experience component abstraction. At the time, Kibana was transitioning from Angular 1 to React, so the embeddable system was designed to be framework agnostic which could smooth a theoretical transition away from React. While the architecture was required at the time, Kibana has changed a lot since then, and a move away from React is unlikely. Now, the inflexible and agnostic Embeddable architecture is a growing point of friction. Some pain points are: Complex state management: All state in an embeddable goes through one of two observables (input, output) in order to be inherited, set, or read. This requires consumers to set up complex two-way sync pipes. Limited inheritance: embeddables can have exactly one parent, limiting inheritance to a single level of hierarchy. Additionally, embeddable state flows from the parent to child, with child state overriding parent state if defined. Manual rendering: embeddables need a cumbersome manual render process and a compatibility layer between the rest of Kibana, which renders via React. Collapsible sections are not possible with a single level of hierarchy. Collapsible sections require multiple levels of hierarchy to allow panels to belong to the dashboard and a collapsible section . Otherwise, you wouldn’t be able to place panels into a collapsible section. New embeddable system So, to deliver this feature, we actually had to go “down” to the embeddable system itself and modernize how we manage embeddables: We had to design a new embeddable system. Fun! But also…..scope! The new embeddable functionality is exposed through plain old JavaScript objects and can compose their functionality by implementing interfaces. For example, an embeddable can communicate data loading by implementing the PublishesDataLoading interface. This offers the following benefits: Clean state management: Each piece of state is exposed as a read-only observable. Setter methods can be exposed for mutable state. Flexible inheritance: embeddables can have a chain of parents, allowing for as many levels of hierarchy as required. Each layer retains its own state so that the decision of which state to use can be determined at the time of consumption. With a system that tolerates the inheritance we need, collapsible sections can now be built. 
However, like any good refactor there’s a bit of a catch: embeddables are everywhere in Kibana and to implement this change without causing regressions we needed to migrate to the new embeddable system across Elastic’s full experience–from the Alerts page in Elastic Security to the Service Inventory in Elastic Observability and nearly everything in between. This has taken us some time but allows for some exciting new possibilities. New layout engine The driving force behind any Dashboard is the layout engine, which is the thing that allows panels to be dragged around and resized — without it, Dashboards would be entirely static (and boring)! Currently, Kibana uses the external react-grid-layout package to drive our Dashboards, which is an open-source layout engine managed by a small group of volunteers. This layout engine has worked great for our Dashboards up to this point; however, it is unfortunately missing critical features that would make collapsible sections possible out-of-the-box: either “panels within panels” or dragging panels across two separate instances of a layout. Due to the small team behind react-grid-layout, updates to the package are infrequent — this means that, even if we started contributing directly to react-grid-layout in order to add the features we need, incorporating these changes into Kibana Dashboards would be slow and unreliable. While we briefly considered making a Kibana-specific branch of react-grid-layout in order to get updates published at a pace that matched our development, the maintenance costs and inflexibility of this ultimately led us to discard this idea. After researching alternative layout engine packages, we decided that the best path forward would be to develop our own, internal layout engine — one that was built specifically with the Kibana Dashboard use case in mind! Work on this new layout engine, which we are calling kbn-grid-layout , has already started. To our knowledge, this is the first layout engine available that makes use of the native CSS grid in order to position its panels — all other layout engines that we found in our research relied on pixel-level transforms or absolute positioning. This makes it a lot easier to understand how panels are placed on a dashboard. kbn-grid-layout uses passive event handlers for all dragging and resizing events, with an emphasis on reducing the number of re-renders to a minimum during these actions to improve performance. Because we are in control of these event handlers, this allows us to focus on the user experience much more than we previously could, and we’ve added features such as auto-scrolling when dragging near the top or bottom of the screen, and locking the height of the grid during resize events to prevent unexpected behavior that could result from the browser responding to height changes before the resize event was complete. Drag event Resize event We are currently working on refining the implementation, which includes improving the management of collapsible sections, adding keyboard support for dragging and resizing (which is not currently supported by Kibana dashboards), and much more. Not only will this new layout engine unlock the ability to add collapsible sections, it is being built with accessibility and efficiency at the forefront — which means the entire Dashboard experience should be improved once we make the final layout engine swap from react-grid-layout to kbn-grid-layout ! 
react-grid-layout kbn-grid-layout Check it out before the release We’re nearly out of the embeddable woods and ready to enjoy the fruits of our labors with all of our customers from weekly-releasing Elastic Serverless to our selfhosted users. Our customers will be able to design a single dashboard with many sections that can be collapsed by default allowing an investigation to only load panel content that’s needed while keeping lengthy dashboards tidy. If you want to provide us feedback or sign up for early testing please let us know ! We will announce when this feature is ready to be used in the next few months. Stay tuned! Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Developer Experience May 6, 2025 Built with Elastic: Hybrid search for Cypris – the world’s largest innovation database Dive into Logan Pashby's story at Cypris, on building hybrid search for the world's largest innovation database. ET LP By: Elastic Team and Logan Pashby Developer Experience April 18, 2025 Kibana Alerting: Breaking past scalability limits & unlocking 50x scale Kibana Alerting now scales 50x better, handling up to 160,000 rules per minute. Learn how key innovations in the task manager, smarter resource allocation, and performance optimizations have helped break past our limits and enabled significant efficiency gains. MC By: Mike Cote Jump to Kibana dashboard collapsible sections Planning the engineering approach What is an “embeddable”? The need for a new embeddable system New embeddable system Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Engineering a new Kibana dashboard layout to support collapsible sections & more - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/kibana-dashboard-build-layout","meta_description":"Explore collapsible sections in Kibana dashboards and how we engineered them to organize content and boost performance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to build autocomplete feature on search application automatically using LLM generated terms Learn how to enhance your search application with an automated autocomplete feature in Elastic Cloud using LLM-generated terms for smarter, more dynamic suggestions. Generative AI Search Relevance How To MS By: Michael Supangkat On March 5, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Autocomplete is a crucial feature in search applications, enhancing user experience by providing real-time suggestions as users type. Traditionally, autocomplete in Elasticsearch is implemented using the completion suggester , which relies on predefined terms. This approach requires manual curation of suggestion terms and often lacks contextual relevance. By leveraging LLM-generated terms via OpenAI’s completion endpoint , we can build a more intelligent, scalable, and automated autocomplete feature. Supercharge your search autocomplete using LLM In this article, we’ll explore: Traditional method of implementing autocomplete in Elasticsearch. How integrating OpenAI’s LLM improves autocomplete suggestions. Scaling the solution using Ingest Pipeline and Inference Endpoint in Elastic Cloud . Traditional autocomplete in Elasticsearch The conventional approach to building autocomplete in Elasticsearch involves defining a completion field in the index mapping. This allows Elasticsearch to provide suggestions based on predefined terms. This would be straightforward to implement, especially if you have already built a comprehensive suggestion list for a fairly static dataset. Implementation Steps Create an index with a completion field. Manually curate suggestion terms and store them in the index. Query using a completion suggester to retrieve relevant suggestions. Example: Traditional autocomplete setup First, create a new index named products_test. In this index, we define a field called suggest of type completion, which is optimized for fast autocomplete suggestions. Insert a test document into the products_test index. The suggest field stores multiple completion suggestions. Finally, we use the completion suggester query to search for suggestions starting with \"MacB.\" The prefix \"MacB\" will match \"MacBook Air M2.\" The suggest section contains matched suggestions. Options contain an array of matching suggestions, where \"text\": \"MacBook Air M2\" is the top suggestion. While effective, this method requires manual curation, constant updates to suggestion terms and does not adapt dynamically to new products or descriptions. Enhancing autocomplete with OpenAI LLM In some use cases, datasets change frequently, which requires you to continuously update a list of valid suggestions. If new products, names, or terms emerge, you have to manually add them to the suggestion list. This is where LLM steps in, as it can dynamically generate relevant completions based on real-world knowledge and live data. 
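Before moving on to the LLM-powered approach, here is a hedged Python sketch of the traditional `products_test` setup walked through above (the blog itself uses Kibana Dev Tools); the extra suggestion terms are illustrative.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# 1. Index with a completion-type field, as described above.
es.indices.create(
    index="products_test",
    mappings={"properties": {"suggest": {"type": "completion"}}},
)

# 2. A document whose suggest field stores several completion suggestions.
es.index(
    index="products_test",
    id="1",
    document={
        "name": "MacBook Air M2",
        "suggest": ["MacBook Air M2", "Apple laptop", "M2 notebook"],
    },
    refresh=True,
)

# 3. Completion suggester query: the prefix "MacB" matches "MacBook Air M2".
resp = es.search(
    index="products_test",
    suggest={
        "product-suggest": {
            "prefix": "MacB",
            "completion": {"field": "suggest"},
        }
    },
)
print(resp["suggest"]["product-suggest"][0]["options"][0]["text"])
```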
By leveraging OpenAI’s completion endpoint , we can dynamically generate autocomplete suggestions based on product names and descriptions. This allows for: Automatic generation of synonyms and related terms . Context-aware suggestions derived from product descriptions. No need for manual curation , making the system more scalable. Steps to implement LLM-powered autocomplete Create an inference endpoint using OpenAI’s completion API. Set up an Elasticsearch ingest pipeline that queries OpenAI for suggestions using a pre-defined prompt using a script processor Store the generated terms in an Elasticsearch index with a completion field. Use a search request to fetch dynamic autocomplete results. All the steps above can be easily completed by copying and pasting the API requests step by step in the Kibana Dev tool. In this example, we will be using the gpt-4o-mini model. You will need to get your OpenAI API key for this step. Login to your OpenAI account and navigate to https://platform.openai.com/api-keys . Next, create a new secret key or use an existing key. Creating an inference endpoint First, we create an inference endpoint. This allows us to interact seamlessly with a machine learning model (in this case OpenAI) via API, while still working within Elastic’s interface. Setting up the Elasticsearch ingest pipeline By setting up an ingest pipeline, we can process data upon indexing. In this case, the pipeline is named autocomplete-LLM-pipeline and it contains: A script processor, which defines the prompt we are sending to OpenAI to get our suggestion list. Product name and product description are included as dynamic values in the prompt. An inference processor , which refers to our OpenAI inference endpoint. This processor takes a prompt from the script processor as input, sends it to the LLM model, and stores the result in an output field called results . A split processor, which splits the text output from LLM within the results field into a comma-separated array to fit the format of a completion type field of suggest . 2 remove processors, which remove the prompt and results field after the suggest field has been populated. Indexing sample documents For this example, we are using the documents API to manually index documents from the dev tool to a temporary index called ‘products’. This is not the autocomplete index we will be using. Creating index with completion type mapping Now, we are creating the actual autocomplete index which contains the completion type field called suggest . Reindexing documents to a designated index via the ingest pipeline In this step, we are reindexing data from our products index created previously to the actual autocomplete index products_with_suggestion , through our ingest pipeline autocomplete-LLM-Pipeline . The pipeline will process the sample documents from the original index and populate the autocomplete suggest field in the destination index. Sample autocomplete suggestions As shown below, the new index (products_with_suggestion) now includes a new field called suggest , which contains an array of terms or synonyms generated by OpenAI LLM. You can run the following request to check: Results: Take note that the generated terms from LLM are not always the same even if the same prompt was used. You can check the resulting terms and see if they are suitable for your search use case. Else, you have the option to modify the prompt in your script processor to get more predictable and consistent suggestion terms. 
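The request bodies for these steps are not included in this text, so the following Python sketch approximates them. The prompt wording, field names, and the exact inference-processor options are assumptions, and the Inference API client method names vary slightly across 8.x releases.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Inference endpoint backed by OpenAI's gpt-4o-mini (API key assumed;
# the client method name may differ slightly between 8.x releases).
es.inference.put(
    task_type="completion",
    inference_id="openai_completion",
    inference_config={
        "service": "openai",
        "service_settings": {"api_key": "<OPENAI_API_KEY>", "model_id": "gpt-4o-mini"},
    },
)

# Ingest pipeline: script -> inference -> split -> remove, mirroring the
# processors described above (prompt text and field names are illustrative).
es.ingest.put_pipeline(
    id="autocomplete-LLM-pipeline",
    processors=[
        {
            "script": {
                "source": """
                  ctx.prompt = 'Generate a short comma-separated list of autocomplete terms and synonyms for the product: '
                    + ctx.name + '. Description: ' + ctx.description;
                """
            }
        },
        {
            "inference": {
                "model_id": "openai_completion",
                "input_output": {"input_field": "prompt", "output_field": "results"},
            }
        },
        {"split": {"field": "results", "separator": ",", "target_field": "suggest"}},
        {"remove": {"field": ["prompt", "results"]}},
    ],
)

# Reindex the source documents through the pipeline into the autocomplete index.
es.reindex(
    source={"index": "products"},
    dest={"index": "products_with_suggestion", "pipeline": "autocomplete-LLM-pipeline"},
)
```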
Testing the autocomplete search Now, we can test the autocomplete functionality using the completion suggester query. The example below also includes a fuzzy parameter to enhance the user experience by handling minor misspellings in the search query. You can execute the query below in the dev tool and check the suggestion results. To visualize the autocomplete results, I have implemented a simple search bar that executes a query against the autocomplete index in Elastic Cloud using our client. The search returns result based on terms in the suggestion list generated by LLM as you type. Scaling with OpenAI inference integration By using OpenAI’s completion API as an inference endpoint within Elastic Cloud , we can scale this solution efficiently: Inference endpoint allows automated and scalable LLM suggestions without having to manually create and maintain your own list. Ingest Pipeline ensures real-time enrichment of data during indexing. Script Processor within the ingest pipeline allows easy editing of the prompt in case there is a need to customise the nature of the suggestion list in a more specific way. Pipeline execution can also be configured directly upon ingestion as an index template for further automation. This enables the suggestion list to be built on the fly as new products are added to the index. In terms of cost efficiency, the model is only invoked during the ingestion process, meaning its usage scales with the number of documents processed rather than the search volume. This can result in significant cost savings compared to running the model at search time if you are expecting growth in users or search activity. Conclusion Traditionally, autocomplete relies on manually defined terms, which can be limiting and labour intensive. By leveraging OpenAI’s LLM-generated suggestions, we have the option to automate and enhance autocomplete functionality, improving search relevance and user experience. Furthermore, using Elastic’s ingest pipeline and inference endpoint integration ensures an automated, scalable autocomplete system. Overall, if your search use case requires a very specific set of suggestions from a well maintained and curated list, ingesting the list in bulk via our API conventionally as described in the first part of this article would still be a great and performant option. If managing and updating a suggestion list is a pain point, an LLM-based completion system removes that burden by automatically generating contextually relevant suggestions—without any manual input. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. 
DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Jump to Supercharge your search autocomplete using LLM Traditional autocomplete in Elasticsearch Implementation Steps Example: Traditional autocomplete setup Enhancing autocomplete with OpenAI LLM Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to build autocomplete feature on search application automatically using LLM generated terms - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-autocomplete-using-llm","meta_description":"Learn how to enhance your search app with an automated autocomplete feature in Elastic Cloud using LLM-generated terms."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to optimize RAG retrieval in Elastisearch with DeepEval Learn how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval. Generative AI How To KV By: Kritin Vongthongsri On March 17, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. LLMs are prone to hallucinations, lack domain-specific expertise, and are limited by their context windows. Retrieval-Augmented Generation (RAG) addresses these issues by enabling LLMs to access relevant external context, thereby grounding their responses. Several RAG methods—such as GraphRAG and AdaptiveRAG—have emerged to improve retrieval accuracy. However, retrieval performance can still vary depending on the domain and specific use case of a RAG application. To optimize retrieval for a given use case, you'll need to identify the hyperparameters that yield the best quality. This includes the choice of embedding model, the number of top results (top-K), the similarity function, reranking strategies, and more. Optimizing retrieval means evaluating and iterating on these hyperparameters until you find the best performing combination. In this blog, we'll explore how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval. We’ll begin by installing Elasticsearch and DeepEval: Measuring retrieval performance To optimize your Elasticsearch retriever and benchmark each hyperparameter combination, you’ll need a method for assessing retrieval quality. 
Here are 3 key metrics that allow you to measure retrieval performance: contextual precision, contextual recall, and contextual relevancy. Contextual precision: The contextual precision metric checks if the most relevant information chunks are ranked higher than the less relevant ones, for a given input. Simply put, it ensures that the most useful information is first in the set of retrieved contexts. Contextual recall: The contextual recall metric measures how well the retrieved information chunks aligns with the expected output, or ideal LLM response. A higher contextual recall score indicates that the retrieval system is more effective at capturing every piece of relevant information available in your knowledge base. Contextual relevancy: Finally, the contextual relevancy metric assesses how relevant the information in the retrieval context is to the user input of your RAG application. … A combination of these 3 metrics is essential to ensure that the retriever fetches the right amount of information in the proper sequence, and that your LLM receives clean, well-organized data for generating accurate outputs. Ideally, you’ll want to find the combination of hyperparameters that yields the highest scores across all three metrics. However, in some cases, increasing the recall score may inevitably decrease relevance. Striking the right balance between these factors is key to achieving optimal performance. If you need custom metrics for a specific use case, G-Eval and DAG might be worth exploring. These tools allow you to define precise metrics with tailored evaluation criteria. Here are some resources that might help you better understand how these metrics are calculated: How is contextual precision calculated How is contextual recall calculated How is contextual relevancy calculated Evaluating Retrieval in RAG applications Elasticsearch hyperparameters to optimize Elasticsearch provides extensive flexibility in retrieving information for RAG pipelines, offering a wide range of hyperparameters that can be fine-tuned to optimize retrieval performance. In this section, we’ll discuss some of these hyperparameters. Before retrieval: To structure your data optimally before inserting it into your Elasticsearch vector database, you can fine-tune parameters such as chunk size and chunk overlap. Additionally, selecting the right embedding model ensures efficient and meaningful vector representations. During retrieval: Elasticsearch gives you full control over the retrieval process. You can configure the similarity function, first determining the number of candidates for the approximate search before applying KNN on the top-K candidates. Then, you select the most relevant top-K results. Moreover, you can define your retrieval strategy—whether it's semantic (leveraging vector embeddings), text-based (using query rules), or a hybrid approach that combines both methods. After retrieval: Once results are retrieved, Elasticsearch allows you to refine them further through reranking. You can select a reranking model, define a reranking window, set a minimum score threshold, and more—ensuring that only the most relevant results are prioritized. … Different hyperparameters influence certain metric scores more than others. For example, if you're seeing issues with contextual relevance, it’s likely due to a specific set of hyperparameters, including top-K. By mapping specific hyperparameters to each metric, you can iterate more efficiently and fine-tune your retrieval pipeline with greater precision. 
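As a hedged sketch, the three metrics above map directly onto DeepEval classes; the thresholds below are illustrative, and each metric uses an LLM judge, so an evaluation model (for example an OpenAI key) needs to be configured separately.

```python
from deepeval.metrics import (
    ContextualPrecisionMetric,
    ContextualRecallMetric,
    ContextualRelevancyMetric,
)

# Thresholds are illustrative; tune them to your own quality bar.
contextual_precision = ContextualPrecisionMetric(threshold=0.7)
contextual_recall = ContextualRecallMetric(threshold=0.7)
contextual_relevancy = ContextualRelevancyMetric(threshold=0.7)
```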
Below is a table outlining which retrieval metrics are impacted by different hyperparameters: Metric Hyperparameter Contextual Precision Reranking model, reranking window, reranking threshold Contextual Recall Retrieval strategy (text vs embedding), embedding model, candidate count, similarity function, top-K Contextual Relevancy top-K, chunk size, chunk overlap In the next section, we'll walk through how to evaluate and optimize our Elasticsearch retriever with code examples. We'll use the `\"all-MiniLM-L6-v2\"` to embed our text documents, set `top-K` to 3, and configure the number of candidates to 10. Setting up RAG with Elastic Retriever To get started, connect to your local or cloud-based Elastic cluster: Next, create an Elasticsearch index with the appropriate type mappings to store text and embeddings as dense vectors. To insert your document chunks into the Elastic index, first encode into vectors using an embedding model. For this example, we’re using \" all-MiniLM-L6-v2 \". Finally, define a retriever function to search from your elastic client for your RAG pipeline. Evaluating your RAG retriever With your Elasticsearch retriever set up, you can begin evaluating it as part of your RAG pipeline. The evaluation consists of 2 steps: Preparing an input query along with the expected LLM response, and using the input to generate a response from your RAG pipeline to create an LLMTestCase containing the input, actual output, expected output, and retrieval context. Evaluating the test case using a selection of retrieval metrics. Preparing a test case Here, we prepare an input asking \"How does Elasticsearch work?\" with the corresponding expected output: \"Elasticsearch uses dense vector and sparse vector similarity for semantic search.\" Let's examine the actual_output generated by our RAG pipeline: Finally, consolidate all test case parameters into a single LLM test case. Running evaluations To run evaluations on your elastic retriever, pass the test case and metrics we’ve defined earlier into the evaluate function. Optimizing the Retriever Once you’ve evaluated your test case, we can begin to analyze the results. Below are example evaluation results from the test case we created, as well as additional hypothetical queries a user might ask your RAG system. Query Contextual precision Contextual recall Contextual relevancy \"How does Elasticsearch work?\" 0.63 0.93 0.52 \"Explain Elasticsearch's indexing method.\" 0.57 0.87 0.49 \"What makes Elasticsearch efficient for search?\" 0.65 0.90 0.55 Contextual precision is suboptimal → Some retrieved contexts might be too generic or off-topic. Contextual recall is strong → Elasticsearch is retrieving enough relevant documents. Contextual relevancy is inconsistent → The quality of retrieved documents varies across queries. Improving retrieval quality As previously mentioned, each metric is influenced by specific retrieval hyperparameters. Given that contextual precision is suboptimal and contextual relevancy is inconsistent, it's clear that reranker hyperparameters, along with top-K, chunk size, and chunk overlap, need improvement. Here’s an example of how you might iterate on top-K using a simple for loop. This for loop helps identify the top-K value that produces the best metric scores. You should apply this approach to all hyperparameters that impact relevancy and precision scores in your retrieval system. This will allow you to determine the optimal combination! 
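Since the loop itself is not shown in this text, here is a hedged Python sketch of iterating over top-K with a retriever and metrics like those described above; the index name, field names, candidate top-K values, and the stubbed generation step are all assumptions for illustration.

```python
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer
from deepeval import evaluate
from deepeval.test_case import LLMTestCase
from deepeval.metrics import ContextualPrecisionMetric, ContextualRelevancyMetric

es = Elasticsearch("http://localhost:9200")  # assumed local cluster
model = SentenceTransformer("all-MiniLM-L6-v2")

def retrieve(query: str, top_k: int) -> list[str]:
    """kNN search over the index built earlier (index and field names assumed)."""
    resp = es.search(
        index="knowledge_base",
        knn={
            "field": "embedding",
            "query_vector": model.encode(query).tolist(),
            "k": top_k,
            "num_candidates": 10,
        },
    )
    return [hit["_source"]["text"] for hit in resp["hits"]["hits"]]

def generate_answer(query: str, context: list[str]) -> str:
    # Stand-in for your LLM generation step; DeepEval only needs the final string.
    return "Elasticsearch indexes documents and retrieves them with vector and keyword search."

query = "How does Elasticsearch work?"
expected = "Elasticsearch uses dense vector and sparse vector similarity for semantic search."

# Re-score the retriever for several candidate top-K values.
for top_k in (1, 3, 5, 7):
    context = retrieve(query, top_k)
    test_case = LLMTestCase(
        input=query,
        actual_output=generate_answer(query, context),
        expected_output=expected,
        retrieval_context=context,
    )
    evaluate(
        test_cases=[test_case],
        metrics=[ContextualPrecisionMetric(), ContextualRelevancyMetric()],
    )
```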
Tracking improvements DeepEval is open-source and great if you’re looking to evaluate your retrievers locally. However, if you're looking for a way to conduct deeper analyses and store your evaluation results, Confident AI brings your evaluations to the cloud and enables extensive experimentation with powerful analysis tools. Confident allows you to: Curate and manage your evaluation dataset effortlessly. Run evaluations locally using DeepEval metrics while pulling datasets from Confident AI. View and share testing reports to compare prompts, models, and refine your LLM application. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Measuring retrieval performance Elasticsearch hyperparameters to optimize Setting up RAG with Elastic Retriever Evaluating your RAG retriever Preparing a test case Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to optimize RAG retrieval in Elastisearch with DeepEval - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/rag-retrieval-elasticsearch-deepeval","meta_description":"Learn how to optimize the Elasticsearch retriever in a RAG pipeline using DeepEval."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. AutoOps VC By: Valentin Crettaz On January 2, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. 
Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Released early November on Elastic Cloud Hosted, AutoOps significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. One of the hundreds of analyses AutoOps runs every minute to check your cluster's settings, metrics, and health alerts when long running search queries are plaguing your cluster. Long running search queries can significantly impact performance, leading to high resource consumption. Let's see how this works concretely. How does it work? The beauty of AutoOps for Elastic Cloud Hosted is that there's nothing to do. In all regions where AutoOps is supported , an AutoOps agent is automatically attached to any new or existing deployment, and within minutes, metrics will start shipping, analysis will kick in, and events will be raised as soon as something fishy is detected. There's no need to enable slow logs and set up Filebeat to tail and index them somewhere, it just works out of the box by carefully and regularly monitoring the Task Management API. In order to know if AutoOps is enabled for a given deployment, one can simply head over to his Elastic Cloud console page and click on “Manage” deployment. If the “Open AutoOps ” button appears at the top-right of the screen, then AutoOps is enabled. When opening the Deployment view in AutoOps, we're immediately presented with a quick history of all the recent events. In the screenshot below, we can see that a \"Long running search task\" event was opened recently. Clicking on the event opens up a fly out panel showing the DSL of the slow search query that has been detected along with a whole bunch of information related to the execution context of that query. Understanding long running search tasks The screenshot below shows all the information that AutoOps was able to gather and display in the event fly out panel. We’ll now review each part in more detail. 1. The involved node First, we get a link to the node where the long-running query was detected, i.e. instance-0000000223 . That link allows us to jump directly to the Nodes view where we can find a wealth of metrics and information about that specific node. 2. The involved indices We can also see which indices the query was being run on. In the present case, we can see that the query ran on logs-apache.error-default, logs-nginx.error-default and two more indices. Clicking on those indices will send us to the Shards view which will allow us to see the detailed shards breakdown of those indices on the identified node as well as all the shards of other indices also located on that node. That view will help us detect if there are any hotspots that might be responsible for causing the slow query. 3. Potential reasons for high query latency Digging deeper, we can then see that some basic query analysis took place and AutoOps surfaced a few potential reasons why the query might be slow. In this case, we can see that: the query ran on a 30 days time interval, which might represent a big volume of data there are nested aggregations, which are known to perform poorly the response might potentially contain up to 20'000 aggregation buckets, which might be taxing on node memory There are more detection rules for queries that use regular expressions or scripts. Moreover, new detection rules will be added regularly and also put into perspective with the index mappings. 4. 
The query context Finally, there's some more information to glean about the context of the search query, such as: for how long it has been running, whether it is cancellable or not, all the headers that were attached to the HTTP call. In this case, we can see the trace.id header (which makes it easy to find it in APM), but also X-Opaque-Id that contains an indication of the client that sent this query. Here, we can see that the query originated from a SIEM alerting rule in Kibana, but it could also be a visualization or a dashboard, or even a user running the query in Dev Tools. Also works for ES|QL But wait, there's more! AutoOps doesn't only detect long-running DSL queries, but also ES|QL ones. On the screenshot below, we can see that a slow ES|QL query has been detected by AutoOps. All the same context information is available for ES|QL queries, except that no query analysis is currently done. As a result, AutoOps doesn’t yet provide any insights into how to improve ES|QL queries, but that will be added soon. After detecting long-running search query Since this event is raised when a long-running search query has been detected, there are a few options forward. When inspecting the query, if it looks like a rogue query or a query run from Dev Tools by a careless user, then the task can simply be cancelled if it’s still running. On the other hand, if it looks like a legitimate query and it is not running anymore, the next step should be to investigate the “ reasons for increased latency ” where AutoOps listed a few potential issues that were detected by inspecting the query. This is only done for DSL at this time, ES|QL will be supported in the future. How long is long? By default, AutoOps will raise a \"Long running search task\" event if the search query has been running for more than one minute. This is a default configuration setting that can easily be modified by clicking on the three dots icon at the top-right of the event fly out panel and then choosing “Customize” in order to change the default duration threshold. After clicking on “Customize”, a dialog window pops up and offers the possibility to modify the duration threshold (in minutes) before raising \"Long running search task\" events. If AutoOps is monitoring several clusters, there’s also the opportunity to apply the custom setting only on specific clusters and not all. Wrapping up As we can see, AutoOps helps detect long-running search queries and dig out a wealth of information about them. Make sure to leverage all that information to improve your search queries and relieve your cluster as much as possible from unbearable loads. Also note that the \"Long running search task\" event is just one out of hundreds of other insightful events that AutoOps knows to detect. If your deployment is in one of the supported regions, feel free to head over to your Elastic Cloud account and launch AutoOps to see how it makes cluster management so much simpler. Also stay tuned for future articles on other very helpful events and recommendations. Report an issue Related content AutoOps December 18, 2024 Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. MD By: Musab Dogan AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. 
SF By: Sachin Frayne AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Jump to How does it work? Understanding long running search tasks 1. The involved node 2. The involved indices 3. Potential reasons for high query latency Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Leveraging AutoOps to detect long-running search queries - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/slow-search-elasticsearch-query-autoops","meta_description":"Learn how to detect and investigate Elasticsearch long running search queries using AutoOps to improve your search performance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch hybrid search Learn about hybrid search, the types of hybrid search queries Elasticsearch supports, and how to craft them. Vector Database How To VC By: Valentin Crettaz On February 17, 2025 Part of Series Vector search introduction and implementation Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This article is the last one in a series of three that dives into the intricacies of vector search (aka semantic search) and how it is implemented in Elasticsearch. The first part was focused on providing a general introduction to the basics of embeddings (aka vectors) and how vector search works under the hood. Armed with all the vector search knowledge learned in the first article, the second article guided you through the meanders of how to set up vector search and execute k-NN searches in Elasticsearch. In this third and last part, we will leverage what we have learned in the first two parts and build upon that knowledge by delving into how to craft powerful hybrid search queries in Elasticsearch. The need for hybrid search Before diving into the realm of hybrid search, let’s do a quick refresh of what we learned in the first article of this series regarding how lexical and semantic search differ and how they can complement each other. To sum it up very briefly, lexical search is great when you have control over your structured data and your users are more or less clear on what they are searching for. Semantic search, however, provides great support when you need to make unstructured data searchable and your users don’t really know exactly what they are searching for. 
It would be fantastic if there was a way to combine both in order to squeeze as much substance out of each one as possible. Enter hybrid search! In a way, we can see hybrid search as some sort of “sum” of lexical search and semantic search. However, when done right, hybrid search can be much better than just the sum of those parts, yielding far better results than either lexical or semantic search would do on their own. Running a hybrid search query usually boils down to sending a mix of at least one lexical search query and one semantic search query and then merging the results of both. The lexical search results are scored by a similarity algorithm, such as BM25 or TF-IDF , whose score scale is usually unbounded as the max score depends on the number and frequency of terms stored in the inverted index. In contrast, semantic search results can be scored within a closed interval, depending on the similarity function that is being used (e.g., [0; 2] for cosine similarity). In order to merge the lexical and semantic search results of a hybrid search query, both result sets need to be fused in a way that maintains the relative relevance of the retrieved documents, which is a complex problem to solve. Luckily, there are several existing methods that can be utilized; two very common ones are Convex Combination (CC) and Reciprocal Rank Fusion (RRF). Basically, Convex Combination, also called Linear Combination, seeks to combine the normalized score of lexical search results and semantic search results with respective weights and β \\beta β (where 0 ≤ α , β 0 \\leq \\alpha, \\beta 0 ≤ α , β ), such that: CC can be seen as a weighted average of the lexical and semantic scores Weights between 0 and 1 are used to deboost the related query, while weights greater than 1 are used to boost it. RRF, however, doesn’t require any score calibration or normalization and simply scores the documents according to their rank in the result set, using the following formula, where k is an arbitrary constant meant to adjust the importance of lowly ranked documents: Both CC and RRF have their pros and cons as highlighted in Table 1, below: Table 1: Pros and cons of CC and RRF Convex Combination Reciprocal Rank Fusion Pros Good calibration of weights makes CC more effective than RRF Doesn’t require any calibration, fully unsupervised and there’s no need to know min/max scores Cons Requires a good calibration of the weights and the optimal weights are specific to each data set Not trivial to tune the value of k and the ranking quality can be affected by increasing result set size It is worth noting that not everyone agrees on these pros and cons depending on the assumptions being made and the data sets on which they have been tested. A good summary would be that RRF yields slightly less accurate scores than CC but has the big advantage of being “plug & play” and can be used without having to fine-tune the weights with a labeled set of queries. Elastic decided to support both the CC and RRF approaches. We’ll see how this is carried out later in this article. If you are interested in learning more about the rationale behind that choice, you can read this great article from the Elastic blog and also check out this excellent talk on RRF presented at Haystack 2023 by Elastician Philipp Krenn. 
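The fusion formulas themselves did not survive the page extraction above, so here is a reconstruction based on the surrounding descriptions: α and β are the CC weights applied to the normalized lexical and semantic scores, and k is the RRF constant that dampens the contribution of lowly ranked documents.

```latex
% Convex Combination (CC): weighted sum of the normalized lexical and semantic scores
\mathrm{score}_{\mathrm{CC}}(d) = \alpha \cdot \widetilde{\mathrm{score}}_{\mathrm{lex}}(d)
  + \beta \cdot \widetilde{\mathrm{score}}_{\mathrm{sem}}(d), \qquad \alpha, \beta \ge 0

% Reciprocal Rank Fusion (RRF): purely rank-based, no score normalization required
\mathrm{score}_{\mathrm{RRF}}(d) = \sum_{q \in \text{queries}} \frac{1}{k + \mathrm{rank}_q(d)}
```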
Timeline After enabling brute-force kNN search on dense vectors in 7.0 back in 2019, Elasticsearch started supporting approximate nearest neighbors (ANN) search in February 2022 with the 8.0 release and hybrid search support came right behind with the 8.4 release in August 2022. Figure 1, below, shows the Elasticsearch timeline for bringing hybrid search to market: Figure 1: Hybrid search timeline for Elasticsearch The anatomy of hybrid search in Elasticsearch As we’ve briefly hinted at in our previous article , vector search support in Elasticsearch has been made possible by leveraging dense vector models (hence the dense_vector field type), which produce vectors that usually contain essentially non-zero values and represent the meaning of unstructured data in a multi-dimensional space. However, dense models are not the only way of performing semantic search. Elasticsearch also provides an alternative way that uses sparse vector models. Elastic created a sparse NLP vector model called Elastic Learned Sparse EncodeR , or ELSER for short, which is an out-of-domain (i.e., not trained on a specific domain) sparse vector model that does not require any fine-tuning. It was pre-trained on a vocabulary of approximately 30,000 terms , and as it’s a sparse model most of the vector values (i.e., more than 99.9%) are zeros. The way it works is pretty simple. At indexing time, the sparse vectors containing term/weight pairs are generated using the inference ingest processor and stored in fields of type sparse_vector , which is the sparse counterpart to the dense_vector field type. At query time, a specific DSL query also called sparse_vector replaces the original query terms with terms available in the ELSER model vocabulary that are known to be the most similar to them given their weights. Sparse or dense? Before heading over to hybrid search queries, we would like to briefly highlight the differences between sparse and dense models . Figure 2, below, shows how the piece of text “the quick brown fox” is encoded by each model. In the sparse case, the four original terms are expanded into 30 weighted terms that are closely or distantly related to them. The higher the weight of the expanded term, the more related it is to the original term. Since the ELSER vocabulary contains more than 30,000 terms, it means that the vector representing “the quick brown fox” has as many dimensions and contains only ~0.1% of the non-zero values (i.e., ~30 / 30,000), hence why we call these models sparse. Figure 2: Comparing sparse and dense model encoding In the dense case, “the quick brown fox” is encoded into a much smaller embeddings vector that captures the semantic meaning of the text. Each of the 384 vector elements contains a non-zero value that represents the similarity between the piece of text and each of the dimensions. Note that the names we have given to dimensions (i.e., is_mouse , is_brown , etc.) are purely fictional, and their purpose is just to give a concrete description of the values. Another important difference is that sparse vectors are queried via the inverted index (yes, like lexical search), whereas as we have seen in previous articles, dense vectors are indexed in specific graph-based or cluster-based data structures that can be searched using approximate nearest neighbors (ANN) algorithms . 
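As a rough illustration of the indexing and querying flow just described, the sketch below creates a `sparse_vector` field and queries it with the `sparse_vector` DSL query. It assumes an ELSER inference endpoint already exists under the placeholder id `my-elser-endpoint`, that the term/weight pairs are produced elsewhere (for example by an inference ingest processor or a `semantic_text` field), and that you are on a version that supports the `sparse_vector` query.

```python
# Sketch, not the original post's code: sparse_vector mapping and query.
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")

es.indices.create(
    index="docs-sparse",
    mappings={"properties": {
        "content": {"type": "text"},
        "content_embedding": {"type": "sparse_vector"},  # term/weight pairs produced by ELSER
    }},
)

# Query time: the sparse_vector query expands the query text through the same
# ELSER endpoint and matches the expansion against the stored term/weight pairs.
resp = es.search(
    index="docs-sparse",
    query={"sparse_vector": {
        "field": "content_embedding",
        "inference_id": "my-elser-endpoint",  # placeholder endpoint id
        "query": "the quick brown fox",
    }},
)
```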
We won’t go any further into the details of how ELSER came to be, but if you’re interested in understanding how that model was born, we recommend you check out this article from the Elastic Search Labs, which explains in detail the thought process that led Elastic to develop it. If you are thinking about evaluating ELSER, it might be worth checking Elastic’s relevance workbench , which demonstrates how ELSER compares to a normal BM25 lexical search. We are also not going to dive into the process of downloading and deploying the ELSER model in this article, but you can take a moment and turn to the official documentation that explains very well how to do it. Hybrid search support Whether you are going to use dense or sparse retrieval, Elastic provides hybrid search support for both model types. The first type is a mix of a lexical search query specified in the query search option and a vector search query (or an array thereof) specified in the knn search option. The second one introduces a new search option called retriever (introduced in 8.14 and GA in 8.16) which also contains an array of search queries that can be of lexical (e.g., match ) or semantic (e.g., sparse_vector ) nature. If all this feels somewhat abstract to you, don’t worry, as we will shortly dive into the details to show how hybrid searches work in practice and what benefits they provide. In this article, we are going to focus on the second option using retrievers. Hybrid search with dense models Hybrid search using retrievers basically boils down to running a lexical search query mixed with an approximate k-NN search in order to improve relevance. Such a hybrid search query is shown below: As we can see above, a hybrid search query simply leverages the rrf retriever that combines a lexical search query (e.g., a match query) made with a standard retriever and a vector search query specified in the knn retriever. What this query does is first retrieve the top five vector matches at the global level, then combine them with the lexical matches, and finally return the ten best matching hits. The rrf retriever uses RRF ranking in order to combine vector and lexical matches. The RRF ranking formula can be further parametrized by two variables called rank_constant and rank_window_size (defaults to size ) that can be specified in the rrf retriever section, as shown below: This query runs the same way as the previous one, except that `rank_window_size` documents (instead of only 10) are retrieved from the vector and lexical queries and then ranked by RRF. Finally, the top documents ranked from 1 to `size` (e.g., 10) are then returned in the result set. The last thing to note about this hybrid query type is that RRF ranking requires a commercial license (Platinum or Enterprise), but if you don’t have one, you can still leverage hybrid searches with CC scoring or by using a trial license that allows you to enjoy the full feature set for one month. Hybrid search with sparse models The second hybrid search type for querying sparse models works exactly the same way as for dense vectors.. Below, we can see what such a hybrid query looks like: In the above query, we can see that the retrievers array contains one lexical match query as well as one semantic sparse_vector query that works on the ELSER sparse model that we introduced earlier. Hybrid search with dense and sparse models So far, we have seen two different ways of running a hybrid search, depending on whether a dense or sparse vector space was being searched. 
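To make the two variants above concrete, here is a sketch of what each retriever-based hybrid query could look like from the Python client. Index names, field names, the embedding model, and the ELSER endpoint id are illustrative; it also assumes a client version that exposes the `retriever` search option (8.14+), otherwise the same object can be sent in the request body.

```python
# Sketch of the two hybrid variants described above, using the retriever search option.
from elasticsearch import Elasticsearch
from sentence_transformers import SentenceTransformer

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")
query_vector = SentenceTransformer("all-MiniLM-L6-v2").encode("the quick brown fox").tolist()

# 1) Hybrid search with a dense model: lexical match + kNN, fused with RRF
dense_hybrid = es.search(index="articles", size=10, retriever={
    "rrf": {
        "retrievers": [
            {"standard": {"query": {"match": {"title": "the quick brown fox"}}}},
            {"knn": {"field": "title_vector", "query_vector": query_vector,
                     "k": 5, "num_candidates": 50}},
        ],
        "rank_constant": 60,        # optional RRF parameters
        "rank_window_size": 100,
    }
})

# 2) Hybrid search with a sparse model: lexical match + ELSER sparse_vector query
sparse_hybrid = es.search(index="articles", size=10, retriever={
    "rrf": {
        "retrievers": [
            {"standard": {"query": {"match": {"title": "the quick brown fox"}}}},
            {"standard": {"query": {"sparse_vector": {
                "field": "title_expansion",
                "inference_id": "my-elser-endpoint",  # placeholder endpoint id
                "query": "the quick brown fox",
            }}}},
        ]
    }
})
```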
At this point, you might wonder whether we can mix both dense and sparse data inside the same index, and you’ll be pleased to learn that it is indeed possible. One concrete application could be that you need to search both a dense vector space with images and a sparse vector space with textual descriptions of those images. Such a query would look like this where we combine a standard retriever with a knn one: In the above payload, we can see a sparse_vector query searching for image descriptions within the ELSER sparse vector space, and in the knn retriever a vector search query searching for image embeddings (e.g., “brown fox” represented as an embedding vector) in a dense vector space. In addition, we leveraged RRF by using the rrf retriever. You can even add another lexical search query to the mix using another standard retriever, and it would look like this: The above payload highlights that we can leverage every possible way to specify a hybrid query containing a lexical search query, a vector search query, and a semantic search query. Limitations The main limitation to be aware of when evaluating the ELSER sparse model is that it only supports up to 512 tokens when running text inference. So, if your data contains longer text excerpts that you need to be fully searchable, you are left with two options: a) use another model that supports longer text, b) split your text into smaller segments, or 3) if you are on 8.15 or above, you can leverage the semantic_text field type, which handles automatic chunking. Optimizations It is undeniable that vectors, whether sparse or dense, can get quite long, from a few dozen to a few thousand dimensions depending on the inference model that you’re using. Also, whether you’re running a text inference on a small sentence containing just a few words or a large body of text, the generated embeddings vector representing the meaning will always have as many dimensions as configured in the model you’re using. As a result, these vectors can take quite some space in your documents and, hence, on your disk. The most obvious optimization to cater to this issue is to configure your index mapping to remove the vector fields (i.e., both dense_vector and sparse_vector ) from your source documents. By doing so, the vector values would still be indexed and searchable, but they would not be part of your source documents anymore, thus reducing their size substantially. It’s pretty simple to achieve this by configuring your mapping to exclude the vector fields from the _source , as shown in the code below: In order to show you some concrete numbers, we have run a quick experiment. We have loaded an index with the msmarco-passagetest2019-top1000 data set, which is a subset of the Microsoft MARCO Passage Ranking full data set. The 60 MB TSV file contains 182,469 text passages. Next, we have created another index containing the raw text and the embeddings vectors (dense) generated from the msmarco-MiniLM-L-12-v3 sentence-transformer model available from Hugging Face. We’ve then repeated the same experiment, but this time configuring the mapping to exclude the dense vector from the source documents. We’ve also run the same test with the ELSER sparse model, one time by storing the sparse_vector field inside the documents and one time by excluding them. Table 2, below, shows the size of each resulting index, whose names are self-explanatory. 
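The mapping-level exclusion mentioned above (its code sample is not reproduced in this text) could look roughly like the following; field names are illustrative, and the vectors remain indexed and searchable even though they are no longer stored in `_source`.

```python
# Rough sketch of excluding vector fields from _source, as discussed above.
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")

es.indices.create(
    index="passages",
    mappings={
        "_source": {"excludes": ["text_embedding", "text_expansion"]},
        "properties": {
            "text": {"type": "text"},
            "text_embedding": {"type": "dense_vector", "dims": 384},
            "text_expansion": {"type": "sparse_vector"},
        },
    },
)
```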
We can see that by excluding dense vector fields from the source, the index size is divided by 3 and by almost 3.5 in the rank feature case. Index Size (in MB) index-with-dense-vector-in-source 376 index-without-dense-vector-in-source 119 index-with-sparse_vector-in-source 1,300 index-without-sparse_vector-in-source 387 Admittedly, your mileage may vary, these figures are only indicative and will heavily depend on the nature and size of the unstructured data you will be indexing, as well as the dense or sparse models you are going to choose. A last note of caution worth mentioning concerning this optimization is that if you decide to exclude your vectors from the source, you will not be able to use your index as a source index to be reindexed into another one because your embedding vectors will not be available anymore. However, since the index still contains the raw text data, you can use the original ingest pipeline featuring the inference processor to regenerate the embeddings vectors. Let’s conclude In this final article of our series on vector search, we have presented the different types of hybrid search queries supported by Elasticsearch. One option is to use a combination of lexical search (e.g., query ) and vector search (e.g., knn ); the other is to leverage the newly introduced retriever search option with sparse_vector queries. We first did a quick recap of the many advantages of being able to fuse lexical and semantic search results in order to increase accuracy. Along the way, we reviewed two different methods of fusing lexical and semantic search results, namely Convex Combination (CC) and Reciprocal Rank Fusion (RRF), and looked at their respective pros and cons. Then, using some illustrative examples, we showed how Elasticsearch provides hybrid search support for sparse and dense vector spaces alike, using both Convex Combination and Reciprocal Rank Fusion as scoring and ranking methods. We also briefly introduced the Elastic Learned Sparse EncodeR model (ELSER), which is their first attempt at providing an out-of-domain sparse model built on a 30,000 tokens vocabulary. Finally, we concluded by pointing out one limitation of the ELSER model, and we also explained a few ways to optimize your future hybrid search implementations. If you like what you’re reading, make sure to check out the other parts of this series: Part 1: A Quick Introduction to Vector Search Part 2: How to Set Up Vector Search in Elasticsearch Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. 
AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to The need for hybrid search Timeline The anatomy of hybrid search in Elasticsearch Sparse or dense? Hybrid search support Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch hybrid search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/hybrid-search-elasticsearch","meta_description":"Learn about hybrid search, the types of hybrid search queries Elasticsearch supports, and how to craft them."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs. Vector Database Generative AI Integrations Elastic Cloud Hosted YG HM By: Yuvraj Gupta and Hemant Malik On May 21, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures In November, Microsoft announced their custom in-house Arm architecture-based Azure Cobalt CPUs aimed to improve performance and efficiency. As part of our strategic partnership with Microsoft, we are constantly looking for ways to deliver the fruits of innovation to our customers using Elasticsearch on Microsoft Cloud. “At Elastic, we love working with the Microsoft teams, from silicon to models,” said Shay Banon, co-founder and chief technology officer at Elastic . “The rate of progress on the Azure team is impressive, and we are excited to collaborate with them to bring these benefits to our users as fast as possible.” Performance gains for Elastic users on Azure Arm-based VMs Elastic used our macro benchmarking framework for Elasticsearch Rally with the elastic/logs track to determine the maximum indexing performance on the preview of Azure Cobalt-powered virtual machines. Our benchmarks observed up to 37% higher indexing throughput performance when using the Epsv6 VMs compared to prior generation of Arm-based VMs (Epsv5 series) on Azure. 
“With the introduction of the new Cobalt 100 Arm-based VMs, we aim to deliver Elastic users on Azure up to 37% greater performance compared to the previous generation,” said Paul Nash, Corporate Vice President, product, Azure Infrastructure Platform at Microsoft Corp. “Continual investments like these to deliver better and better performance represent our commitment to provide the best infrastructure powering Elastic Cloud on Azure.” AI innovations in Elastic Cloud on Azure As Microsoft Azure innovates to delivery purpose-built infrastructure for AI using Cobalt 100 Arm-based VMs, we look forward to delivering these performance and efficiency gains to our users through Elastic Cloud on Microsoft Azure. This will empower our users to harness Arm computing innovations_ _as they build their GenAI applications using the Elastic Search AI Platform . To learn more about the new Cobalt 100 Arm-based Azure virtual machines, refer to the Microsoft blog . Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Jump to Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Performance gains for Elastic users on Azure Arm-based VMs AI innovations in Elastic Cloud on Azure Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/azure-arm-elasticsearch-performance","meta_description":"Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. Search Relevance ML Research AL By: Andre Luiz On April 3, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Filters and facets are mechanisms used to refine search results, helping users find relevant content or products more quickly. In the classical approach, rules are manually defined. For example, in a movie catalog, attributes such as genre are pre-defined for use in filters and facets. On the other hand, with AI models, new attributes can be automatically extracted from the characteristics of movies, making the process more dynamic and personalized. In this blog, we explore the pros and cons of each method, highlighting their applications and challenges. Filters and facets Before we begin, let's define what filters and facets are. Filters are predefined attributes used to restrict a set of results. In a marketplace, for example, filters are available even before a search is performed. The user can select a category, such as \"Video games\" , before searching for \"PS5\" , refining the search to a more specific subset instead of the entire database. This significantly increases the chances of obtaining more relevant results. Facets work similarly to filters but are only available after the search is performed. In other words, the search returns results, and based on them, a new list of refinement options is generated. For example, when searching for a PS5 console, facets such as storage capacity , shipping cost , and color may be displayed to help users choose the ideal product. Now that we have defined filters and facets, let's discuss the impact of the classical and Machine Learning (ML)-based approaches on their implementation and usage. Each method has advantages and challenges that influence search efficiency. Classical approach In this approach, filters and facets are manually defined based on predefined rules. This means that the attributes available for refining the search are fixed and planned in advance, considering the catalog structure and user needs. For example, in a marketplace, categories such as \"Electronics\" or \"Fashion\" may have specific filters like brand, format and price range. These rules are created statically, ensuring consistency in the search experience but requiring manual adjustments whenever new products or categories emerge. Although this approach provides predictability and control over the displayed filters and facets, it can be limited when new trends arise that demand dynamic refinement. Pros: Predictability and control: Since filters and facets are manually defined, management becomes easier. Low complexity: No need to train models. Ease of maintenance: As rules are predefined, adjustments and corrections can be made quickly. 
Cons : Reindexing required for new filters: Whenever a new attribute needs to be used as a filter, the entire dataset must be reindexed to ensure that documents contain this information. Lack of dynamic adaptation: Filters are static and do not automatically adjust to changes in user behavior. Implementation of filters/facets – Classical approach In Dev Tools, Kibana , we will create a demonstration of filters/facets using the classical approach . First, we define the mapping to structure the index: The brand and storage fields are set as keyword , allowing them to be used directly in aggregations ( facets ). The price field is of type float , enabling the creation of price ranges . In the next step, the product data will be indexed: Now, let's retrieve classic facets by grouping the results by brand, storage, and price range. In the query, size:0 was defined. In this scenario, the goal is to retrieve only the aggregation results without including the documents corresponding to the query. The response will include counts for Brand , Storage , and Price , helping to create filters and facets. Machine learning/AI-based approach In this approach, Machine Learning (ML) models, including Artificial Intelligence (AI) techniques, analyze data attributes to generate relevant filters and facets. Instead of relying on predefined rules, ML/AI leverages indexed data characteristics. This enables the dynamic discovery of new facets and filters. Pros : Automatic updates: New filters and facets are generated automatically, without the need for manual adjustments. Discovery of new attributes: It can identify previously unconsidered data characteristics as filters, enriching the search experience. Reduced manual effort: The team does not need to constantly define and update filtering rules as AI learns from available data. Cons: Maintenance complexity: The use of models may require pre-validation to ensure the consistency of the generated filters. Requires ML and AI expertise: The solution demands qualified professionals to fine-tune and monitor model performance. Risk of irrelevant filters: If the model is not well-calibrated, it may generate facets that are not useful for users. Cost: The use of ML and AI may require third-party services, increasing operational costs. It's worth noting that even with a well-calibrated model and a well-crafted prompt, the generated facets should still go through a review step. This validation can be manual or based on moderation rules, ensuring that the content is appropriate and safe. While not necessarily a drawback, it is an important consideration to ensure the quality and suitability of the facets before they are made available to users. Implementation of filters/facets – AI approach In this demonstration, we will use an AI model to automatically analyze product characteristics and suggest relevant attributes. With a well-structured prompt, we extract information from the catalog and transform it into filters and facets. Below, we present each step of the process. Initially, we will use the Inference API to register an endpoint for integration with an ML service. Below is an example of integration with OpenAI's service . Now, we define the pipeline to execute the prompt and obtain the new filters generated by the model. Running a simulation of this pipeline for the \"PlayStation 5\" product, with the following description: Stunning Gaming: Marvel at stunning graphics and experience the features of the new PS5. 
Breathtaking Immersion: Discover a deeper gaming experience with support for haptic feedback, adaptive triggers, and 3D Audio technology. Slim Design: With the PS5 Digital Edition, gamers get powerful gaming technology in a sleek, compact design. 1TB of Storage: Have your favorite games ready and waiting for you to play with 1TB of built-in SSD storage. Backward Compatibility and Game Boost: The PS5 console can play over 4,000 PS4 games. With Game Boost, you can even enjoy faster, smoother frame rates in some of the best PS4 console games. Let's observe the prompt output generated from this simulation. Now a new field, dynamic_facets , will be added to the new index to store the facets generated by the AI. Using the Reindex API , we will reindex the videogames index to videogames_1 , applying the generate_filter_ai pipeline during the process. This pipeline will automatically generate dynamic facets during indexing. Now, we will run a search and get the new filters: Results: To symbolize the implementation of the facets, below is a simple front-end: The UI code presented is here . Conclusion Both approaches to creating filters and facets have their benefits and points of concern. The classic approach, based on manual rules, offers control and lower costs but requires constant updates and does not dynamically adapt to new products or features. On the other hand, the AI ​​and Machine Learning-based approach automates facet extraction, making the search more flexible and allowing the discovery of new attributes without manual intervention. However, this approach can be more complex to implement and maintain, requiring calibration to ensure consistent results. The choice between the classic and AI-based approaches depends on the needs and complexity of the business. For simpler scenarios, where data attributes are stable and predictable, the classic approach can be more efficient and easier to maintain, avoiding unnecessary costs with infrastructure and AI models. On the other hand, the use of ML/AI to extract facets can add significant value, improving the search experience and making filtering more intelligent. The important thing is to evaluate whether automation justifies the investment or whether a more traditional solution already meets the business needs effectively. Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. 
VB By: Vincent Bosc Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Filters and facets Classical approach Implementation of filters/facets – Classical approach Machine learning/AI-based approach Implementation of filters/facets – AI approach Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Generating filters and facets using ML - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/filters-facets-using-ml","meta_description":"Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. Integrations Ingestion How To JR TM By: Jeffrey Rengifo and Tomás Murúa On February 18, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we'll cover the CLIP multimodal model, explore alternatives, and analyze their pros and cons through a practical example of a mock real estate website that allows users to search for properties using pictures as references. What is CLIP? CLIP (Contrastive Language–Image Pre-training) is a neural network created by OpenAI, trained with pairs of images and texts to solve tasks of finding similarities between text and images and categorize \"zero-shot\" images so the model was not trained with fixed tags but instead, we provide unknown classes for the model so it can classify the image we provide to it. CLIP has been the state-of-the-art model for a while and you can read more articles about it here: Implementing image search How to implement image similarity search However, over time more alternatives have come up. In this article, we'll go through the pros and cons of two alternatives to CLIP using a real estate example. Here’s a summary of the steps we’ll follow in this article: Basic configuration: CLIP and Elasticsearch For our example, we will create a small project with an interactive UI using Python. We will install some dependencies, like the Python transformers, which will grant us access to some of the models we'll use. Create a folder /clip_comparison and follow the installation instructions located here . 
Once you're done, install the Elasticsearch's Python client , the Cohere SDK and Streamlit : NOTE: As an option, I recommend using a Python virtual environment (venv) . This is useful if you don't want to install all of the dependencies on your machine. Streamlit is an open-source Python framework that allows you to easily get a UI with little code. We'll also create some files to save the instructions we'll use later: app.py : UI logic. /services/elasticsearch.py : Elasticsearch client initialization, queries, and bulk API call to index documents. /services/models.py : Model instances and methods to generate embeddings. index_data.py : Script to index images from a local source. /data : our dataset directory. Our App structure should look something like this: Configuring Elasticsearch Follow the next steps to store the images for the example. We'll then search for them using knn vector queries . Note: We could also store text documents but for this example, we will only search in the images. Index Mappings Access Kibana Dev Tools (from Kibana: Management > Dev Tools) to build the data structure using these mappings: The field type dense_vector will store the embeddings generated by the models. The field binary will store the images as base64. Note: It's not a good practice to store images in Elasticsearch as binary. We're only doing it for the practical purpose of this example. The recommendation is to use a static files repository. Now to the code. The first thing we need to do is initialize the Elasticsearch client using the cloud id and api-key . Write the following code at the beginning of the file /services/elasticsearch.py : Configuring models To configure the models, put the model instances and their methods in this file: /services/models.py . The Cohere Embed-3 model works as a web service so we need an API key to use it. You can get a free one here . The trial is limited to 5 calls per minute, and 1,000 calls per month. To configure the model and make the images searchable in Elasticsearch, follow these steps: Convert images to vectors using CLIP Store the image vectors in Elasticsearch Vectorize the image or text we want to compare to the stored images. Run a query to compare the entry of the previous step to the stored images and get the most similar ones. Configuring CLIP To configure CLIP, we need to add to the file models.py, the methods to generate the image and text embeddings. For all the models, you need to declare similar methods: one to generate embeddings from an image (clip_image_embeddings) and another one to generate embeddings from text (clip_text_embeddings). The outputs.detach().cpu().numpy().flatten().tolist() chain is a common operation to convert pytorch tensors into a more usable format: .detach(): Removes the tensor from the computation graph as we no longer need to compute gradients . .cpu(): Moves tensors from GPU to CPU as numpy only supports CPU . .numpy(): Converts tensor to numPy array. .flatten() : Converts into a 1D array. .toList() : Converts into Python List. This operation will convert a multidimensional tensor into a plain list of numbers that can be used for embeddings operations. Let's now take a look at some CLIP alternatives. Alternative 1: JinaCLIP JinaCLIP is a CLIP variant developed by Jina AI, designed specifically to improve the search of images and text in multimodal applications. It optimizes CLIP performance by adding more flexibility in the representation of images and text. 
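For reference, a minimal sketch of the `clip_image_embeddings` and `clip_text_embeddings` helpers described earlier might look like this; the checkpoint name is the public OpenAI ViT-B/16 release, and the exact helper signatures in the original project may differ.

```python
# services/models.py — sketch of the CLIP helpers referenced above.
# Assumes the transformers and Pillow packages are installed.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip_model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

def clip_image_embeddings(image: Image.Image) -> list[float]:
    inputs = clip_processor(images=image, return_tensors="pt")
    outputs = clip_model.get_image_features(**inputs)
    # Detach from the graph, move to CPU and flatten to a plain list of floats
    return outputs.detach().cpu().numpy().flatten().tolist()

def clip_text_embeddings(text: str) -> list[float]:
    inputs = clip_processor(text=[text], return_tensors="pt", padding=True, truncation=True)
    outputs = clip_model.get_text_features(**inputs)
    return outputs.detach().cpu().numpy().flatten().tolist()
```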
Compared to the original OpenAI CLIP model, JinaCLIP performs better in text-to-text, text-to-image, image-to-text, and image-to-image tasks as we can see in the chart below: Model Text-Text Text-to-Image Image-to-Text Image-Image jina-clip-v1 0.429 0.899 0.803 0.916 openai-clip-vit-b16 0.162 0.881 0.756 0.816 %increase vs OpenAI CLIP 165% 2% 6% 12% The capacity to improve precision in different types of queries makes it a great tool for tasks that require a more precise and detailed analysis. You can read more about JinaCLIP here . To use JinaCLIP in our app and generate embeddings, we need to declare these methods: Alternative 2: Cohere Image Embeddings V3 Cohere has developed an image embedding model called Embed-3, which is a popular alternative to CLIP. The main difference is that Cohere focused on the representation of enterprise data like charts, product images, and design files. Embed-3 uses an advanced architecture that reduces the bias risk towards text data, which is currently a disadvantage in other multimodal models like CLIP, so it can provide more precise results between text and image. You can see below a chart by Cohere showing the improved results using Embed 3 versus CLIP in this kind of data: For more info, go to Embed3. Just like we did with the previous models, let's declare the methods to use Embed 3: With the functions ready, let's index the dataset in Elasticsearch by adding the following code to the file index_data.py : Index the documents using the command: The response will show us the amount of elements indexed by index: Once the dataset has been indexed, we can create the UI. Test UI Creating the UI We are going to use Streamlit to build a UI and compare the three alternatives side-by-side. To build the UI, we'll start by adding the imports and dependencies to the file app.py : For this example, we'll use two views; one for the image search and another one to see the image dataset: Let's add the view code for Search Image: And now, the code for the Images view: We'll run the app with the command: Thanks to multimodality, we can run searches in our image database based on text (text-to-image similarity) or image (image-to-image similarity). Searching with the UI To compare the three models, we'll use a scenario in which a real estate webpage wants to improve its search experience by allowing users to search using image or text. We'll discuss the results provided by each model. We'll upload the image of a \"rustic home\": Here we have the search results. As you can see, based on the image we uploaded, each model generated different results: In addition, you can see results based on the text to find the house features: If searching for “modern”, the three models will show good results. But, JinaCLIP and Cohere will be showing the same houses in the first positions. Features Comparison Below you have a summary of the main features and prices of the three options we covered in this article: Model Created by Estimated Price Features CLIP OpenAI $0.00058 per run in Replicate (https://replicate.com/krthr/clip-embeddings) General multimodal model for text and image; suitable for a variety of applications with no specific training. JinaCLIP Jina AI $0.018 per 1M tokens in Jina (https://jina.ai/embeddings/) CLIP variant optimized for multimodal applications. Improved precision retrieving texts and images. Embed-3 Cohere $0.10 per 1M tokens, $0.0001 per data and images at Cohere (https://cohere.com/pricing) Focuses on enterprise data. 
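A rough sketch of the two Streamlit views just mentioned is shown below. The index names, the stored base64 `image` field, cluster credentials, and the `embed_query()` helper are placeholders standing in for the code in `services/models.py` and `services/elasticsearch.py`.

```python
# app.py — sketch of the "Search Image" and "Images" views described above.
import base64

import streamlit as st
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")
INDICES = {"CLIP": "images-clip", "JinaCLIP": "images-jina", "Embed-3": "images-cohere"}

def knn_search(index: str, vector: list[float], k: int = 5):
    resp = es.search(index=index, knn={"field": "embedding", "query_vector": vector,
                                       "k": k, "num_candidates": 50})
    return resp["hits"]["hits"]

view = st.sidebar.radio("View", ["Search Image", "Images"])

if view == "Search Image":
    query_text = st.text_input("Describe the property (e.g. 'modern house')")
    uploaded = st.file_uploader("...or upload a reference image", type=["jpg", "png"])
    if st.button("Search"):
        cols = st.columns(len(INDICES))
        for col, (model_name, index) in zip(cols, INDICES.items()):
            # embed_query() is a placeholder for the per-model text/image embedding helpers
            hits = knn_search(index, embed_query(model_name, query_text, uploaded))
            col.subheader(model_name)
            for hit in hits:
                col.image(base64.b64decode(hit["_source"]["image"]), width=150)
else:
    st.header("Image dataset")
    for hit in es.search(index=INDICES["CLIP"], query={"match_all": {}}, size=30)["hits"]["hits"]:
        st.image(base64.b64decode(hit["_source"]["image"]), width=150)
```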
Improved retrieval of complex visual data like graphs and charts. If you will search on long image descriptions, or want to do text-to-text as well as image-to-text, you should discard CLIP, because both JinaCLIP and Embed-3 are optimized for this use case. Then, JinaCLIP is a general-use model, while Cohere’s one is more focused on enterprise data like products, or charts. When testing the models on your data, make sure you cover: All modalities you are interested in: text-to-image, image-to-text, text-to-text Long and short image descriptions Similar concept matches (different images of the same type of object) Negatives Hard negative: Similar to the expected output but still wrong Easy negative: Not similar to the expected output and wrong Challenging scenarios: Different angles/perspectives Various lighting conditions Abstract concepts (\"modern\", \"cozy\", \"luxurious\") Domain-specific cases: Technical diagrams or charts (especially for Embed-3) Product variations (color, size, style) Conclusion Though CLIP is the preferred model when doing image similarity search, there are both commercial and non-commercial alternatives that can perform better in some scenarios. JinaCLIP is a robust all-in-one tool that claims to be more precise than CLIP in text-to-text embeddings. Embed-3 follows Cohere's line of catering to business clients by training models with real data using typical business docs. In our small experiment, we could see that both JinaClip and Cohere show interesting image-to-image and text-to-image results and perform very similarly to CLIP with these kinds of tasks. Elasticsearch allows you to search for embeddings, combining vector search with full-text-search, enabling you to search both for images and for the text in them. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to What is CLIP? Basic configuration: CLIP and Elasticsearch Configuring Elasticsearch Index Mappings Configuring models Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Exploring CLIP alternatives - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/openai-clip-alternatives","meta_description":"Learn about OpenAI’s CLIP, its configuration, and alternatives for image-to-image and text-to-image search like JinaCLIP & Cohere Image Embeddings V3."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch vector database for native grounding in Google Cloud’s Vertex AI Platform Elasticsearch is now publicly available as the first third-party native grounding engine for Google Cloud’s Vertex AI platform and Google’s Gemini models. It enables joint users to build fully customizable GenAI experiences grounded in enterprise data, powered by the best-of-breed Search AI capabilities from Elasticsearch. Integrations VA By: Valerio Arvizzigno On April 9, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elastic is thrilled to announce that the Elasticsearch vector database is now integrated into Google Cloud’s Vertex AI platform as a natively supported information retrieval engine, empowering users to leverage the multimodal strengths of Google’s Gemini models with the advanced AI-powered semantic and hybrid search capabilities of Elasticsearch. Developers can now create their RAG applications within a unified journey, grounding their chat experiences on their private data in a low-code, flexible way. Whether you’re building AI agents for your customers and internal employees or leveraging LLMs generation within your software, the Vertex AI platform puts Elasticsearch relevance at your fingertip with minimal configuration. This integration allows easier and faster adoption for Gemini models in production use cases, driving GenAI from PoCs to real-life scenarios. In this blog, we will walk you through integrating Elasticsearch with Google Cloud’s Vertex AI platform for seamless data grounding and building fully customizable GenAI applications. Let’s discover how. Google Cloud’s Vertex AI and Gemini models grounded on your data with Elasticsearch Users leveraging Vertex AI services and tools for creating GenAI applications can now access the new “Grounding” option to bring their private data into their conversational interaction automatically. Elasticsearch is now part of this feature and could be used via both: Vertex AI LLM APIs , which directly enrich Google’s Gemini models at generation time (preferred); Grounded Generation API , used instead in the Vertex AI Agent Builder ecosystem to build agentic experiences. With this integration, Elasticsearch – the most downloaded and deployed vector database – will bring your relevant enterprise data wherever it’s needed in your internal end customer-facing chats, which is crucial for the real-life adoption of GenAI into business processes. 
The aforementioned APIs will allow developers to adopt this new partnered feature in their code. However, prompt engineering and testing remain crucial steps in the application development and serve as an initial discovery playground. To support this, Elasticsearch is designed for easy evaluation by users within the Vertex AI Studio console tool. All it takes is a few simple steps to configure the Elastic endpoints with the desired parameters (index to be searched, the number of documents to be retrieved, and the desired search template) within the “Customize Grounding” tab in the UI, as shown below (note that for it to work, you have to type in the API key with the word \"ApiKey\" in the UI and code examples below). Now you’re ready to generate with your private knowledge! Production-ready GenAI applications with ease Elastic and Google Cloud work to provide developer-first, comprehensive, and enjoyable experiences. Connecting to Elastic natively in both LLM and Grounding Generation API reduces complexity and overhead while building GAI applications on Vertex AI, avoiding unnecessary additional APIs and data orchestration while grounding in just one unified call. Let’s see how it works in both scenarios. The first example is executed with the LLM API: In the above example, with the retrieval field of the API requesting content generation to Gemini 2.0 Flash, we can contextually set a retrieval engine for the request. Setting api_spec to “ELASTIC_SEARCH” enables the usage of additional configuration parameters such as the API Key and the cluster endpoint (needed to route a request to your Elastic cluster), the index to retrieve data from, and the Search template to be used for your search logic. Similarly, the same outcome could be achieved with the Grounding Generation API, setting the groundingSpec parameter: With both approaches, the response will provide an answer with the most relevant private documents found in Elasticsearch – and the related connected data sources – to support your query. Simplicity, however, should not be confused with a lack of personalization to fulfill your specific needs and use cases. With this in mind, we designed it to allow you to perfectly adapt the search configuration to your scenario. Fully customizable search at your fingertips: search templates To provide maximum customization to your search scenario, we’ve built, in collaboration with Google Cloud,the experience on top of our well-known Search Templates . Elasticsearch search templates are an excellent tool for creating dynamic, reusable, and maintainable search queries. They allow you to predefine and reuse query structures. They are particularly useful when executing similar queries with different parameters, as they save development time and reduce the chance of errors. Templates can include placeholders for variables, making the queries dynamic and adaptable to different search requirements. While using Vertex AI APIs and Elasticsearch for grounding, you must reference a desired search template – as shown in the code snippets above – where the search logic is implemented and pushed down to Elasticsearch. Elastic power users can asynchronously manage, configure, and update the search approaches and tailor them to the specific indices, models, and data in a fully transparent way for Vertex AI users, web-app developers, or AI engineers, who only need to specify the name of the template in the grounding API. 
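To make this concrete, here is a hedged sketch of how such a reusable template could be registered with the Elasticsearch Python client. The template id, the field names, and the assumption that query_vector_builder can reference the Vertex AI inference endpoint are illustrative; this is not the exact template shipped with the integration:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-cluster>:443", api_key="<api-key>")

# Register a reusable mustache search template that the grounding API can reference by name.
es.put_script(
    id="google-template-knn",  # assumed template id
    script={
        "lang": "mustache",
        "source": """
        {
          "_source": { "excludes": ["title_embedding", "description_embedding", "image_url"] },
          "size": {{num_hits}},
          "knn": [
            {
              "field": "title_embedding",
              "k": {{num_hits}},
              "num_candidates": 100,
              "query_vector_builder": {
                "text_embedding": { "model_id": "googlevertexai_embeddings_004", "model_text": "{{query}}" }
              }
            },
            {
              "field": "description_embedding",
              "k": {{num_hits}},
              "num_candidates": 100,
              "query_vector_builder": {
                "text_embedding": { "model_id": "googlevertexai_embeddings_004", "model_text": "{{query}}" }
              }
            }
          ]
        }
        """,
    },
)
```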
This design allows for complete customization, putting the extensive Elasticsearch retrieval features at your disposal in a Google Cloud AI environment while ensuring modularity, transparency, and ease of use for different developers, even those unfamiliar with Elastic. Whenever you need BM25 search, semantic search, or a hybrid approach between the two (Have you explored retrievers already? Composable retrieval techniques in a single search API call), you can define your custom logic in a search template, which Vertex AI can automatically leverage. This also applies to embeddings and reranking models you choose to manage vectors and results. Depending on your use case, you may want to host models on Elastic’s ML nodes, use a third-party service endpoint through the Inference API, or run your local model on-prem. This is doable via a search template, and we’ll see how it works in the next section. Start with reference templates, then build your own To help you get started quickly, we’ve provided a set of compatible search template samples to be used as an initial reference; you can then modify and build your custom ones upon: Semantic Search with ELSER model (sparse vectors and chunking) Semantic Search with e5 multilingual model (dense vectors and chunking) Hybrid Search with Vertex AI text-embedding model You can find them in this GitHub repo . Let’s look at one example: creating embeddings with Google Cloud’s Vertex AI APIs on a product catalog. First, we need to create the search template in Elasticsearch as shown below: In this example, we will execute KNN search on two fields within one single search: title_embedding – the vector field containing the name of the product – and description_embedding – the one containing the representation of its description. You can leverage the excludes syntax to avoid returning unnecessary fields to the LLM, which may cause noise in its processing and impact the quality of the final answer. In our example, we excluded the fields containing vectors and image urls. Vectors are created on the fly at query time on the submitted input via an inference endpoint to the Vertex AI embeddings API, googlevertexai_embeddings_004 , previously defined as follows: You can find additional information on how to use Elastic’s Inference API here . We’re now ready to test our templated search: The params fields will replace the variables we set in the template scripts in double curl brackets. Currently, Vertex AI LLM and Grounded Generation APIs can send to Elastic the following input variables: “query” - the user query to be searched “index_name” - the name of the index where to search “num_hits” - how many documents we want to retrieve in the final output Here’s a sample output: The above query is precisely what Google Cloud’s Vertex AI will run on Elasticsearch behind the scenes when referring to the previously created search template. Gemini models will use the output documents to ground its answer: when you ask “What do I need to patch my drywall?” instead of getting a generic suggestion, the chat agent will provide you with specific products! End-to-end GenAI journey with Elastic and Google Cloud Elastic partners with Google Cloud to create production-ready, end-to-end GenAI experiences and solutions. As we’ve just seen, Elastic is the first ISV to be integrated directly into the UI and SDK for the Vertex AI platform, allowing seamless, grounded Gemini models prompts and agents using our vector search features. 
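Tying the pieces together, the templated search that Vertex AI triggers behind the scenes can also be exercised directly. Below is a hedged sketch using the Python client; the index name and parameter values are assumptions, and in production Vertex AI passes query, index_name, and num_hits for you:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-cluster>:443", api_key="<api-key>")

response = es.search_template(
    index="product-catalog",        # assumed index name
    id="google-template-knn",       # the stored template registered earlier
    params={"query": "What do I need to patch my drywall?", "num_hits": 3},
)

for hit in response["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```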
Moreover, Elastic integrates with Vertex AI and Google AI Studio ’s embedding, reranking, and completion models to create and rank vectors without leaving the Google Cloud landscape, ensuring Responsible AI principles. By supporting multimodal approaches, we jointly facilitate applications across diverse data formats. You can tune, test, and export your GenAI search code via our Playground . But it’s not just about building search apps: Elastic leverages Gemini models to empower IT operations, such as in the Elastic AI Assistants, Attack Discovery, and Automatic Import features , reducing daily fatigue for security analysts and SREs on low-value tasks, and allowing them to focus on improving their business. Elastic also enables comprehensive monitoring of Vertex AI usage , tracking metrics and logs, like response times, tokens, and resources, to ensure optimal performance. Together, we manage the complete GenAI lifecycle, from data ingestion and embedding generation to grounding with hybrid search, while ensuring robust observability and security of GenAI tools with LLM-powered actions. Explore more and try it out! Are you interested in trying this out? The feature is currently GA on your Google Cloud projects! If you haven’t already, one of the easiest ways to get started with Elastic Search AI Platform and explore our capabilities is with your free Elastic Cloud trial or by subscribing through Google Cloud Marketplace . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations May 8, 2025 Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. AG By: Ajay Krishnan Gopalan Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. 
JR By: Jeffrey Rengifo Jump to Google Cloud’s Vertex AI and Gemini models grounded on your data with Elasticsearch Production-ready GenAI applications with ease Fully customizable search at your fingertips: search templates Start with reference templates, then build your own End-to-end GenAI journey with Elastic and Google Cloud Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch vector database for native grounding in Google Cloud’s Vertex AI Platform - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-google-cloud-vertex-ai-native-grounding","meta_description":"Elasticsearch is now publicly available as the first third-party native grounding engine for Google Cloud’s Vertex AI platform and Google’s Gemini models. It enables joint users to build fully customizable GenAI experiences grounded in enterprise data, powered by the best-of-breed Search AI capabilities from Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elastic Cloud adds Elasticsearch Vector Database optimized instance to Google Cloud Elasticsearch's vector search optimized profile for GCP is available. Learn more about it and how to use it in this blog. Vector Database Generative AI Elastic Cloud Hosted SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta On April 25, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elastic Cloud Vector Search optimized hardware profile is available for Google Elastic Cloud users. This hardware profile is optimized for applications that require the storage of dense or sparse embeddings for search and Generative AI use cases powered by RAG (retrieval augmented generation). This release follows the previous release of a Vector Search optimized hardware profile for AWS Elastic Cloud users in Nov 2023. GCP Vector Search optimized instances: what you need to know Elastic Cloud users benefit from having Elastic managed infrastructure across all major cloud providers (GCP, AWS and Azure) along with wide region support for GCP users. For more specific details on the instance configuration for this hardware profile, refer to our documentation for instance type: gcp.es.datahot.n2d.64x8x11 Vector Search, HNSW, and memory Elasticsearch uses the Hierarchical Navigable Small World graph (HNSW) data structure to implement its Approximate Nearest Neighbor search (ANN). Because of its layered approach, HNSW's hierarchical aspect offers excellent query latency. 
To be most performant, HNSW requires the vectors to be cached in the node's memory. This caching is done automatically and uses the available RAM not taken up by the Elasticsearch JVM. Because of this, memory optimizations are important steps for scalability. Consult our vector search tuning guide to determine the right setup for your vector search embeddings and whether you have adequate memory for your deployment. With this in mind, the Vector Search optimized hardware profile is configured with a smaller than standard Elasticsearch JVM heap setting. This provides more RAM for caching vectors on a node, allowing users to provision fewer nodes for their vector search use cases. If you’re using compression techniques like scalar quantization , the memory requirement is lowered by a factor of 4. To store quantized embeddings (available in versions Elasticsearch 8.12 and later) simply ensure that you’re storing in the correct element_type: byte . To utilize our automatic quantization of float vectors update your embeddings to use index type: int8_hnsw like in the following mapping example. In upcoming versions, Elasticsearch will provide this as the default mapping, removing the need for users to adjust their mapping. Combining this optimized hardware profile with Elasticsearch’s automatic quantization are two examples where Elastic is focused on vector search to be cost-effective while still being extremely performant. Getting Started with Elastic Cloud vector search optimized profile for GCP Start a free trial on Elastic Cloud and simply select the new Vector Search optimized profile to get started. Migrating existing Elastic Cloud deployments Migrating to this new Vector Search optimized hardware profile is a few clicks away. Simply navigate to your Elastic Cloud management UI, click to manage the specific deployment, and edit the hardware profile. In this example, we are migrating from a ‘Storage optimized’ profile to the new ‘Vector Search’ optimized profile. When choosing to do so, while there is a reduction to available storage and vCPU, what is gained is the ability to store more vectors per memory with vector search. Migrating to a new hardware profile uses the grow and shrink approach for deployment changes. This approach adds new instances, migrates data from old instances to the new ones, and then shrinks the deployment by removing the old instances. This approach allows for high availability during configuration changes even for single availability zones. The following image shows a typical architecture for a deployment running in Elastic Cloud, where vector search will be the primary use case. This example deployment uses our new Vector Search optimized hardware profile, now available in GCP. This setup includes: Two data nodes in our hot tier with our vector search profile One Kibana node One Machine Learning node One integration server One master tiebreaker By deploying these two “full-sized” data nodes with the Vector Search optimized hardware profile and while taking advantage of Elastic’s automatic dense vector scalar quantization , you can index roughly 60 million vectors, including one replica (with 768 dimensions). Conclusion Vector search is a powerful tool when building modern search applications, be it for semantic document retrieval on its own or integrating with an LLM service provider in a RAG setup . Elasticsearch provides a full-featured vector database natively integrated with a full-featured search platform. 
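For reference, the int8_hnsw index type mentioned earlier is configured through the dense_vector field's index_options. A minimal sketch follows; the index name, field name, and dimension count are assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-deployment>:443", api_key="<api-key>")

es.indices.create(
    index="my-vector-index",
    mappings={
        "properties": {
            "my_embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                # int8_hnsw quantizes float vectors automatically at index time,
                # reducing the memory needed to cache vectors by roughly 4x.
                "index_options": {"type": "int8_hnsw"},
            }
        }
    },
)
```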
Along with improving vector search feature set and usability, Elastic continues to improve scalability. The vector search node type is the latest example, allowing users to scale their search application. Elastic is committed to providing scalable, price effective infrastructure to support enterprise grade search experiences. Customers can depend on us for reliable and easy to maintain infrastructure and cost levers like vector compression, so you benefit from the lowest possible total cost of ownership for building search experiences powered by AI. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to GCP Vector Search optimized instances: what you need to know Vector Search, HNSW, and memory Getting Started with Elastic Cloud vector search optimized profile for GCP Migrating existing Elastic Cloud deployments Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Cloud adds Elasticsearch Vector Database optimized instance to Google Cloud - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-vector-profile-gcp","meta_description":"Elasticsearch's vector search optimized profile for GCP is available. Learn more about it and how to use it in this blog."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Storage wins for time-series data in Elasticsearch Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency. 
Search Analytics How To MG KK By: Martijn Van Groningen and Kostas Krikellas On June 10, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this article, we describe the most impactful storage improvements incorporated in our Elasticsearch time-series data offering and provide insights into the scenarios we expect our system to perform better - and worse - with regard to storage efficiency. Background Elasticsearch has recently invested in better support for storing and querying time-series data. Storage efficiency has been a main area of focus, with many projects delivering substantial wins that can lead to savings of up to 60-80%, compared to saving the data in standard indices. In certain scenarios, our system achieves storage efficiency of less than one byte per data point, competing head-to-head with state-of-the-art, specialized TSDB systems. Let's take a look at the recent improvements in storage efficiency for time-series data. Storage improvements in Elasticsearch time-series data Synthetic source Elasticsearch stores the original JSON document body in the _source field by default. This duplication penalizes storage with diminishing returns for metrics, as they are normally inspected through aggregation queries that don’t use this field. To mitigate this, we introduced synthetic _source that reconstructs a flavor of the original _source on demand, using the data stored in the document fields. The caveats are that a limited number of field types are supported and _source synthesizing is slower than retrieving it from a stored field. Still, these restrictions are largely irrelevant for metrics datasets that mostly rely on keyword, numeric, boolean and IP fields and use aggregate queries that don’t take the _source content into account. We’re separately working on eliminating these limitations to make synthetic source applicable to any mapping. The storage wins are immediate and apparent: enabling synthetic source reduces the size of time series data stream (TSDS) indices by 40-60% (more on performance evaluation below). Synthetic source is thus used by default in TSDS since it was released (v.8.7). Specialized codecs TSDB systems make heavy use of specialized codecs that take advantage of the chronological ordering of recorded metrics to reduce the number of bytes per data point. Our offering extends the standard Lucene codecs with support for run-length encoding , delta-of-deltas (2nd derivative), GCD and XOR encoding for numeric values. Codecs are specified at the Lucene segment level, so older indices can take advantage of the latest codecs when indexing fresh data. To boost the efficiency of these compression techniques, indices get sorted by an identifier calculated over all dimension fields (ascending order), and then by timestamp (descending order, to return the latest data point per time series). This way, dimension fields (mostly keywords) get efficiently compressed with run-length encoding, while numeric values for metrics get clustered per time-series and ordered by time. Since most time-series change slowly over time, with occasional spikes, and Elasticsearch relies on Lucene’s vertically partitioned storage engine , this approach minimizes deltas between successively stored data and boosts storage efficiency. 
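To see how these pieces are enabled in practice, here is a hedged sketch of an index template for a TSDS. Setting index.mode to time_series turns on index sorting by dimensions and timestamp, the time-series codecs, and synthetic _source by default; the template name and field names are illustrative assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-deployment>:443", api_key="<api-key>")

es.indices.put_index_template(
    name="k8s-metrics-template",          # assumed template name
    index_patterns=["k8s-metrics-*"],
    data_stream={},                       # a TSDS is backed by a data stream
    template={
        "settings": {
            "index.mode": "time_series",
            "index.routing_path": ["pod", "node"],  # dimension fields used for routing
        },
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "pod": {"type": "keyword", "time_series_dimension": True},
                "node": {"type": "keyword", "time_series_dimension": True},
                "cpu_usage": {"type": "double", "time_series_metric": "gauge"},
                "network_rx_bytes": {"type": "long", "time_series_metric": "counter"},
            }
        },
    },
)
```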
Metadata trimming The _id field is a metadata field used to uniquely identify each document in Elasticsearch. It has limited value for metrics applications, since time-series analysis relies on queries aggregating values over time, rather than inspecting individual metric values. To that end, TSDS trims the stored values but keeps the inverted index for this field to still support doc retrieval queries. This leads to 10-20% storage reduction with no loss of functionality. Lifecycle integration TSDSs can be integrated with data lifecycle management mechanisms, namely ILM and Data Stream Lifecycle . These tools automate deleting older indexes, while ILM also supports moving indices to tiers with cheaper storage (e.g. using spinning disks or archival cloud storage) as they age. Lifecycle management reduces storage costs, with no compromise on querying performance for frequently-accessed metrics and with minimal user involvement. Downsampling In many metrics applications, it’s preferable to keep finely-grained data in the short term only (e.g. per-minute data for the last week), and acceptable to increase granularity for older data to save on storage (e.g. per-hour data for the last month, per-day data for the last 2 years). Downsampling replaces raw metrics data with a statistical representation of pre-aggregated metrics over configurable time periods (e.g. hourly or daily). This improves both storage efficiency, since the size of downsampled indices is a fraction of the raw metrics indices, and querying performance, since aggregation queries scan pre-aggregated results instead of calculating them over raw data on-the-fly. Downsampling is integrated with ILM and DSL that automate its application and allow for different resolutions of downsampled data as they age. Test results for TSDS storage efficiency TSDS storage gains We track performance, including storage usage and efficiency, for TSDS through nightly benchmarks . The TSDB track (see disk usage visualization) visualizes the impact of our storage improvements. We’ll next present storage usage before TSDS was released, how it improved when TSDS was GA-ed, and what’s the current status. The TSDB track’s dataset ( k8s metrics) has nine dimension fields, with each document containing 33 fields (metrics and dimensions) on average. The index contains a day's worth of metrics over 116,633,696 documents. Indexing the TSDB track’s dataset before ES version 8.7 required 56.9GB of storage. It is interesting to break this down by metadata fields, the timestamp field, dimension fields and metric fields to gain insight into storage usage: Field name Percentage _id 5.1% _seq_no 1.4% _source 78.0% @timestamp 1.31% Dimension fields 2.4% Metric fields 5.1% Other fields 9.8% The _source metadata field is the largest contributor to the storage footprint by far. Synthetic source was one of the improvements that our metrics effort motivated to improve storage efficiency, as mentioned earlier. This is evident in ES 8.7 that uses synthetic source for TSDS by default. In this case, the storage footprint drops to 6.5GB - a 8.75x improvement in storage efficiency. Breaking this down by field type: Field name Percentage _id 18.7% _seq_no 14.1% @timestamp 12.6% Dimension fields 3.6% Metric fields 12.0% Other fields 50.4% The improvement is due to the _source no longer being stored, as well as applying index sorting to store metrics from the same time series sequentially, thus boosting the efficiency of standard Lucene codecs. 
Indexing the TSDB track’s dataset with ES 8.13.4 occupies 4.5GB of storage - a further 44 % improvement. The breakdown by field type is: Field name Percentage _id 12.2% _seq_no 20.6% @timestamp 14.0% Dimension fields 1.6% Metric fields 6.7% Other fields 58.6% This is a substantial improvement, compared to the 8.7.0 version. The main contributing factors to the latest iteration are the _id field taking up less storage space (its stored values get trimmed), while dimension fields and other numeric fields get compressed more efficiently using the latest time-series codecs. The majority of storage is now attributed to “other fields”, i.e. fields providing context similar to dimensions but are not used in calculating the identifier that’s used for index sorting, so their compression is not as efficient as with dimension fields. Downsample storage gains Downsampling trades querying resolution for storage gains, depending on the downsampling interval. Downsampling the metrics in TSDB track’s dataset (with metrics collected every 10 seconds) using a 1-minute interval results in an index of 748MB - a 6x improvement. The downside is that metrics get pre-aggregated on a per-minute granularity, so it’s no longer possible to inspect individual metric recordings or aggregate over sub-minute intervals (e.g. per 5 seconds). Most importantly, aggregation results on the pre-computed statistics (min, max, sum, count, average) are the same as if calculated over the original data, so downsampling doesn’t incur any cost in accuracy. If lower resolution can be tolerated and metrics get downsampled using an hourly interval, the resulting downsampled index will use just 56MB of storage. Note that the improvement is 13.3x , i.e. lower than 60x that one would expect from switching from a per-minute downsampling interval to a per-hour one. This is due to additional metadata that all indices require to store per segment, a constant overhead that becomes more noticeable as the index size reduces. Putting everything together The following graph shows how storage efficiency evolved across versions, as well as what additional savings downsampling can provide. Kindly note that the vertical axis is in logarithmic scale. In total, we achieved a 12.5x improvement in storage efficiency for our metrics offering over the past releases. This can even reach 1000x or better, if we trade bucketing resolution for reduced storage footprint through downsampling. Configuration hints for TSDS In this section, we explore best practices for configuring a TSDS with storage efficiency in mind. Favor many metrics per document While Elasticsearch uses vertical partitioning to store each field separately, fields are still grouped logically in docs. Since metrics share dimensions that are included in the same doc, the storage overhead for dimensions and metadata gets better amortized when we include as many metrics as possible in each indexed doc. On the flip side, storing a single metric in each doc, along with its associated dimensions, maximizes the overhead of dimensions and metadata and bloats storage. More concretely, we used synthetic datasets to quantify the impact of the number of metrics per document. When we included all metrics (20) in each indexed doc, TSDS used as little as 0.9 bytes per data point - approximating the performance of state-of-the-art, purpose-built metrics systems (0.7 bytes per data point) that lack the rich indexing and querying capabilities of Elasticsearch for unstructured data. 
Conversely, when each indexed doc had a single metric, TSDS required 20 bytes per data point , a substantial increase in the storage footprint. It therefore pays off to group together as many metrics as possible in each indexed doc, sharing the same dimensions values. Trim unnecessary dimensions The Elasticsearch architecture allows our metrics offering to scale far better than competing systems, when it comes to the number of time series per metric (i.e. the product of the dimension cardinalities) in the order of millions or more, at a manageable performance cost. Still, dimensions do take considerable space and high cardinalities reduce the efficiency of our compression techniques for TSDS. It’s therefore important to carefully consider what fields are included in indexed documents for metrics and aggressively prune dimensions to the minimum required set for dashboards and troubleshooting. One interesting example here was an Observability mapping including an IP field that turned out to contain up to 16 IP ( v4, v6) addresses of the hosting machine. It had a substantial impact on both storage footprint and indexing throughput and was hardly used. Replacing it with a machine label led to a sizable storage improvement with no loss of debuggability. Use lifecycle management ILM facilitates moving older, infrequently-accessed data to cheaper storage options, and both ILM and Data Stream Lifecycle can handle deleting metrics data as they age. This fully-automated approach reduces storage costs without changing index mappings or configuration and is thus highly encouraged. More so, it is worth considering trading metrics resolution for storage through downsampling, as data ages. This technique leads to both substantial storage wins and more responsive dashboards, assuming that the reduction in bucketing resolution is acceptable for older data - a common case in practice, as it’s fairly rare to inspect months-old data at a per-minute granularity, for instance. Next steps We’ve achieved a significant improvement in the storage footprint for metrics over the past years. We intend to apply these optimizations to additional data types beyond metrics, and specifically logs data. While some features are metrics-specific, such as downsampling, we still hope to see reductions in the order of 2-4x using a logs-specific index configuration. Despite reducing the storage overhead of metadata fields that all Elasticsearch indices require, we plan to trim them more aggressively. Good candidates are the _id and _seq_no fields. Furthermore, there are opportunities to apply more advanced indexing techniques, such as sparse indices , to timestamps and other fields supporting range queries. The downsampling mechanism has a big potential for improved querying performance, if a small storage penalty is acceptable. One idea is to support multiple downsampling resolutions (e.g. raw, per-hour and per-day) on overlapping time periods, with the query engine automatically picking the most appropriate resolution for each query. This would allow users to spec downsampling to match their dashboard time scaling and make them more responsive, as well as kick off downsampling within minutes after indexing. It would also unlock keeping raw data along with downsampled, potentially using a slower/cheaper storage layer. 
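To illustrate the lifecycle-plus-downsampling combination described above, here is a hedged sketch of an ILM policy that rolls over the backing index, downsamples older data to an hourly resolution, and eventually deletes it. The policy name, phase timings, and interval are assumptions:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-deployment>:443", api_key="<api-key>")

es.ilm.put_lifecycle(
    name="metrics-downsample-policy",  # assumed policy name
    policy={
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_age": "1d", "max_primary_shard_size": "50gb"}
                }
            },
            "warm": {
                "min_age": "7d",
                "actions": {
                    # Replace raw data points with hourly pre-aggregated statistics.
                    "downsample": {"fixed_interval": "1h"}
                },
            },
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    },
)
```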
Try it out Sign up for Elastic Cloud , if you don’t have an account yet Configure a TSDS and use it for storing and querying metrics Explore downsampling to see if it fits your use case Enjoy the storage savings Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Background Storage improvements in Elasticsearch time-series data Synthetic source Specialized codecs Metadata trimming Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Storage wins for time-series data in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/time-series-data-elasticsearch-storage-wins","meta_description":"Explore Elasticsearch's storage improvements for time series data and best practices for configuring a TSDS with storage efficiency."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. AutoOps Elastic Cloud Hosted ZS OS By: Ziv Segal and Ori Shafir On November 6, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. While Elasticsearch is a powerful and scalable search engine that offers a vast selection of capabilities, many users find it challenging due to its sometimes complex administration and management experience. 
We hear you, and we're excited to share some big news! The Opster team has been hard at work making AutoOps even better, and a seamless part of the Elastic platform. AutoOps is available in select Elastic Cloud regions, and coverage is rapidly expanding! AutoOps makes Elastic Cloud easy to operate AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. With AutoOps, you will be able to: Minimize administration time with insights tailored to your Elasticsearch utilization and configuration Analyze hundreds of Elasticsearch metrics in real-time with pre-configured alerts to detect and flag issues before they become critical Get root cause analysis with drill-downs to point-in-time of issue occurrence, and resolution suggestions including in-context Elasticsearch commands Improve resource utilization by providing optimization suggestions In each of the scenarios below, let’s see examples of issues that users may come across, and how AutoOps insights (screenshots) can help right away! Real scenarios: how AutoOps makes Elasticsearch easy to operate The scenarios below provide real-world issues and how AutoOps provide root cause analysis, with drill-downs to point-in-time of issue occurrence, and recommendations on how to resolve the issue. Scenario #1: Finding a query causing severe search latency Issue: Users complain that their dashboards are slow and take a long time to load… AutoOps insight: AutoOps reports a “Long running search task” event, identifying a search running for 4 minutes with 4 nested aggregations and suggesting ways to optimize the query causing the latency. Resolution AutoOps provides a cURL command to cancel the query. By identifying and canceling the long running search task, the administrator was able to block this specific query. AutoOps monitors the Task Management API and flags long running search tasks providing an easy way to detect long running search queries and optimize them. AutoOps provides in-context Elasticsearch commands to resolve the issues, such as canceling the long running search task Scenario #2: Ineffective use of data tiering, leading to slow search and indexing Issue: Users report slow search performance and indexing. AutoOps insight: AutoOps detects multiple issues stemming from increased load due to indexing activities on warm nodes, resulting in a high indexing queue and slow searches on one of these nodes. AutoOps detects that indexing activities are occurring in warm nodes, there is a high indexing queue and slow searches were detected on one of those warm nodes. Resolution: The team updated their ILM policy to ensure that indices only move off the hot tier once no further indexing activities are expected. AutoOps detects that indexing occurred in the hot tier AutoOps detects that the Index queue is high and provides a list of recommendations for resolution AutoOps Slow search performance event - detects slow search performance on the loaded node Scenario #3: Investigating production down time Issue: An outage is reported, and CPU usage on the cluster is high momentarily AutoOps insight: AutoOps identifies the time window during which CPU utilization was high, and provides a drill-down into the point of time of the issue with a recommendation to check slow logs. Drilling down further into the node view reveals that the CPU is high every day, at about 7am. 
Resolution SRE finds a script scheduled to run daily at 7 am, by amending the script they are able to fix the issue and stabilize the cluster. AutoOps provides hyperlinks for quick drill-downs into detected issues Drill down screens provide extra context with metrics on nodes, indices and shards and templates optimizations Scenario #4: Customer Kibana dashboards are slow Issue Customers complain that Kibana dashboards are at times slower than usual AutoOps insight AutoOps detects large shards that could lead to slow search performance and recommends reindexing into smaller indices and reviewing the ILM policy. Resolution The team follows AutoOps’ recommendation to change the shards sizes, improving the dashboard’s responsiveness and cluster stability. AutoOps monitors shards sizes and alerts when and how to optimize shards AutoOps and Elastic: name a more iconic duo! By analyzing hundreds of Elasticsearch metrics, your configuration, and usage patterns, AutoOps recommends operational and monitoring insights that deliver real savings in administration time and hardware costs. Elasticsearch Performance optimizations: AutoOps tells you exactly how to keep your Elasticsearch clusters running smoothly. It offers tailored insights based on your specific usage and configuration, helping you maintain high performance. Real-time Issue Detection for Elasticsearch specific issues: AutoOps continuously analyzes hundreds of Elasticsearch metrics and provides pre-configured alerts to catch issues like ingestion bottlenecks, data structure misconfigurations, unbalanced loads, slow queries, and more - before they become bigger issues. Easy Troubleshooting: Troubleshooting can be complex, especially in larger environments. AutoOps performs root cause analysis and provides drill-downs to the exact point in time when an issue occured, and resolution paths including in-context Elasticsearch commands and best practices. Cost visibility and optimization for Elasticsearch deployments : AutoOps identifies underutilized nodes, small or large indices and shards, and suggests data tier optimizations. This can help with better resource utilization and potential hardware cost savings. Seamless Integration: AutoOps isn't just a standalone tool; it's built into Elastic Cloud and integrates with alerting and messaging frameworks (MS Teams and Slack), incident management systems (PagerDuty and OpsGenie) and other tools. You can customize AutoOps alerts and notifications for your use case. Query Optimization, Template Optimization, and lots more! Built into AutoOps is our expertise in running and managing lots of types of Elastic environments. AutoOps identifies and alerts you on expensive queries, data types that exist and if/when they should (or should not) be used, for example storing numbers as integers/longs so they are optimized for range queries. There are many other types of suggestions built in, that we hope you will find useful! When will AutoOps be available for me? We’re rolling out AutoOps in phases, starting with select Elastic Cloud Hosted regions and coverage is expanding rapidly. Next up, we’ll focus on our Elastic Cloud Serverless users. While Elastic Cloud Serverless is already making Elasticsearch easier to use, AutoOps will take it to the next level by offering advanced monitoring and optimization capabilities. And for our self-managed customers, we haven’t forgotten you. Plans are in the works to bring AutoOps your way, too! 
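As a small illustration of that last point, storing a value as a numeric type rather than a keyword lets range queries run against numeric data structures. A hedged sketch, with assumed index and field names:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://<your-deployment>:443", api_key="<api-key>")

# Map the field as an integer so range queries are served by numeric structures.
es.indices.create(
    index="requests",
    mappings={"properties": {"response_time_ms": {"type": "integer"}}},
)

# A range query over the numeric field.
es.search(index="requests", query={"range": {"response_time_ms": {"gte": 100, "lte": 500}}})
```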
Try AutoOps: the easy way to operate Elasticsearch Elasticsearch is powerful, but should also be as simple and efficient as possible for your use. With AutoOps, we’re delivering on that promise in a big way. Whether you’re striving for optimal performance or looking to cut costs, AutoOps offers insights and tools to help you. Got questions or eager to dive into AutoOps? Here are some ways to start, and happy optimizing! AutoOps home page - watch three minute video Try AutoOps using an Elastic Cloud trial account AutoOps product documentation Report an issue Related content AutoOps January 2, 2025 Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. VC By: Valentin Crettaz AutoOps December 18, 2024 Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. MD By: Musab Dogan AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne Vector Database Generative AI +2 May 21, 2024 Elasticsearch delivers performance increase for users running the Elastic Search AI Platform on Arm-based architectures Benchmarking in preview provides Elasticsearch up to 37% better performance on Azure Cobalt 100 Arm-based VMs. YG HM By: Yuvraj Gupta and Hemant Malik Vector Database Generative AI +1 May 21, 2024 Elastic Cloud adds Elasticsearch Vector Database optimized profile to Microsoft Azure Elasticsearch added a new vector search optimized profile to Elastic Cloud on Microsoft Azure. Get started and learn how to use it here. SC JV YG By: Serena Chou , Jeff Vestal and Yuvraj Gupta Jump to AutoOps makes Elastic Cloud easy to operate Real scenarios: how AutoOps makes Elasticsearch easy to operate Scenario #1: Finding a query causing severe search latency Scenario #2: Ineffective use of data tiering, leading to slow search and indexing Scenario #3: Investigating production down time Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"AutoOps makes every Elasticsearch deployment simple(r) to manage - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/autoops-elasticsearch-easy-operations","meta_description":"Learn about AutoOps for Elasticsearch and how it simplifies cluster management with performance recommendations, resource utilization, and cost insights."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. Generative AI How To JW By: James Williams On March 26, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Azure AI Document Intelligence is a powerful tool for extracting structured data from PDFs. It can be used to effectively extract text and table data. Once the data is extracted, it can be indexed into Elastic Cloud Serverless to power RAG (Retrieval Augmented Generation). In this blog, we will demonstrate how powerful Azure AI Document Intelligence is by ingesting four recent Elastic N.V. quarterly reports. The PDFs range from 43 to 196 pages in length and each PDF contains both text and table data. We will test the retrieval of table data with the following prompt: Compare/contrast subscription revenue for Q2-2025, Q1-2025, Q4-2024 and Q3-2024? This prompt is tricky because it requires context from four different PDFs that represent this information in tabular format. Let’s walk through an end-to-end reference that consists of two main parts: Python notebook Downloads four quarters’ worth of PDF 10-Q filings for Elastic N.V. Uses Azure AI Document Intelligence to parse the text and table data from each PDF file Outputs the text and table data into a JSON file Ingests the JSON files into Elastic Cloud Serverless Elastic Cloud Serverless Creates vector embeddings for PDF text+table data Powers vector search database queries for RAG Pre-configured OpenAI connector for LLM integration A/B test interface for chatting with the 10-Q filings Prerequisites The code blocks in this notebook require API keys for Azure AI Document Intelligence and Elasticsearch. The best starting point for Azure AI Document intelligence is to create a Document Intelligence resource . For Elastic Cloud Serverless, refer to the get started guide. You will need Python 3.9+ to run these code blocks. Create an .env file Place secrets for Azure AI Document Intelligence and Elastic Cloud Serverless in a .env file. Install Python packages Create input and output folders Download PDF files Download four recent Elastic 10-Q quarterly reports. If you already have PDF files, feel free to place them in the ‘./pdf’ folder. Parse PDFs using Azure AI Document Intelligence A lot is going on in code blocks that parse the PDF files. Here’s a quick summary: Set Azure AI Document Intelligence imports and environment variables Parse PDF paragraphs using AnalyzeResult Parse PDF tables using AnalyzeResult Combine PDF paragraph and table data Bring it all together by doing 1-4 for each PDF file and store the result in JSON Set Azure AI Document Intelligence imports and environment variables The most important import is AnalyzeResult. 
This class represents the outcome of a document analysis and contains details about the document. The details we care about are pages, paragraphs and tables. Parse PDF paragraphs using AnalyzeResult Extract the paragraph text from each page. Do not extract table data. Parse the PDF tables using AnalyzeResult Extract the table content from each page. Do not extract paragraph text. The most interesting side effect of this technique is that there is no need to transform table data. LLMs know how to read text that looks like: “Cell [0, 1]: table data…”. Combine PDF paragraph and table data Pre-process chunking at the page level preserves context so that we can easily validate RAG retrieval manually. Later, you will see that this pre-process chunking will not have a negative effect on the RAG output. Bring it all together Open each PDF in the ./pdf folder, parse the text and table data, and save the result in a JSON file that has entries for page_number, content_text and pdf_file. The content_text field represents the page paragraphs and table data for each page. Load data into Elastic Cloud Serverless The code blocks below handle: Set imports for the Elasticsearch client and environment variables Create index in Elastic Cloud Serverless Load the JSON files from ./json directory into the pdf-chat index Set imports for the Elasticsearch client and environment variables The most important import is Elasticsearch. This class is responsible for connecting to Elastic Cloud Serverless to create and populate the pdf-chat index. Create index in Elastic Cloud Serverless This code block creates an index named “pdf-chat” that has the following mappings: page_content - For testing RAG using full-text search page_content_sparse - For testing RAG using sparse vectors page_content_dense - For testing RAG using dense vectors page_number - Useful for constructing citations pdf_file - Useful for constructing citations Notice the use of copy_to and semantic_text. The copy_to utility copies page_content to two semantic text fields. Each semantic text field maps to an ML inference endpoint, one for the sparse vector and one for the dense vector. Elastic-powered ML inference will auto-chunk each page into 250 token chunks with a 100 token overlap. Load the JSON files from ./json directory into the pdf-chat index This process will take several minutes to run because we are: Loading 402 pages of PDF data Creating sparse text embeddings for each page_content chunk Creating dense text embeddings for each page_content chunk There is one last code trick to call out. We are going to set the Elasticsearch document ID using the following naming convention: FILENAME_PAGENUMBER. This will make it easy/breezy to see the PDF file and page number associated with citations in Playground. Elastic Cloud Serverless Elastic Cloud Serverless is an excellent choice for prototyping a new Retrieval-Augmented Generation (RAG) system because it offers fully managed, scalable infrastructure without the complexity of manual cluster management. It supports both sparse and dense vector search out of the box, allowing you to experiment with different retrieval strategies efficiently. With built-in semantic text embedding, relevance ranking, and hybrid search capabilities, Elastic Cloud Serverless accelerates iteration cycles for search powered applications. With the help of Azure AI Document Intelligence and a little Python code, we are ready to see if we can get the LLM to answer questions grounded in truth.
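Before moving on, here is a rough sketch of the index creation and ID convention described above, using the Python client. The inference endpoint IDs (my-elser-endpoint, my-e5-endpoint), connection details and sample document are placeholders, not the article's exact code:

# Illustrative only -- endpoint IDs, credentials and the sample doc are placeholders.
import os
from elasticsearch import Elasticsearch

es = Elasticsearch(os.environ["ES_URL"], api_key=os.environ["ES_API_KEY"])

es.indices.create(
    index="pdf-chat",
    mappings={
        "properties": {
            "page_content": {
                "type": "text",
                "copy_to": ["page_content_sparse", "page_content_dense"],
            },
            "page_content_sparse": {"type": "semantic_text", "inference_id": "my-elser-endpoint"},
            "page_content_dense": {"type": "semantic_text", "inference_id": "my-e5-endpoint"},
            "page_number": {"type": "integer"},
            "pdf_file": {"type": "keyword"},
        }
    },
)

# Document IDs follow FILENAME_PAGENUMBER so each citation maps back to a PDF page.
es.index(
    index="pdf-chat",
    id="elastic-10q-q2-2025_12",
    document={
        "page_content": "…page paragraphs and table cells…",
        "page_number": 12,
        "pdf_file": "elastic-10q-q2-2025.pdf",
    },
)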
Let’s open Playground and conduct some manual A/B testing using different query strategies. Full text search This query will return the top ten pages of content that get a full-text search hit. Full-text search came close but it was only able to provide the right answer for three out of four quarters. This is understandable because we are stuffing the LLM context with ten full pages of data. And, we are not leveraging semantic search. Sparse vector search This query will return the top two semantic text fragments from pages that match our query using powerful sparse vector search. Sparse vector search powered by Elastic’s ELSER does a really good job retrieving table data from all four PDF files. We can easily double check the answers by opening the PDF page number associated with each citation. Dense vector search Elastic also provides an excellent dense vector option for semantic text ( E5 ). E5 is good for multi-lingual data and it has lower inference latency for high query per second use cases. This query will return the top two semantic text fragments that match our user input. The results are the same as with sparse search but notice how similar both queries are. The only difference is the “field” name. Hybrid search ELSER is so good for this use case that we do not need hybrid search. But, if we wanted to, we could combine dense and vector search into a single query. Then, rerank the results using RRF(Reciprocal Rank Fusion) . So what did we learn? Azure AI Document Intelligence Is very capable of parsing both text and table data in PDF files. Integrates well with the elasticsearch python client . Elastic Serverless Cloud Has built-in ML inference for sparse and dense vector embeddings at ingest and query time. Has powerful RAG A/B test tooling that can be used to identify the best retrieval technique for a specific use case. There are other techniques and technologies that can be used to parse PDF files. If your organization is all-in on Azure, this approach can deliver an excellent RAG system. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Create an .env file Install Python packages Create input and output folders Download PDF files Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. 
Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Parse PDF text and table data with Azure AI Document Intelligence - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/azure-ai%E2%80%93document-intelligence-parse-pdf-text-tables","meta_description":"Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to automate synonyms and upload using our Synonyms API Discover how LLMs can be used to identify and generate synonyms automatically, allowing terms to be programmatically loaded into the Elasticsearch synonym API. Search Relevance How To AL By: Andre Luiz On March 27, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Improving the quality of search results is essential for providing an efficient user experience. One way to optimize searches is by automatically expanding the queried terms through synonyms. This allows queries to be interpreted more broadly, covering language variations and thus improving result matching. This blog explores how large language models (LLMs) can be used to identify and generate synonyms automatically, allowing these terms to be programmatically loaded into Elasticsearch's synonym API. When to use synonyms? The use of synonyms can be a faster and more cost-effective solution compared to vector search. Its implementation is simpler as it does not require deep knowledge of embeddings or a complex vector ingestion process. Additionally, resource consumption is lower since vector search demands greater storage capacity and memory for embedding indexing and retrieval. Another important aspect is search regionalization. With synonyms, it is possible to adapt terms according to local language and customs. This is useful in situations where embeddings may fail to match regional expressions or country-specific terms. For example, some words or acronyms may have different meanings depending on the region, but are naturally treated as synonyms by local users. In Brazil, this is quite common. \"Abacaxi\" and \"ananás\" are the same fruit (pineapple), but the second term is more commonly used in some regions of the Northeast. Similarly, the well-known \"pão francês\" in the Southeast may be known as \"pão careca\" in the Northeast. How to use LLMs to generate synonyms? To obtain synonyms automatically, we can use LLMs, which analyze the context of a term and suggest appropriate variations. This approach allows for dynamically expanding synonyms, ensuring a broader and more accurate search without relying on a fixed dictionary. 
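To make the idea concrete before the demonstration, here is a rough end-to-end sketch: ask an LLM for synonyms, store them with the Synonyms API, and reference the set from a search-time synonym filter. The model choice, prompt, and names used here are illustrative assumptions, not the exact code used later in the article:

# Sketch only -- model, prompt and names are illustrative.
import json
from openai import OpenAI
from elasticsearch import Elasticsearch

llm = OpenAI()  # uses OPENAI_API_KEY from the environment
es = Elasticsearch("http://localhost:9200")

def suggest_synonyms(category: str, product_name: str) -> list[str]:
    prompt = (
        f"Suggest up to 5 search synonyms for the product '{product_name}' "
        f"in the '{category}' category. Answer with a JSON array of strings only."
    )
    response = llm.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)

# Store the suggestions as a rule in a synonym set managed by the Synonyms API.
terms = suggest_synonyms("Electronics", "smartphone")
es.synonyms.put_synonym(
    id="products-synonyms-set",
    synonyms_set=[{"id": "smartphone", "synonyms": ", ".join(["smartphone"] + terms)}],
)

# An index can then apply the set through an updateable, search-time synonym filter.
es.indices.create(
    index="products_02",
    settings={
        "analysis": {
            "filter": {
                "synonyms_filter": {
                    "type": "synonym_graph",
                    "synonyms_set": "products-synonyms-set",
                    "updateable": True,
                }
            },
            "analyzer": {
                "synonyms_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "synonyms_filter"],
                }
            },
        }
    },
    mappings={
        "properties": {
            "name": {"type": "text", "analyzer": "standard", "search_analyzer": "synonyms_analyzer"}
        }
    },
)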
In this demonstration, we will use an LLM to generate synonyms for e-commerce products. Many searches return few or no results due to variations in the queried terms. With synonyms, we can solve this issue. For example, a search for \"smartphone\" can encompass different models of mobile phones, ensuring users find the products they are looking for. Prerequisites Before getting started, we need to set up the environment and define the required dependencies. We will use the solution provided by Elastic to run Elasticsearch and Kibana locally in Docker . The code will be written in Python, v3.9.6, with the following dependencies: Creating the product index Initially, we will create an index of products without synonym support. This will allow us to validate queries and then compare them to an index that includes synonyms. To create the index, we bulk load a product dataset using the following command in Kibana DevTools: Generating synonyms with LLM In this step, we will use an LLM to dynamically generate synonyms. To achieve this, we will integrate the OpenAI API, defining an appropriate model and prompt. The LLM will receive the product category and name, ensuring that the synonyms are contextually relevant. From the created product index, we will retrieve all items in the \"Electronics\" category and send their names to the LLM. The expected output will be something like: With the generated synonyms, we can register them in Elasticsearch using the Synonyms API. Managing synonyms with the Synonyms API The Synonyms API provides an efficient way to manage synonym sets directly within the system. Each synonym set consists of synonym rules, where a group of words is treated as equivalent in searches. Example of creating a synonym set This creates a set called \"my-synonyms-set,\" where \"hello\" and \"hi\" are treated as equivalents, as well as \"bye\" and \"goodbye.\" Implementing synonym creation for the product catalog Below is the method responsible for building a synonym set and inserting it into Elasticsearch. The synonym rules are generated based on the mapping of synonyms suggested by the LLM. Each rule has an ID, corresponding to the product name in slug format, and the list of synonyms calculated by the LLM. Below is the request payload to create the synonym set: With the synonym set created in the cluster, we can move on to the next step, which is creating a new index with synonym support using the defined set. The complete Python code with the synonyms generated by LLM and the synonym set creation defined by the Synonyms API is below: Creating an index with synonym support A new index will be created where all data from the products index will be reindexed. This index will use the synonyms_filter , which applies the products-synonyms-set created earlier. Below is the index mapping configured to use synonyms: Reindexing the products index Now, we will use the Reindex API to migrate the data from the products index to the new products_02 index, which includes synonym support. The following code was executed in Kibana DevTools: After the migration, the products_02 index will be populated and ready to validate searches using the configured synonym set. Validating search with synonyms Let's compare the search results between the two indexes. We will execute the same query on both indexes and validate whether the synonyms are being used to retrieve results. Search in the products index (without synonyms) We will use Kibana to perform searches and analyze the results. 
In the Analytics > Discovery menu, we will create a Data View to visualize the data from the indexes we created. Within Discovery, click on Data View and define a name and an index pattern. For the \" products \" index, we will use the \" products ” pattern. Then, we will repeat the process to create a new Data View for the \" products_02 \" index, using the \" products_02” pattern. With the Data Views configured, we can return to Analytics > Discovery and start the validations. Here, after selecting DataView products and performing a search for the term \"tablet\", we get no results, even though we know that there are products like \"Kindle Paperwhite\" and \"Apple iPad Air\". Search in the products_02 index (support synonyms) When performing the same query on the \" products_synonyms \" Data View, which supports synonyms, the products were retrieved successfully. This demonstrates that the configured synonym set is working correctly, ensuring that different variations of the searched terms return the expected results. We can achieve the same result by running the same query directly in Kibana DevTools. Simply search the products_02 index using the Elasticsearch Search API: Conclusion Implementing synonyms in Elasticsearch improved the accuracy and coverage of product catalog searches. The key differentiator was the use of an LLM , which generated synonyms automatically and contextually, eliminating the need for predefined lists. The model analyzed product names and categories, ensuring relevant synonyms for e-commerce. Additionally, the Synonyms API simplified dictionary management, allowing synonym sets to be modified dynamically. With this approach, search became more flexible and adaptable to different user query patterns. This process can be continually improved with new data and model adjustments, ensuring an increasingly efficient research experience. References Run Elasticsearch locally https://www.elastic.co/guide/en/elasticsearch/reference/current/run-elasticsearch-locally.html Synonyms API https://www.elastic.co/guide/en/elasticsearch/reference/current/synonyms-apis.html Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to When to use synonyms? How to use LLMs to generate synonyms? 
Prerequisites Creating the product index Generating synonyms with LLM Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to automate synonyms and upload using our Synonyms API - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-synonyms-automate","meta_description":"Discover how LLMs can be used to identify and generate synonyms automatically, allowing terms to be programmatically loaded into the Elasticsearch synonym API."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Ingest autoscaling in Elasticsearch Learn more about how Elasticsearch autoscales to address ingestion load. Elastic Cloud Serverless PS HA FC By: Pooya Salehi , Henning Andersen and Francisco Fernández Castaño On July 29, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Sizing an Elasticsearch cluster correctly is not easy. The optimal size of the cluster depends on the workload that the cluster is experiencing, which may change over time. Autoscaling adapts the cluster size to the workload automatically without human intervention. It avoids over-provisioning resources for the cluster only to accommodate peak usage and it also prevents degrading cluster performance in case of under-provisioning. We rely on this mechanism to free users of our Elastic Cloud Serverless offering from having to make sizing decisions for the indexing tier . Ingest autoscaling requires continuously estimating the resources required to handle the incoming workload, and provisioning and de-provisioning these resources in a timely manner. In this blog post we explore ingest autoscaling in Elasticsearch, covering the following: How ingest autoscaling works in Elasticsearch Which metrics we use to quantify the indexing workload the cluster experiences in order to estimate resources required to handle that workload How these metrics drive the autoscaling decisions. Ingest autoscaling overview Ingest autoscaling in Elasticsearch is driven by a set of metrics that is exposed by Elasticsearch itself. These metrics reflect the ingestion load and the memory requirement of the indexing tier. Elasticsearch provides an autoscaling metrics API that serves these metrics which allows an external component to monitor these metrics and make decisions whether the cluster size needs to change (see Figure 1). In the Elastic Cloud Serverless service, there is an autoscaler component which is a Kubernetes Controller. The autoscaler polls the Elasticsearch autoscaling metrics API periodically and calculates the desired cluster size based on these metrics. 
If the desired cluster size is different from the current one, the autoscaler changes the cluster size to consolidate the available resources in the cluster towards the desired resources. This change is both in terms of the number of Elasticsearch nodes in the cluster and the CPU, memory and disk available to each node. Figure 1 : ingestion autoscaling overview An important consideration for ingest autoscaling is that when the cluster receives a spike in the indexing load the autoscaling process can take some time until it effectively adapts the cluster size. While we try to keep this reaction time as low as possible, it cannot be instantaneous. Therefore, while the cluster is scaling up, the Elasticsearch cluster should be able to temporarily push back on the load it receives if the increased load is otherwise going to cause cluster instability issues. The increase in the indexing load can manifest itself in the cluster requiring more resources, i.e., CPU, memory or disk. Elasticsearch has protection mechanisms that allows nodes to push back on the indexing load if any of these resources becomes a bottleneck. To handle indexing requests Elasticsearch uses dedicated thread pools sized based on the number of cores available to the node. If the increased indexing load results in CPU or other resources becoming a bottleneck, incoming indexing requests are queued. The maximum size of this queue is limited and any request arriving at the node when the queue is full will be rejected with a 429 HTTP code. Elasticsearch also keeps track of the required memory to address ongoing indexing requests and rejects incoming requests (with a 429) if the indexing buffer grows beyond 10% of the available heap memory . This limits the memory used for indexing and ensures the node will not go out of memory. The Elastic Cloud Serverless offering relies on the object store as the main storage for indexed data. The local disk on the nodes are used temporarily to hold indexed data. Periodically, Elasticsearch uploads the indexed data to the object store which allows freeing up the local disk space as we rely on the object store for durability of the indexed document. Nonetheless, under high indexing load, it is possible for the node to run out of disk space before the periodic upload task gets a chance to run and free up the local disk space. To handle these cases, Elasticsearch monitors the available local disk space and if necessary throttles the indexing activity while it attempts to free up space by enforcing an upload to the object store rather than waiting for the periodic upload to take place. Note that this throttling in turn results in queueing of the incoming indexing requests. These protection mechanisms allow an Elasticsearch cluster to temporarily reject requests and provide the client with a response that indicates that the cluster is overloaded while the cluster tries to scale up. This push-back signal from Elasticsearch provides the client with a chance to react by reducing the load if possible or retrying the request which should eventually succeed if retried when the cluster is scaled up. Metrics The two metrics that are used for ingest autoscaling in Elasticsearch are ingestion load and memory. Ingestion load Ingestion load represents the number of threads that is needed to cope with the current indexing load. The autoscaling metrics API exposes a list of ingestion load values, one for each indexing node. 
Note that as the write thread pools (which handle indexing requests) are sized based on the number of CPU cores on the node, this essentially determines the total number of cores that is needed in the cluster to handle the indexing workload. The ingestion load on each indexing node consists of two components: Thread pool utilization: the average number of threads in the write thread pool processing indexing requests during that sampling period. Queued ingestion load: the estimated number of threads needed to handle queued write requests. The ingestion load of each indexing node is calculated as the sum of these two values for all three write thread pools. The total ingestion load of the Elasticsearch cluster is the sum of the ingestion load of the individual nodes: node_ingestion_load = Σ(thread_pool_utilization + queued_ingestion_load), and total_ingestion_load = Σ(node_ingestion_load). Figure 2: ingestion load components. The thread pool utilization is an exponentially weighted moving average (EWMA) of the number of busy threads in the thread pool, sampled every second. The EWMA of the sampled thread pool utilization values is configured such that the sampled values of the past 10 seconds have the most effect on the thread pool utilization component of the ingestion load and samples older than 60 seconds have very negligible impact. To estimate the resources required to handle the queued indexing requests in the thread pool, we need to have an estimate for how long each queued task can take to execute. To achieve this, each thread pool also provides an EWMA of the request execution time. The request execution time for an indexing request is the (wall-clock) time taken for the request to finish once it is out of the queue and a worker thread starts executing it. As some queueing is acceptable and should be manageable by the thread pool, we try to estimate the resources needed to handle the excess queueing. We consider up to 30s worth of tasks in the queue manageable by the existing number of workers and account for an extra thread proportional to this value. For example, if the average task execution time is 200ms, we estimate that each thread is able to handle 150 indexing requests within 30s, and therefore account for one extra thread for each 150 queued items: queued_ingestion_load = (queue_size × average_request_execution_time) / 30s. Note that since the indexing nodes rely on pushing indexed data into the object store periodically, we do not need to scale the indexing tier based on the total size of the indexed data. However, the disk IO requirements of the indexing workload need to be considered for the autoscaling decisions.
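To make the arithmetic concrete, here is a toy Python translation of the two formulas above. It is purely illustrative and is not Elasticsearch's internal implementation:

# Toy illustration of the ingestion load formulas, not Elasticsearch's actual code.
def queued_ingestion_load(queue_size: int, avg_execution_time_s: float) -> float:
    # One extra thread for every 30 seconds' worth of queued work.
    return (queue_size * avg_execution_time_s) / 30.0

def node_ingestion_load(write_pools: list[dict]) -> float:
    # Utilization EWMA plus the excess-queue estimate, summed over the write pools.
    return sum(
        pool["utilization_ewma"]
        + queued_ingestion_load(pool["queue_size"], pool["avg_execution_time_s"])
        for pool in write_pools
    )

# A pool averaging 3 busy threads with 150 requests queued at 200ms each
# contributes 3 + (150 * 0.2) / 30 = 4 "needed" threads.
print(node_ingestion_load([
    {"utilization_ewma": 3.0, "queue_size": 150, "avg_execution_time_s": 0.2}
]))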
The ingestion load represents both CPU requirements of the indexing nodes as well as disk IO since both CPU and IO work is done by the write thread pool workers and we rely on the wall clock time to estimate the required time to handle the queued requests. Each indexing node calculates its ingestion load and publishes this value to the master node periodically. The master node serves the per node ingestion load values via the autoscaling metrics API to the autoscaler. Memory The memory metrics exposed by the autoscaling metrics API are node memory and tier memory. The node memory represents the minimum memory requirement for each indexing node in the cluster. The tier memory metric represents the minimum total memory that should be available in the indexing tier. Note that these values only indicate the minimum to ensure that each node is able to handle the basic indexing workload and hold the cluster and indices metadata, while ensuring that the tier includes enough nodes to accommodate all index shards. Node memory must have a minimum of 500MB to be able to handle indexing workloads , as well as a fixed amount of memory per each index . This ensures all nodes can hold metadata for the cluster, which includes metadata for every index. Tier memory is determined by accounting for the memory overhead of the field mappings of the indices and the amount of memory needed for each open shard allocated on a node in the cluster. Currently, the per-shard memory requirement uses a fixed estimate of 6MB. We plan to refine this value. The estimate for the memory requirements for the mappings of each index is calculated by one of the data nodes that hosts a shard of the index. The calculated estimates are sent to the master node. Whenever there is a mapping change this estimate is updated and published to the master node again. The master node serves the node and total memory metrics based on these information via the autoscaling metrics API to the autoscaler. Scaling the cluster The autoscaler is responsible for monitoring the Elasticsearch cluster via the exposed metrics, calculating the desirable cluster size to adapt to the indexing workload, and updating the deployment accordingly. This is done by calculating the total required CPU and memory resources based on the ingestion load and memory metrics. The sum of all the ingestion load per node values determines the total number of CPU cores needed for the indexing tier. The calculated CPU requirement and the provided minimum node and tier memory resources are mapped to a predetermined set of cluster sizes. Each cluster size determines the number of nodes and the CPU, memory and disk size of each node. All nodes within a certain cluster size have the same hardware specification. There is a fixed ratio between CPU, memory and disk, thus always scaling all 3 resources linearly. The existing cluster sizes for the indexing tier are based on node sizes starting from 4GB/2vCPU/100GB disk to 64GB/32vCPU/1600GB disk. Once the Elasticsearch cluster scales up to the largest node size (64GB memory), any further scale-up adds new 64GB nodes, allowing a cluster to scale up to 32 nodes of 64GB. Note that this is not a hard upper bound on the number of Elasticsearch nodes in the cluster and can be increased if necessary. Every 5 seconds the autoscaler polls metrics from the master node, calculates the desirable cluster size and if it is different from the current cluster size, it updates the Elasticsearch Kubernetes Deployment accordingly. 
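As a rough illustration of the sizing decision just described, the toy function below maps a total ingestion load (in cores) and a minimum node memory to a node size and node count. Only the 4GB/2vCPU and 64GB/32vCPU endpoints of the ladder come from the text; the intermediate sizes and the selection logic are simplified assumptions that ignore tier memory and headroom:

# Hypothetical size ladder: only the smallest and largest entries are from the text.
import math

NODE_SIZES = [(4, 2), (8, 4), (16, 8), (32, 16), (64, 32)]  # (memory_gb, vcpus)

def desired_indexing_tier(total_ingestion_load: float, min_node_memory_gb: float) -> dict:
    # Prefer a single node that covers both the CPU (ingestion load) and memory needs.
    for memory_gb, vcpus in NODE_SIZES:
        if vcpus >= total_ingestion_load and memory_gb >= min_node_memory_gb:
            return {"node_memory_gb": memory_gb, "node_vcpus": vcpus, "nodes": 1}
    # Otherwise scale out with the largest node size (up to 32 nodes of 64GB).
    memory_gb, vcpus = NODE_SIZES[-1]
    nodes = min(32, math.ceil(total_ingestion_load / vcpus))
    return {"node_memory_gb": memory_gb, "node_vcpus": vcpus, "nodes": nodes}

print(desired_indexing_tier(total_ingestion_load=40.0, min_node_memory_gb=4))
# -> {'node_memory_gb': 64, 'node_vcpus': 32, 'nodes': 2}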
Note that the actual reconciliation of the deployment towards the desired cluster size and adding and removing the Elasticsearch nodes to achieve this is done by Kubernetes. In order to avoid very short-lived changes to the cluster size, we account for a 10% headroom when calculating the desired cluster size during a scale down and a scale down takes effect only if all desired cluster size calculations within the past 15 minute have indicated a scale-down. Currently, the time that it takes for an increase in the metrics to lead to the first Elasticsearch node being added to the cluster and ready to process indexing load is under 1 minute. Conclusion In this blog post, we explained how ingest autoscaling works in Elasticsearch, the different components involved, and the metrics used to quantify the resources needed to handle the indexing workload. We believe that such an autoscaling mechanism is crucial to reduce the operational overhead of an Elasticsearch cluster for the users by automatically increasing the available resources in the cluster when necessary. Furthermore, it leads to cost reduction by scaling down the cluster when the available resources in the cluster are not required anymore. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Ingest autoscaling overview Metrics Ingestion load Memory Scaling the cluster Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ingest autoscaling in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-ingest-autoscaling","meta_description":"Learn more about how Elasticsearch autoscales to address ingestion load."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Resolving high CPU usage issues in Elasticsearch with AutoOps How AutoOps pinpointed and resolved high CPU usage in an Elasticsearch cluster: A step-by-step case study. AutoOps MD By: Musab Dogan On December 18, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this article, we’ll dive into a real-world scenario where AutoOps was instrumental in diagnosing and addressing high CPU usage in a production Elasticsearch cluster. The incident began with a customer support ticket and ended with actionable insights to ensure smoother operations in the future. Introduction: Diagnosing High CPU usage in Elasticsearch Efficiently managing Elasticsearch clusters is crucial for maintaining application performance and reliability. When a customer experiences sudden performance bottlenecks, the ability to quickly diagnose the issue and provide actionable recommendations becomes a key differentiator. This review explores how AutoOps, a powerful monitoring and management tool, helped us identify and analyze a high CPU utilization issue affecting an Elasticsearch cluster. The article provides a step-by-step account of how AutoOps identified the root cause, along with the benefits this tool offers in streamlining the investigation process. The high CPU situation On July 14, 2024, a production cluster named “Palomino” experienced an outage. The customer reported the issue the next day, citing high CPU usage as a potential root cause. Despite the issue being non-urgent (as the outage was resolved), understanding the underlying cause remained critical for preventing recurrence. The initial request was as follows: The investigation began with one keyword in mind: high CPU usage . Using AutoOps for diagnosing high CPU usage Step 1: Analyzing AutoOps events AutoOps immediately flagged multiple “High CPU Utilization” events. Clicking on an event provided comprehensive details, including: When the event started and ended. The node experiencing the most pressure. Initial recommendations, such as enabling search slow logs. While the suggestion to enable slow logs was noted, we continued exploring for a deeper root cause. If you want to activate search slowlogs, you can use this link . Step 2: Node view analysis Focusing on the node with the highest CPU pressure (instance-0000000008), AutoOps filtered the graphs to highlight metrics specific to that node during the event window. This view confirmed significant CPU usage spikes. Step 3: Broader investigation By zooming out to analyze a larger time range, we observed that the CPU increase coincided with a rise in both search and indexing requests. Expanding the view further revealed that the issue was not limited to one node but affected all nodes in the cluster. 
Step 4: Identifying patterns The investigation revealed a critical pattern: a regular spike around 7:00 AM each day , triggered by simultaneous search and indexing requests. This repetitive behavior was the root cause of the high CPU utilization. Step 5: Actionable insights AutoOps provided three critical questions to ask the customer: What is happening every day at 7:40 AM (GMT+3)? Can these requests be distributed more evenly over time to decrease pressure? Have you monitored the CPU graph (AutoOps > Node View > Host and Process > CPU) at 7:00 AM after implementing changes? Finding the root cause of problems generally takes 90% of the time, while fixing the problem takes 10%. Thanks to AutoOps, we were able to handle this 90% more easily and much faster. Hint: To find the problematic query AutoOps plays a crucial role. It will help you find where the problematic query/indexing runs, eg, on which node, shard, and index. Also, thanks to the long_running_search_task event, without any manual effort, AutoOps can identify the problematic query and create an event with a recommended approach to fine-tune the query. Benefits of using AutoOps Rapid Identification: AutoOps’ event-based monitoring pinpointed the affected node and time range within minutes. Clear Recommendations: Suggestions like enabling slow logs will help focus on troubleshooting efforts. Pattern Recognition: By correlating metrics across all nodes and timeframes, AutoOps uncovered the recurring nature of the issue. User-Friendly Views: Filtering and zooming capabilities made it easy to visualize trends and anomalies. Conclusion Thanks to AutoOps, we transformed a vague report of high CPU usage into a clear, actionable plan. By identifying a recurring pattern of activity, we provided the customer with the tools and insights to prevent similar issues in the future. If your team manages production systems, incorporating tools like AutoOps into your workflow can significantly enhance visibility and reduce the time to resolve critical issues. Report an issue Related content AutoOps January 2, 2025 Leveraging AutoOps to detect long-running search queries Learn how AutoOps helps you investigate long-running search queries plaguing your cluster to improve search performance. VC By: Valentin Crettaz AutoOps How To November 20, 2024 Hotspotting in Elasticsearch and how to resolve them with AutoOps Explore hotspotting in Elasticsearch and how to resolve it using AutoOps. SF By: Sachin Frayne AutoOps Elastic Cloud Hosted November 6, 2024 AutoOps makes every Elasticsearch deployment simple(r) to manage AutoOps for Elasticsearch significantly simplifies cluster management with performance recommendations, resource utilization and cost insights, real-time issue detection and resolution paths. ZS OS By: Ziv Segal and Ori Shafir Jump to Introduction: Diagnosing High CPU usage in Elasticsearch The high CPU situation Using AutoOps for diagnosing high CPU usage Step 1: Analyzing AutoOps events Step 2: Node view analysis Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Resolving high CPU usage issues in Elasticsearch with AutoOps - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-cpu-usage-high","meta_description":"Learn how to diagnose and fix the Elasticsearch high CPU usage issue. We'll use AutoOps to pinpoint & resolve the issue and gain insights for prevention."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Embeddings and reranking with Alibaba Cloud AI Service Using Alibaba Cloud AI Service features with Elastic. Generative AI Integrations How To TM By: Tomás Murúa On February 26, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, we'll cover how to integrate Alibaba Cloud AI features with Elasticsearch to improve relevance in semantic searches. Alibaba Cloud AI Search is a solution that integrates advanced AI features with Elasticsearch tools, by leveraging the Qwen LLM family to contribute with advanced models for inference and classification. In this article, we'll use descriptions of novels and plays written by the same author to test the Alibaba reranking and sparse embedding endpoints. Steps Configure Alibaba Cloud AI Create Elasticsearch mappings Index data into Elasticsearch Query data Bonus: Answering questions with completion Configure Alibaba Cloud AI Alibaba Cloud AI reranking and embeddings Open inference Alibaba Cloud offers different services. In this example, we'll use the descriptions of popular books and plays by Agatha Christie to test Alibaba Cloud embeddings and reranking endpoints in semantic search. The Alibaba Cloud AI reranking endpoint is a semantic reranking functionality. This type of reranking uses a machine learning model to reorder search results based on their semantic similarity to a query. This allows you to use out-of-the-box semantic search capabilities on existing full-text search indices. The sparse embedding endpoint is a type of embedding where most values are zero, making relevant information more prominent. Get Alibaba Cloud API Key We need a valid API Key to integrate Alibaba with Elasticsearch. To get it, follow these steps: Access the Alibaba Cloud portal from the Service Plaza section. Go to the left menu API Keys as shown below. Generate a new API Key. Configure Alibaba Endpoints We´ll first configure the sparse embedding endpoint to transform the text descriptions into semantic vectors: Embeddings endpoint: We´ll then configure the rerank endpoint to reorganize results. Rerank Endpoint: Now that the endpoints are configured, we can prepare the Elasticsearch index. Create Elasticsearch mappings Let's configure the mappings . For this, we need to organize both the texts with the descriptions as well as the model-generated vectors. 
We'll use the following properties: semantic_description : to store the embeddings generated by the model and run semantic searches. description : we'll use a \" text \" type to store the novels and plays’ descriptions and use them for full-text search. We'll include the copy_to parameter so that both the text and the semantic field are available for hybrid search: With the mappings ready, we can now index the data. Index data into Elasticsearch Here's the dataset with the descriptions that we'll use for this example. We'll index it using the Elasticsearch Bulk API . Note that the first two documents, “Black Coffee” and “The Mousetraps” are plays while the others are novels. Query data To see the different results we can get, we'll run different types of queries, starting with semantic query, then applying reranking, and finally using both. We'll use the same question \"Which novel was written by Agatha Christie?\" expecting to get the three documents that explicitly say novel, plus the one that says book. The two plays should be the last results. Semantic search We'll begin querying the semantic_text field to ask: \"Which novel was written by Agatha Christie?\" Let's see what happens: Response: In this case, the response prioritized most of the novels, but the document that says “book” appears last. We can still further refine the results with reranking. Refining results with Reranking In this case, we'll use a _inference/rerank request to assess the documents we got in the first query and improve their rank in the results. Response: The response here shows that both plays are now at the bottom of the results. Semantic search and reranking endpoint combined Using a retriever , we'll combine the semantic query and reranking in just one step: Response: The results here differ from the semantic query. We can see that the document with no exact match for \"novel\" but that says “book” ( The Murder of Roger Ackroyd) appears higher than in the first semantic search. Both plays are still the last results, just like with reranking. Bonus: Answering questions with completion With embeddings and reranking we can satisfy a search query, but still, the user will see all the search results and not the actual answer. With the examples provided, we are one step away from a RAG implementation, where we can provide the top results + the question to an LLM to get the right answer. Fortunately, Alibaba Cloud AI Service also provides an endpoint service we can use to achieve this purpose. Let’s create the endpoint Completion Endpoint: And now, send the results and question from the previous query: Query Response Conclusion Integrating Alibaba Cloud AI Search with Elasticsearch allows us to easily access completion, embedding, and reranking models to incorporate them into our search pipeline. We can use the reranking and embedding endpoints, either separately or together, with the help of a retriever. We can also introduce the completion endpoint to finish up a RAG end-to-end implementation. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Configure Alibaba Cloud AI Alibaba Cloud AI reranking and embeddings Get Alibaba Cloud API Key Configure Alibaba Endpoints Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Embeddings and reranking with Alibaba Cloud AI Service - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/alibaba-cloud-ai-embeddings-reranking","meta_description":"Learn how to use Alibaba Cloud AI features with Elastic, including embeddings and reranking."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch Learn how to build a scalable data pipeline for unstructured documents using NeMo Retriever, Unstructured Platform, and Elasticsearch for RAG applications. Integrations AG By: Ajay Krishnan Gopalan On May 8, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, we will discuss how to implement a scalable data processing pipeline using NVIDIA NeMo Retriever extraction models, Unstructured Platform and Elasticsearch. This pipeline transforms unstructured data from a data source into structured, searchable content ready for downstream AI applications, such as RAG. Retrieval Augmented Generation (RAG) is an AI technique where Large Language Models (LLMs) are provided with external knowledge to generate responses to user queries. This allows LLM responses to be tailored to specific context, making answers more accurate and relevant. Before we get started, let’s take a look at the key components enabling this pipeline and what each brings to the table. 
Pipeline components NeMo Retriever extraction is a set of microservices for transforming unstructured documents into structured content and metadata. It handles document parsing, visual structure identification, and OCR processing at scale. The RAG NVIDIA AI Blueprint provides a starting point for how to use the NeMo Retriever microservices in a high-performance extraction pipeline. Unstructured is an ETL+ platform for orchestrating the entirety of unstructured data processing: from ingesting unstructured data from multiple data sources, converting raw, unstructured files into structured data through a configurable workflow engine, enriching data with additional transformations, all the way to uploading the results into vector stores, databases and search engines. It provides a visual UI, APIs, and scalable backend infrastructure to orchestrate document parsing, enrichment, and embedding in a single workflow. Elasticsearch is an industry-leading search and analytics engine that now includes native vector search capabilities. It can function as both a traditional text database and a vector database, enabling semantic search at scale with features like k-NN similarity search. Now that we’ve introduced the core components, let’s take a look at how they work together in a typical workflow before diving into the implementation. RAG with NeMo Retriever - Unstructured - Elasticsearch While here we only provide key highlights, you can find the full notebook here . This blog can be divided into 3 parts: Setting up the source and destination connectors Setting up the workflow with Unstructured API RAG over the processed data Unstructured workflow is represented as a DAG where the nodes, called connectors, control where the data is ingested from and where the processed results are uploaded to. These nodes are required in any workflow. A source connector configures ingestion of the raw data from a data source, and the destination connector configures the data uploading of the processed data into a vector store, search engine, or a database. For this blog, we store research papers in Amazon S3 and we want the processed data to be delivered into Elasticsearch for downstream use. This means that before we can build a data processing workflow, we need to create a source connector for Amazon S3, and a destination connector for Elasticsearch with Unstructured API. Step 1: Setting up the S3 source connector When creating a source connector, you need to give it a unique name, specify its type (e.g. S3 , or Google Drive ), and provide the configuration which typically contains the location of the source you're connecting to (e.g. S3 bucket URI, or Google Drive folder) and authentication details. Step 2: Setting up the Elasticsearch destination connector Next, let’s set up the Elasticsearch destination connector. The Elasticsearch index that you use must have a schema that is compatible with the schema of the documents that Unstructured produces for you—you can find all the details in the documentation . Step 3: Creating a workflow with Unstructured Once you have the source and destination connectors, you can create a new data processing workflow. 
We’ll build the workflow DAG with the following nodes: NeMo Retriever for document partitioning Unstructured’s Image Summarizer, Table Summarizer, and Named Entity Recognition nodes for content enrichment Chunker and Embedder nodes for making the content ready for similarity search Once your job for this workflow completes, the data is uploaded into Elasticsearch and we can proceed with building a basic RAG application. Step 4: RAG setup Let's go ahead with a simple retriever that will connect to the data, take in the user query, embed it with the same model that was used to embed the original data, and calculate cosine similarity to retrieve the top 3 documents. Then let's set up a workflow to receive a user query, fetch similar documents from Elasticsearch, and use the documents as context to answer the user’s question. Putting everything together we get: And a response: Elasticsearch provides various strategies to enhance search, including Hybrid search , a combination of approximate semantic search and keyword-based search. This approach can improve the relevance of the top documents used as context in the RAG architecture. To enable it, you need to modify the vector_store initialization as follows: Conclusion Good RAG starts with well-prepared data, and Unstructured simplifies this critical first step. By enabling partitioning with NeMo Retriever, metadata enrichment of unstructured data and efficient ingestion into Elasticsearch, it ensures that your RAG pipeline is built on a solid foundation, unlocking its full potential for all your downstream tasks. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Jump to Pipeline components RAG with NeMo Retriever - Unstructured - Elasticsearch Step 1: Setting up the S3 source connector Step 2: Setting up the Elasticsearch destination connector Step 3: Creating a workflow with Unstructured Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Unstructured data processing with NVIDIA NeMo Retriever, Unstructured, and Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/unstructured-data-processing-with-nvidia-nemo-retriever-unstructured-and-elasticsearch","meta_description":"Unstructured data processing with NV‑Ingest, Unstructured & Elastic"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. Integrations Ingestion How To JR TM By: Jeffrey Rengifo and Tomás Murúa On February 3, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. Elastic connectors make it easy to index and combine data from different sources to run unified searches. With the addition of Playground you can set up a knowledge base that you can chat with and ask questions. Connectors are a type of Elastic integration that are helpful for syncing data from different sources to an Elasticsearch index. In this article, we'll see how to index a Confluence Wiki using the Elastic connector, configure an index to run semantic queries, and then use Playground to chat with your data. Steps Configure the connector Preparing the index Chat with data using Playground Configure the connector In our example, our Wiki works as a centralized repository for a hospital and contains info on: Doctors' profiles: speciality, availability, contact info. Patients' files: Medical records and other relevant data. Hospital guidelines: Policies, emergency protocols and instructions for staff. We'll index the content from our Wiki using the Elasticsearch-managed Confluence connector . The first step is to get your Atlassian API Key : Configuring the Confluence native connector You can follow the steps here to guide you through the configuration: Access your Kibana instance and go to Search > Connectors Click on add a connector and select Confluence from the list. Name the new connector \"hospital\". Then click on the create new Index button. Click on edit configurations and, for this example, we need to modify the data source for \"confluence cloud\". The required fields are: Confluence Cloud account email API Key Confluence URL label Save the configuration and go to the next step. By default, the connector will index: Pages Spaces Blog Posts Attachments To make sure to only index the wiki, you need to use an advanced filter rule to include only pages inside the space named \"Hospital Health\" identified as \"HH\". You can check out additional examples here . Now, let's run a Full Content Sync to index our wiki. Once completed, we can check the indexed documents on the tab \"Documents\". 
Preparing the index With what we have so far, we could run full text queries on our content. Since we want to make questions instead of looking for keywords, we now need to have semantic search. For this purpose we will use Elasticsearch ELSER model as the embeddings provider. To configure this, use the Elasticsearch's inference API . Go to Kibana Dev Tools and copy this code to start the endpoint: Now the model is loading in the background. You might get a 502 Bad Gateway error if you haven't used the ELSER model before. To make sure the model is loading, check Machine Learning > Trained Models: Let's add a semantic_text field using the UI. Go to the connector's page, select Index mappings, and click on Add Field. Select \"Semantic text\" as field type. For this example, the reference field will be \"body\" and the field name content_semantic. Finally, select the inference endpoint we've just configured. Before clicking on \"Add field\", check that your configuration looks similar to this: Now click on \"Save mapping\": One you've ran the Full Content Sync from the UI, let's check it's ok by running a semantic query: The response should look something like this: Chat with your data using Playground What is Playground? Playground is a low code platform hosted in Kibana that allows you to easily create a RAG application and ask questions to your indices, regardless if they have embeddings. Playground not only provides a UI chat with citations and provides full control over the queries, but also handles different LLMs to synthesize the answers. You can read this article for a deeper insight and test the online demo to familiarize yourself with it. Configure Playground To begin, you only need the credentials for any of the compatible models : OpenAI (or any local model compatible with OpenAI API) Amazon Bedrock Google Gemini When you open Playground, you have the option to configure the LLM provider and select the index with the documents you want to use as knowledge base. For this example, we'll use OpenAI. You can check this link to learn how to get an API key . Let's create our OpenAI connector by clicking Connect to an LLM > OpenAI and let's fill in the fields as in the image below: To select the index we created using the Confluence connector, click on \"Add data sources\" and click on the index. NOTE: You can select more than one index, if you want. Now that we're done configuring, we can start making questions to the model. Aside from choosing to include citations with the source document in your answers, you can also control which fields to send to the LLM to use in search. The View Code window provides the python code you need to integrate this into your apps. Conclusion In this article, we learned that we can use connectors both to search for information in different sources as well as a knowledge base using Playground. We also learned to easily deploy a RAG application to chat with your data without leaving the Elastic environment. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Configure the connector Configuring the Confluence native connector Preparing the index Chat with your data using Playground Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Playground: Using Elastic connectors to chat with your data - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/playground-connectors-data-chat","meta_description":"Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Lucene Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. 
TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Lucene January 3, 2025 Lucene Wrapped 2024 2024 has been another major year for Apache Lucene. In this blog, we’ll explore the key highlights. CH By: Chris Hegarty Lucene December 27, 2024 Lucene bug adventures: Fixing a corrupted index exception Sometimes, a single line of code takes days to write. Here, we get a glimpse of an engineer's pain and debugging over multiple days to fix a potential Apache Lucene index corruption. BT By: Benjamin Trent Vector Database Lucene December 4, 2024 Smokin' fast BBQ with hardware accelerated SIMD instructions How we optimized vector comparisons in BBQ with hardware accelerated SIMD (Single Instruction Multiple Data) instructions. CH By: Chris Hegarty Vector Database Lucene +1 November 18, 2024 Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. BT By: Benjamin Trent 1 2 3 Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Lucene - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/lucene","meta_description":"Lucene articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Semantic reranking in Elasticsearch with retrievers Explore strategies for using semantic reranking to boost the relevance of top search results, including semantic reranking with retrievers. Vector Database Search Relevance How To AD NC By: Adam Demjen and Nick Chow On May 28, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This guide explores strategies for using semantic reranking to boost the relevance of top search results, including as direct inference call, in the context of a search experience, or as part of a simplified search flow with retrievers. Before diving into the details, let's explain what semantic reranking is and why it is important. What is semantic reranking? Semantic reranking is a method that allows us to utilize the speed and efficiency of fast retrieval methods while layering semantic search on top of it. It also lets us immediately add semantic search capabilities to existing Elasticsearch installations out there. 
With the advancement of machine learning-powered semantic search we have more and more tools at our disposal for finding matches quickly from millions of documents. However, like cramming for a final exam, optimizing for speed means making some tradeoffs, and that usually comes at a loss in fidelity. To offset this, we see some tools emerging and becoming increasingly available on the other side of the gradient. These are much slower, but can tell how closely a document matches a query with much more accuracy. To explain some key terms: reranking is the process of reordering a set of retrieved documents in order to improve search relevance. In semantic reranking this is done with the help of a reranker machine learning model, which calculates a relevance score between the input query and each document. Rerankers typically operate on the top K results, a narrowed-down window of relevant candidates fulfilling the search query, since reranking a large list of documents would be extremely costly. Why is semantic reranking important? Semantic reranking is an important refinement layer for search users for a couple of reasons. First, users are expecting more from their search, where the right result isn't in the top ten hits or in the first page, but is the top answer . It's like that old search joke - the best place to hide a secret is in the second page of search results. Except today it's even more narrow: anything below the top one, two, or maybe three results will likely get discarded. This applies even more so for RAG (Retrieval Augmented Generation) - those Generative AI use cases need a tight context window. The best document could be the 4th result, but if you're only feeding in the top three, you aren't going to get the right answer, and the model could hallucinate. On top of that, Generative AI use cases work best with an effective cutoff . You could define a minimum score or count up to which the results are considered \"good\", but this is hard to do without consistent scoring. Semantic reranking solves these problems by reordering the documents so that the most relevant ones come out on top. It provides usable, normalized and well-calibrated scores, so you can measure how closely your results match your query. So you more reliably get much more accurate top results to feed to your large language model, and you can cut off results if there's a big dropoff in score in the top K hits to prevent hallucinations. How do we perform semantic reranking? The rerank inference type Elastic recently introduced inference endpoints and related APIs . This feature allows us to use certain services, such as built-in or 3rd party machine learning models, to perform inference tasks. Supported inference tasks come in various shapes - for example a sparse_embedding task is where an ML model (such as ELSER) receives some text and generates a weighted set of terms, whereas a text_embedding task creates vector embeddings from the input. Elastic Serverless - and the upcoming 8.14 release - adds a new task type: rerank . In the first iteration rerank supports integrating with Cohere 's Rerank API. This means you can now create an inference endpoint in Elastic, supply your Cohere API key, and enjoy semantic reranking out of the box! Let's see that in action with an example taken from the Cohere blog . Assuming you have set up your rerank inference endpoint in Elastic with the Cohere Rerank v3 model, we can pass a query and an array of input text. 
As we can see, the short passages all relate to the word \"capital\", but not necessarily to the meaning of the location of the seat of government, which is what the query is looking for: The rerank task responds with an array of scores and document indices: The topmost entry tells us that the highest relevance score of 99.8% is the 4th document ( \"index\": 3 with zero-based indexing) of the original list, a.k.a. \"Washington, D.C. ...\" . The rest of the documents are semantically less relevant to the original query. This reranking inference step is an important puzzle piece of an optimized search experience, and now we are ready to place it in the puzzle board! Reranking search results today - through your application One way of harnessing the power of semantic reranking is to implement a workflow like this in a search application: A user enters a query in your app's UI. The search engine component retrieves a set of documents that match this query. This can be done using any retrieval strategy: lexical (BM25), vector search (e.g. kNN) or a method that combines the two, such as RRF. The application takes the top K documents, extracts the text field we are querying against from each document, then sends this list of texts to the rerank inference endpoint, which is configured to use Cohere. The inference endpoint passes the documents and the query to Cohere. The result is a list of scores and indices to match each score. Your app takes these scores, assigns them to the documents, and reorders them by this score in a descending order. This effectively moves the semantically most relevant documents to the top. If this flow is used in RAG to provide some sources to a generative LLM (such as summarizing an answer), then you can rest assured it will work with the right context and provide answer. This works great, but it involves many steps, data massaging, and a complex processing logic with many moving parts. Can we simplify this? Reranking search results tomorrow - with retrievers Let's spend a minute talking about retrievers. Retriever is a new type of abstraction in the _search API, which is more than just a simple query. It's a building block for an end-to-end search flow for fetching hits and potentially modifying the documents' scores and their order. Retrievers can be used in a pipeline pattern, where each retriever unit does something different in the search process. For example we can configure a first-stage retriever to fetch documents, pass the results to a second-stage retriever to combine with other results, trim the number of candidates etc. As a final stage, a retriever can update the relevance score of documents. Soon we'll be adding new reranking capabilities with retrievers, text similarity reranker retriever being the first one. This will perform reranking on top K hits by calling a rerank inference endpoint. The workflow will be simplified into a single API call that hides all the complexity! This is what the previously described multi-stage workflow looks like as a single retriever query: The text_similarity_reranker retriever is configured with the following details: Nested retriever Reranker inference configuration Additional controls, such as minimum score cutoff for eliminating irrelevant hits Below is an example text_similarity_reranker query. Let's dissect it to understand each part better! The request defines a retriever query as the root property. The outermost retriever will execute last, in this case it's a text_similarity_reranker . 
It specifies a standard first-stage retriever, which is responsible for fetching some documents. The standard retriever accepts an Elasticsearch query , which is a BM25 match in the example. The text similarity reranker is pointed at the text document field that contains the text for semantic reranking. The top 100 documents will be sent for reranking to the cohere-rerank-v3-model rerank inference endpoint we have configured with Cohere. Only those documents will be returned that receive at least 60% relevance score in the reranking process. The response is the exact same structure as that of a search query. The _score property is the semantic relevance score from the reranking process, and _rank refers to the ranking order of documents. Semantic reranking with retrievers will be available shortly in a coming Elastic release. Conclusion Semantic reranking is an incredibly powerful tool for boosting the performance of a search experience or a RAG tool. It can be used as direct inference call, in context of a search experience, or as part of a simplified search flow with retrievers. Users can pick and choose the set of tools that work best for their use case and context. Happy reranking! 😃 Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to What is semantic reranking? Why is semantic reranking important? How do we perform semantic reranking? The Reranking search results today - through your application Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. 
Elasticsearch B.V. All Rights Reserved.","title":"Semantic reranking in Elasticsearch with retrievers - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/semantic-reranking-with-retrievers","meta_description":"Explore strategies for using semantic reranking to boost the relevance of top search results, including semantic reranking with retrievers."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elastic Rerank: Elastic's new semantic re-ranker model Learn about how Elastic's new re-ranker model was trained and how it performs. ML Research TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou On November 25, 2024 Part of Series Semantic reranking & the Elastic Rerank model Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our previous blog in this series we introduced the concept of semantic reranking. In this blog we're going to discuss the re-ranker we've trained and are releasing in technical preview. Introduction One of our purposes at Elastic is lowering the bar to achieve high quality text search. Building on top of Lucene, Elasticsearch provides a rich set of scalable well optimized full-text retrieval primitives including lexical retrieval with BM25 scoring, learned sparse retrieval and a vector database. We've recently introduced the concept of retrievers to the search API, which allow for composable operations including semantic reranking. We're also working towards introducing advanced search pipelines to ES|QL . Starting with our serverless offering, we're releasing in technical preview the Elastic Rerank model. This is a cross-encoder reranking model. Over time we plan to integrate it with our full product suite and provide optimized versions to run on ML nodes in any cluster; exactly as we do for our retrieval model . We've also been working on some exciting new inference capabilities, which will be ideally suited to reranking workloads, so expect further announcements. This first version targets English text reranking and provides some significant advantages in terms of quality as a function of inference cost compared to the other models we evaluated. In this blog post we will discuss some aspects of its architecture and training. But first… How does it compare? In our last blog , we discussed how lexical retrieval with BM25 scoring or (BM25 for short) represents an attractive option in cases where the indexing costs using sparse or dense models would be very high. However, newer methodologies tend to give very significant relevance improvements compared to BM25, particularly for more complex natural language queries. As we've discussed before, the BEIR suite is a high quality and widely used benchmark for English retrieval. It is also used by the MTEB benchmark to assess retrieval quality of text embeddings. It includes various tasks, including open-domain Question Answering (QA) where BM25 typically struggles. Since BM25 represents a cost effective first stage retriever, it is interesting to understand to what extent we can use reranking to “fix up” its relevance as measured by BEIR. In our next blog, we're going to present a detailed analysis of the different high quality re-rankers we include in the table below. 
This includes more qualitative analysis of their behavior as well as some additional insights into their cost-relevance trade-offs. Here, we follow prior art and describe their effectiveness reranking the top 100 results from BM25 retrieval. This is fairly deep reranking and not something we would necessarily recommend for inference on CPU. However, as we'll show in our next blog, it provides a reasonable approximation of the uplift in relevance you can achieve from reranking. Model Parameter Count Average nDCG@10 BM25 - 0.426 MiniLM-L-12-v2 33M 0.487 mxbai-rerank-base-v1 184M 0.48 monoT5-large 770M 0.514 Cohere v3 n/a 0.529 bge-re-ranker-v2-gemma 2B 0.568 Elastic 184M 0.565 Average nDCG@10 for the BEIR reranking the top 100 BM25 retrieved documents To give a sense of the relative cost-relevance trade-off of the different models we've plotted this table below. Average nDCG@10 for the BEIR reranking the top 100 BM25 retrieved documents. Up and to the left is better For completeness we also show the individual dataset results for Elastic Rerank below. This represents an average improvement of 39% across the full suite. At the time of writing, this puts reranked BM25 around position 20 of all methods on the MTEB leaderboard . All more effective models use large embeddings, with at least 1024 dimensions, and significantly larger models (on average 30⨉ larger than Elastic Rerank). Dataset BM25 nDCG@10 Reranked nDCG@10 Improvement AguAna 0.47 0.68 44% Climate-FEVER 0.19 0.33 80% DBPedia 0.32 0.45 40% FEVER 0.69 0.89 37% FiQA-2018 0.25 0.45 76% HotpotQA 0.6 0.77 28% Natural Questions 0.33 0.62 90% NFCorpus 0.33 0.37 12% Quora 0.81 0.88 9% SCIDOCS 0.16 0.20 23% Scifact 0.69 0.77 12% Touche-2020 0.35 0.36 4% TREC-COVID 0.69 0.86 25% MS MARCO 0.23 0.42 85% CQADupstack (avg) 0.33 0.41 27% *nDCG@10 per dataset for the BEIR reranking the top 100 BM25 retrieved documents using the Elastic Rerank model. Architecture As we've discussed before , language models are typically trained in multiple steps. The first stage training takes randomly initialized model weights and trains on a variety of different unsupervised tasks such as masked token prediction. These pretrained models are then trained on further downstream tasks, such as text retrieval, in a process called fine-tuning. There is extensive empirical evidence that the pre-training process generates useful features which can be repurposed for new tasks in a process called transfer learning. The resulting models display significantly better performance and significantly reduced train time compared to training the downstream task alone. This technique underpins a lot of the post BERT successes of transformer based NLP. The exact pre-training methods and the model architecture affect downstream task performance as well. For our re-ranker we've chosen to train from a DeBERTa v3 checkpoint. This combines various successful ideas from the pre-training literature and has provided state-of-the-art performance as a function of model size on a variety of NLP benchmarks when fine-tuned. To very briefly summarize this model: DeBERTa introduced a disentangled positional and content encoding mechanism that allows it to learn more nuanced relationships between hidden representations of the content and position of other tokens in the sequence. We conjecture this is particularly important for reranking since matching words in the query and document text and comparing their semantics is presumably a key ingredient. 
DeBERTa v3 adopts the ELECTRA pre-training objective which, GAN style , tries to simultaneously train a model to supply effective fake tokens and learn to recognize those fakes. They also propose a small improvement to parameterisation of this process. If you're interested you can find the details here . For the first version, we trained the base variant of this model family. This has 184M parameters, but since its vocabulary is around 4⨉ the size of BERT's, the backbone is only 86M parameters, with 98M parameters used for the input embedding layer. This means the inference cost is comparable to BERT base. In our next blog we explore optimal strategies for budget constrained reranking. Without going into the details suffice to say we plan to train a smaller version of this model via distillation. Data sets and training Whenever you train a new task on a model there is always a risk that it forgets important information. Our first step training the Elastic Reranker was therefore to make the best attempt to extract relevance judgments from DeBERTa as it is. We use a standard pooling approach; in particular, we add a head that: Computes A ( D ( L ( h ( [ C L S ] ) ) ) A(D(L(h([CLS]))) A ( D ( L ( h ([ C L S ]))) where A A A is a GeLU activation, D D D is a dropout layer and L L L is a linear layer. In pre-training the [ C L S ] [CLS] [ C L S ] token representation, h ( [ C L S ] ) h([CLS]) h ([ C L S ]) , is used for a next sentence classification task. This is well aligned with relevance assessment so its a natural choice to use as the input to the head, Computes the weighted average of the output activations to score the query-document pair. We train the head parameters to convergence, freezing the rest of the model, on a subset of our full training data. This step updates the head to read out what useful information it can for relevance judgments from the pre-trained [ C L S ] [CLS] [ C L S ] token representation. Performing a two step fine-tune like this yielded around a 2% improvement in the final nDCG@10 on BEIR. It is typical to train ranking tasks with contrastive methods. Specifically, a query is compared to one relevant (or positive) and one or more irrelevant (or negative) documents and the model is trained to prefer the relevant one. Rather than using a purely contrastive loss, like maximizing the log probability of the positive document, a strong teacher model can be used to provide ground truth assessment of the relevance of the documents. This choice handles issues such as mislabeling of negative examples. It also provides significantly more information per query than just maximizing the log probability of the relevant document. To train our cross-encoder we use a teacher to supply a set of scores from which we compute a reference probability distribution of the positive and negative documents for each query using the softmax function as follows: P ( q , d ) = e s c o r e ( q , d ) ∑ d ′ ∈ p ∪ N e s c o r e ( q , d ′ ) P(q,d)=\\frac{e^{score(q,d)}}{\\sum_{d'\\in {p}\\cup N} e^{score(q,d')}} P ( q , d ) = ∑ d ′ ∈ p ∪ N ​ e score ( q , d ′ ) e score ( q , d ) ​ Here, q q q is the query text, p p p is the positive text, N N N is the set of negative texts, d ∈ p ∪ N d∈p∪N d ∈ p ∪ N and the s c o r e score score function is the output of the cross-encoder. We minimize the cross-entropy of our cross-encoder scores with this reference distribution. We also tried Margin-MSE loss, which worked well training ELSER, but found cross-entropy was more effective for the reranking task. 
This whole formulation follows because it is natural to interpret pointwise ranking as assessing the probability that each document is relevant to the query. In this case, minimizing cross-entropy amounts to fitting a probability model by maximum likelihood with the nice properties such an estimator confers. Compared to Margin-MSE, we also think we get gains because cross-entropy allows us to learn something about the relationship between all scores, since it is minimized by exactly matching the reference distribution. This is relevant because, as we discuss below, we train with more than one negative. For the teacher, we use a weighted average ensemble of a strong bi-encoder model and a strong cross-encoder model. We found the bi-encoder provides a more nuanced assessment of the negative examples, which we hypothesize is due to large batch training that contrasts millions of distinct texts per batch. However, the cross-encoder was better at differentiating the positive and negative examples. In fact, we expect there are further improvements to be made in this area. Specifically, for model selection we use a small but effective proxy to a diverse retrieval task and we plan to explore if it is beneficial to use black box optimization of our teacher on this task. The training dataset and negative sampling are critical for model quality. Our training dataset comprises a mixture of open QA datasets and datasets with natural pairs, like article heading and summary. We apply some basic cleaning and fuzzy deduplication to these. Using an open source LLM, we also generated around 180 thousand pairs of synthetic queries and passages with varying degrees of relevance. We used a multi-stage prompting strategy to ensure this dataset covers diverse topics and a variety of query types, such as keyword search, exact phrase matching and short and long natural language questions. In total, our training dataset contains around 3 million queries. It has been generally observed that quality can degrade with the reranking depth. Typically hard negative mining uses shallow sampling of retrieval results: it searches out the hardest negatives for each query. Document diversity increases with retrieval depth and we believe that typical hard negative mining therefore doesn't present the re-ranker with sufficient diversity. In particular, training must demonstrate adequate diversity in the relationship between the query and the negative documents. This flaw isn't solved by increasing overall query and document diversity; training must include negative documents from the deep tail of retrieval results. For this reason, we extract the top 128 documents for each query using multiple methods. We then sample five negatives from this pool of candidates using a probability distribution shaped by their scores. Using this many negatives per query is not typical; however, we found increasing the number of sampled negatives gave us a significant bump in final quality. A nice side effect of using a large and diverse negative set for each query is it should help model calibration. This is a process by which the model scores are mapped to a meaningful scale, such as an estimate of relevance probability. Well calibrated scores provide useful information for downstream processing or directly to the user. They also help with other tasks such as selecting a cutoff at which to drop results. 
We plan to release some work we've done studying calibration strategies and how effectively they can be supplied to different retrieval and reranking models in a separate blog. Training language models has traditionally required learn rate scheduling to achieve the best possible results. This is the process whereby the multiplier of the step size used for gradient descent is changed as training progresses. It presents some challenges: the total number of training steps must be known in advance; also it introduces multiple additional hyperparameters to tune. Some recent interesting work demonstrated that it is possible to drop learn rate scheduling if you adopt a new weight update scheme that includes averaging of the parameters along the optimization trajectory. We adopted this scheme, using AdamW as the base optimizer, and found it produced excellent results as well as being easy to tune. Summary In this blog we've introduced our new Elastic Rerank model. It is fine-tuned from the DeBERTa v3 base model on a carefully prepared data set using distillation from an ensemble of a bi-encoder and cross-encoder model. We showed that it provides state-of-the-art relevance reranking lexical retrieval results. Furthermore, it does so using dramatically fewer parameters than competitive models. In our next blog post we study its behavior in much more detail and revisit the other high quality models with which we've compared it here. As a result, we're going to provide some additional insights into model and reranking depth selection. Report an issue Related content ML Research Search Relevance October 29, 2024 What is semantic reranking and how to use it? Introducing the concept of semantic reranking. Learn about the trade-offs using semantic reranking in search and RAG pipelines. TV QH TP By: Thomas Veasey , Quentin Herreros and Thanos Papaoikonomou Vector Database Search Relevance +1 May 28, 2024 Semantic reranking in Elasticsearch with retrievers Explore strategies for using semantic reranking to boost the relevance of top search results, including semantic reranking with retrievers. AD NC By: Adam Demjen and Nick Chow Vector Database How To May 14, 2024 Search relevance tuning: Balancing keyword and semantic search This blog offers practical strategies for tuning search relevance that can be complementary to semantic search. KD By: Kathleen DeRusso Elastic Cloud Serverless May 15, 2024 Building Elastic Cloud Serverless Explore the architecture of Elastic Cloud Serverless and key design and scalability decisions we made along the way of building it. JT By: Jason Tedor ES|QL Python August 20, 2024 An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. BA By: Baha Azarmi Jump to Introduction How does it compare? Architecture Data sets and training Summary Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing Elastic Rerank: Elastic's new semantic re-ranker model - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-2","meta_description":"Learn about how Elastic's new re-ranker model was trained and how it performs."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Exploring depth in a 'retrieve-and-rerank' pipeline Select an optimal re-ranking depth for your model and dataset. ML Research TP TV QH By: Thanos Papaoikonomou , Thomas Veasey and Quentin Herreros On December 5, 2024 Part of Series Semantic reranking & the Elastic Rerank model Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this last blog post in our series we explore in detail the characteristics of various high quality re-rankers including our own Elastic Rerank model. In particular, we focus on qualitative and quantitative evaluation of retrieval quality as a function of reranking depth. We provide some high-level guidelines for how to select reranking depth and recommend reasonable defaults for the different models we tested. We employ a \"retrieve-and-rerank\" pipeline using BM25 as our first stage retriever. We focus on English language text search and use BEIR to benchmark our end-to-end accuracy. Summary Below we show that end-to-end relevance follows three broad patterns as a function of re-ranking depth: Fast increase followed by saturation Fast increase to a maximum then decay Steady decay with any amount of re-ranking For the re-rankers and datasets tested pattern 1 accounted for around 72.6% of all results, followed by pattern 2 (20.2%) and then pattern 3 (7.1%). Unsurprisingly the overall strongest re-rankers, such as Elastic Rerank, display the most consistent improvements with re-ranking depth. We propose a simple model which explains the curves we observe and show it provides a surprisingly good fit across all datasets and re-rankers we tested. This suggests that the probability of finding a positive document at a given depth in the retrieval results follows a Pareto distribution . Furthermore, we can think of the different patterns being driven by the fraction of relevant (or positive) documents the re-ranker can detect and the fraction of irrelevant (or negative) documents it mistakenly identifies as relevant. We also study effectiveness versus efficiency as a mechanism to choose the re-ranking depth and to perform model selection. In the case there is no hard efficiency constraint, as a rule-of-thumb we pick the depth that attains 90% of the maximum effectiveness. This yields a 3× improvement in compute cost compared to maximizing effectiveness, so we feel it represents a good efficiency tradeoff. For our benchmark, the 90% rule suggests one should re-rank around 100 pairs from BM25 on average, although stronger models give more benefit from deeper re-ranking. 
We also observe an important first stage retrieval effect. For some data sets we study the retriever recall saturates at a relatively low depth. In those scenarios we see significantly shallower maximum and 90% effectiveness depths. In the realistic scenarios there are efficiency constraints, such as the maximum permitted query latency or compute cost. We propose a scheme to simultaneously select the model and re-ranking depth subject to an efficiency constraint. We find that when efficiency is at a premium, deep re-ranking with small models tends to out perform shallow re-ranking with larger higher quality models . This pattern reverses as we relax the efficiency constraint. We also find that Elastic Rerank provides state-of-the-art effectiveness versus efficiency, being optimal for nearly all constraints we test. For our benchmark we found re-ranking around the top 30 results from BM25 represented a good choice when compute cost is important. The re-rankers For this investigation we evaluate a collection of re-rankers of different sizes and capabilities. More specifically: Elastic Rerank: This was trained from a DeBERTa V3 base model checkpoint. We discussed some aspects of its training in our last post . It has roughly 184M parameters (86M in the \"backbone\" + 98M in the input embedding layer). The large embedding parameter count is because the vocabulary is roughly 4× the size of BERT. bge-reranker-v2-gemma : This is an LLM based re-ranker trained on one of the Google Gemma series of models. It’s one of the strongest re-ranking models available and has around 2B parameters. monot5-large : This is a model trained on the MS MARCO passage dataset using T5-large as backbone. At the time of release it demonstrated state-of-the-art zero shot performance and it’s still one of the strongest baselines. It has around 770M parameters. mxbai-rerank-base-v1 : This model is provided by Mixedbread AI and according to the company’s release post it was trained by a) first collecting the top-10 results from search engines for a large number of queries, b) asking a LLM to judge the results for their relevance to the query and finally c) using these examples for training. The model uses the same DeBERTa architecture as the Elastic Reranker. MiniLM-L12-v2 : This is a cross-encoder model trained on the MS MARCO passage ranking task. It follows the BERT architecture with around 33M parameters. Cohere-v3 : This is an efficient and high quality commercial re-ranker provided by Cohere. No further information is available regarding the parameter count for this model. Oracle In the graphs below we include the performance of an \"oracle\" that has access to the relevance judgments ( qrels ) per dataset and thus can sort the documents by their relevance score descending. This puts any relevant document before any irrelevant document and higher relevance documents higher than lower relevance ones. These data points represent the performance of the ideal re-ranker (assuming perfect markup) and quantify the available space of improvement for the re-ranking models. It also captures the dependence of the end-to-end accuracy on the first stage retriever as the re-ranker only has visibility over the items that the retriever returns. Main patterns We use nDCG@10 as our evaluation metric, which is the standard in the BEIR benchmark, and we plot these scores as a function of the re-ranking depth. Re-ranking depth is the number of candidate documents retrieved by BM25 and subsequently sent to the re-ranker. 
Since we are using nDCG@10 the score is affected only when a document that is found lower in the retrieved list is placed in the top-10 list by the re-ranker. In this context, it can either increase nDCG@10 if it is relevant or it can evict a relevant document and decrease nDCG@10. In the following we describe the main patterns that we identified in these graphs across the different combinations of datasets and models we tested. We present them in decreasing order of frequency and provide some possible explanations of the observed behavior. \"Pareto\" curve This accounts for most of the cases that we see. It can be divided into three phases as follows: Phase A: A rapid increase which takes place mostly at smaller depths (< 100) Phase B: Further improvements at a smaller rate Phase C: A \"plateau\" in performance Below you can see runs from DBpedia and HotpotQA , where the black dashed horizontal line depicts the nDCG@10 score of BM25 Figure 1 : nDCG@10 as a function of reranking depth on DBpedia Figure 2 : nDCG@10 as a function of reranking depth on HotpotQA Discussion A monotonically increasing curve has a simple explanation: as we increase the re-ranking depth, the first-stage retriever provides a larger pool of candidates to the next stage so that the re-ranker models can identify additional relevant documents and place them high in the result list. Based on the shape of these curves, we hypothesize that the rate at which we discover positives as a function of depth follows a power law. In particular, if we assume that the re-ranker moves each positive into the top-10 list, nDCG@10 will be related to the count of positives the retriever returns in total for a given depth. Therefore, if our hypothesis is correct its functional form would be related to the cumulative density function (CDF) of a power law. In the following, we fit a scaled version of a generalized Pareto CDF to the nDCG@10 curves to test this hypothesis. Below you can see some examples of fitted curves applied to a selection of datasets ( FiQA , Natural Questions , DBpedia and HotpotQA ) using different re-rankers. Figure 3 : Curve fitting of the nDCG graph using a generalized Pareto CDF Visually it is clear that the generalized Pareto CDF is able to fit the observed curves well, which supports our hypothesis. Since we don’t match the performance of the oracle the overall behavior is consistent with the model having some false negative (FN) fraction, but a very low false positive (FP) fraction: adding more examples will occasionally shuffle an extra positive to the top, but won’t rank a negative above the positives found so far. \"Unimodal\" curve This family of graphs is characterized by the following phases: Phase A: Rapid increase until the peak Phase B: Performance decrease at a smaller rate Below you can see two examples of this pattern: one when the MiniLM-L12-v2 model is applied on to the TREC-COVID dataset and a second when the mxbai-rerank-base-v1 model is applied to the FEVER dataset. In both cases, the black dashed line represents the performance of the BM25 baseline Figure 4 : Two cases of the \"unimodal\" pattern Discussion This sort of curve would be explained by exactly the same Pareto rate of discovery of extra relevant documents. However, it also appears there is some small non-zero FP fraction. 
Since the rate of discovery of additional relevant documents decreases monotonically, at a certain depth the rate of discovery of relevant documents multiplied by the true positive (TP) fraction will equal the rate of discovery of irrelevant documents multiplied by the FP fraction and the nDCG@10 will have a unique maximum. Thereafter, it will decrease because in aggregate re-ranking will push relevant documents out of the top-10 set. There are some likely causes for the presence of a non-zero FP rate: Incomplete markup : In other words the model surfaces items which are actually relevant, but not marked as such which penalizes the overall performance. This is something we have investigated in a previous blog . Re-ranker training : Here, we broadly refer to issues that have to do with the training of the re-ranker. One possible explanation is provided in this paper by Gao et al. where the authors emphasize the importance of tailoring a re-ranker to the retriever because there might be cases where false positives at lower positions share confounding characteristics with the true positives which ultimately \"confuses\" the re-ranker. However, we note that this pattern is more common for overall weaker re-ranking models. As we discussed in our previous blog , a potential mitigation for training issues in a zero-shot setting is to ensure that we present sufficiently diverse negatives to cover a broad set of possible confounding features. In other words, it could be the case that models which exhibit these problems haven’t mined enough deep negatives for training and thus deeper retrieval results are effectively \"out-of-domain\". Note that there are some edge cases where it’s hard to distinguish between the \"Pareto\" and \"Unimodal\" patterns. This happens when the peak in performance is achieved earlier than the maximum depth but the performance decrease is marginal. Based on the terminology used so far this would qualify as a \"Unimodal\" case. To address this, we introduce this extra rule: we label curves as \"Pareto\" if their nDCG gain at maximum depth is ≥ 95% of the maximum nDCG gain and \"Unimodal\" otherwise. Bad fit This category comprises all cases where the application of a re-ranker does not bring a performance benefit at any depth compared to BM25. On the contrary, we observe a continuous degradation as we re-rank more documents. As an example we can take ArguAna, which is a particularly challenging task in BEIR as it involves the retrieval of the best counterargument to the input. This is not a typical IR scenario and some studies even consider reporting results without it. We experimented with different re-rankers (even with some that didn’t make it into the final list) and we observed that many of them ( Cohere-v3 , bge-reranker-v2-gemma and Elastic Rerank being the only exceptions ) exhibited the same pattern. Below we show the results for monot5-large . Figure 5 : A \"bad-fit\" example Discussion We propose two possible explanations: The re-ranker could be a bad fit for the task at hand, which is sufficiently out of the training domain that its scoring is often incorrect, The re-ranker could just be worse than BM25 for the particular task. BM25 is a strong zero-shot baseline, particularly for certain query types such as keyword searches, because it relies on lexical matching with scoring tailored to the whole corpus. 
Overview of patterns Overall, the distribution of the patterns ( P → \"Pareto\" curve, U → \"Unimodal\" curve, B → \"Bad fit\") across all scenarios is as follows: Figure 6 : Distribution of patterns across all scenarios Regarding the \"Pareto\" pattern which is by far the most common, we note some observations from relevant works. First, this paper from Naver Labs presents results which are in line with our findings. There, the authors experiment with 3 different ( SPLADE ) retrievers and two different cross-encoders and test the pipeline on TREC-DL 19, 20 and BEIR. They try three different values for the re-ranking depth (50, 100 and 200) and the results show that in the majority of the cases the performance increases at a rapid pace at smaller depths (i.e. 50) and then almost saturates. A relevant result is also presented in this blog post from Vespa where the author employs a \"retrieve-and-rerank\" pipeline using BM25 and ColBERT on the MLDR dataset and finds that the nDCG metric can be improved significantly by re-ordering just the top ten documents. Finally, in this paper from Meng et al. we observe similar results when two retrieval systems (BM25 and RepLLaMA) are followed by a RankLLaMA re-ranker. The authors perform experiments on the TREC DL19 and 20 datasets investigating 8 Ranked List Truncation (RLT) methods, one of which is \"Fixed-k\" that aligns with our setup. In none of these works do the authors identify an explicit underlying process that could explain the observed nDCG curve. Since we found the behavior was consistent with our simple model across different datasets and re-rankers this feels like it warrants further investigation. Some characteristics of the other retrieval tasks that could also explain some of these results: ArguAna and Touche-2020 , both argument retrieval datasets, present the most challenging tasks for the models we consider here. An interesting related analysis can be found in this paper by Thakur et al. where the authors discuss the reduced effectiveness of neural retrieval models in Touche-2020 especially when compared to BM25. Even though the paper considers a single retrieval step we think that some of the conclusions might also apply to the \"retrieve-and-rerank\" pipeline. More concretely, the authors reveal an inherent bias of neural models towards preferring shorter passages (< 350 words) in contrast to BM25 which retrieves longer documents (>600 words) mimicking the oracle distribution better. In their study, even after \"denoising\" the dataset by removing short docs (less than 20 words) and adding post-hoc relevance judgments to tackle the small labeling rate BM25 continues to outperform all the retrieval models they tested. Scifact and FEVER are two datasets where two of the \"smaller\" models follow \"unimodal\" patterns. Both are fact verification tasks which require knowledge about the claim and reasoning over multiple documents. On Scifact it is quite important for the retriever to be able to access scientific background knowledge and make sense of specialized statistical language in order to support or refute a claim. From that perspective smaller models with less internal \"world\" knowledge might be at disadvantage. According to our previous study TREC-COVID has a large labeling rate i.e. for >90% of the retrieved documents there is a relevance judgment (either positive or negative). So, it’s the only dataset where incomplete markup is not likely a problem. 
BM25 provides very good ranking for Quora , which is a \"duplicate questions\" identification task. In this particular dataset, queries and documents are very short - 90% of the documents (queries) are less than 19 (14) words - and the Jaccard similarity across queries and their relevant counterparts is quite high, a bit over 43%. This could explain why certain purely semantic re-rankers can fail to add value. Understanding scores as a function of depth So far we treated a re-ranker model as though it were a classifier and discussed its performance in terms of its FN and FP rates. Clearly, this is a simplification since it outputs a score which captures some estimate of the relevance of each document to the query. We return to the process of creating interpretable scores for a model, which is called calibration, in a separate upcoming blog post. However, for our purposes here we would like to understand the general trends in the score as a function of depth because it provides further insight into how the nDCG@10 evolves. In the following figures we split documents by their judgment label and plot the average positive and negative document scores as a function of depth for examples from the three different patterns we identified. We also show one standard deviation confidence intervals to give some sense of the overlap of score distributions. For the Pareto pattern we see positive and negative scores follow a very similar curve as depth increases. (The negative curve is much smoother because there are many more negatives at any given depth.) They start higher, a regime which corresponds to excellent matches and very hard negatives, then largely plateau. Throughout the score distributions remain well separated, which is consistent with a FP fraction which is essentially zero. For the unimodal pattern there is a similar decay of scores with depth, but we also see noticeably more overlap in the score distributions. This would be consistent with a small but non-zero FP fraction. Finally, for the bad fit pattern we see that scores are not separated. Also there is no significant decrease in both the positive and negative scores with depth. This is consistent with the re-ranker being a bad fit for that particular retrieval task since it appears to be unable to reliably differentiate positives and negatives sampled from any depth. Figure 7 : Positive and negative scores as a function of re-ranking depth for Elastic Rerank on HotpotQA. The bars correspond to ± 1 standard deviation intervals Figure 8 : Positive and negative scores as a function of re-ranking depth for mxbai-rerank-base-v1 on FEVER. The bars correspond to ± 1 standard deviation intervals Figure 9 : Positive and negative scores as a function of re-ranking depth for monot5-large on ArguAna. The bars correspond to ± 1 standard deviation intervals Finally, note that the score curves for the unimodal pattern hints that one may be able to find a cut off score which results in a higher FN fraction but essentially zero FP fraction. If such a threshold can be found it would allow us to avoid the relevance degrading with re-ranking depth while still being able to retain a portion of the extra relevant documents the retriever surfaces. We will return to this observation in an upcoming blog post when we explore model calibration. Efficiency vs effectiveness In this section we focus on the trade-off between efficiency and effectiveness and provide some guidance on picking optimal re-ranking depths. 
At a high level, effectiveness refers to the overall gain in relevance we attain as we retrieve and re-rank more candidates, while efficiency focuses on minimizing the associated cost. Efficiency can be expressed in terms of different dimensions with some common choices being: Latency, which is usually tied to an SLA on the query duration. In other words, we may only be allowed a fixed upper wall time for re-scoring (query, document) pairs, and Infrastructure cost, which refers to the number of CPUs/GPUs needed to keep up with the query rate or the total compute time required to run all queries in a pay-as-you-go setting. We note that efficiency is also wrapped up with other considerations such as the ability to run the model at lower precision, the ability to use more efficient kernels and so on, which we do not study further. Here, we adopt a simplified setup where we focus solely on the latency dimension. Obviously, in a real-world scenario one could easily trade cost (i.e. by increasing the number of CPUs/GPUs and parallelising inference) to achieve lower latency, but for the rest of the analysis we assume fixed infrastructure. Cohere v3 is excluded from this experimentation as it is an API-based service \"Latency-free\" analysis We start our analysis by considering each (model, dataset) pair in isolation ignoring the latency dimension. We are interested in the evolution of the nDCG gain (nDCG score at depth k minus the nDCG score of BM25) and we pick two data points for further analysis: The maximum gain depth, which is the re-ranking depth where the nDCG gain is maximized, and The 90%-depth, which corresponds to the depth where we first attain 90% of the maximum gain. This can be seen as a trade-off between efficiency and effectiveness as we get most of the latter at a smaller depth. We calculate these two quantities across a selection of datasets. Dataset DBPedia HotpotQA FiQA Quora TREC-COVID Climate-FEVER Model max 90% max 90% max 90% max 90% max 90% max 90% bge-reranker-v2-gemma 300 150 400 180 390 140 350 40 110 50 290 130 monot5-large 350 100 400 100 400 130 80 20 110 60 280 60 MiniLM-L12-v2 340 160 400 120 400 80 20 20 50 50 280 50 mxbai-rerank-base-v1 290 140 90 30 400 70 0* 0* 110 50 290 120 Elastic Rerank 350 140 400 160 400 130 180 30 220 50 400 170 Cohere v3 300 100 400 130 400 130 30 20 270 50 290 70 Table 1: Max-gain and 90%-gain depths for different models and datasets. The \"0* - 0*\" entry for `mxbai-rerank-base-v1` on `Quora` indicates that the model does not provide any gain over BM25. If we group by the re-ranker model type and average, it gives us Table 2. We have omitted the ( Quora , mxbai-rerank-base-v1 ) pair as it corresponds to a bad-fit case. Model Average of maximum gain depth Average of 90%-to-max gain depth bge-reranker-v2-gemma 306.7 115 monot5-large 270 78.3 MiniLM-L12-v2 248.3 80 mxbai-rerank-base-v1 236 82 Elastic Rerank 325 113.3 Cohere v3 281.7 83.3 Table 2: Average values for the depth of maximum gain and depth for 90% of maximum gain per model. We observe that: More effective models such as Elastic Rerank and bge-reranker-v2-gemma reach a peak performance at larger depths, taking advantage of more of the available positives, while less effective models \"saturate\" faster. Obtaining 90% of the maximum gain is feasible at a much smaller depth in all scenarios: on average we have to re-rank 3× fewer pairs. A re-ranking depth of around 100 would be a reasonable choice for all the scenarios considered. 
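For readers who want to reproduce this kind of analysis on their own results, here is a minimal sketch of how the two quantities above (maximum gain depth and 90%-depth) are extracted from an nDCG-versus-depth curve; the curve values below are synthetic and chosen purely for demonstration, not taken from any model or dataset in the tables.

```python
# Minimal sketch: extract the maximum-gain depth and the 90%-depth from an
# nDCG@10-versus-depth curve. The curve below is synthetic, for demonstration only.
import numpy as np

depths = np.arange(10, 410, 10)
bm25_ndcg = 0.55
# Hypothetical nDCG@10 of BM25 + re-ranker at each re-ranking depth.
ndcg = bm25_ndcg + 0.15 * (1.0 - np.exp(-depths / 80.0))

gain = ndcg - bm25_ndcg
max_gain_depth = int(depths[np.argmax(gain)])
# First depth at which we attain at least 90% of the maximum gain.
depth_90 = int(depths[np.argmax(gain >= 0.9 * gain.max())])

print(f'max-gain depth: {max_gain_depth}, 90%-depth: {depth_90}')  # 400 and 180 for this synthetic curve
```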
Alternatively, if we group by dataset and average we get Table 3. Model Average of maximum gain depth Average of 90%-to-max gain depth DBPedia 321.7 131.7 HotpotQA 348.3 120 FiQA 398.3 113.3 Quora 132 26 TREC-COVID 145 51.7 Climate-FEVER 305 100 Table 3: Average values per dataset. There are two main groups: One group where the maximum gain depth is on average larger than 300. In this category belong DBpedia , HotpotQA , FiQA and Climate-FEVER . Another group where the maximum gain depth is significantly smaller - between 100 and 150 - containing Quora and TREC-COVID . We suggest that this behavior can be attributed to the performance of the first stage retrieval, in this case BM25. To support this claim, we plot the nDCG graphs of the \"oracle\" below. As we know the nDCG metric is affected by a) the recall of relevant documents and b) their position in the result list. Since the \"oracle\" has perfect information regarding the relevance of the retrieved documents, its nDCG score can be viewed as a proxy for the recall of the first-stage retriever. Figure 10 : nDCG@10 curves for the \"oracle\" across different datasets In this figure we see that for Quora and TREC-COVID the nDCG score rises quite fast to the maximum (i.e. 1.0) while in the rest of the datasets the convergence is much slower. In other words when the retriever does a good job surfacing all relevant items at shallower depths then there is no benefit in using a large re-ranking depth. \"Latency-aware\" analysis In this section we show how to perform simultaneous model and depth selection under latency constraints. To collect our statistics we use a VM with 2 NVIDIA T4 GPUs. For each dataset we measure the total re-ranking time and divide it by the number of queries in order to arrive into a single quantity that represents the time it takes to re-score 10 (query, document) pairs We assume the cost is linearly proportional to depth, that is it takes s seconds to re-rank 10 documents, 2×s to re-rank 20 documents and so on. The table below shows examples from HotpotQA and Climate-FEVER with each entry the number of seconds required to re-score 10 (query, document) pairs. Model MiniLM-L12-v2 mxbai-rerank-base-v1 Elastic Rerank monot5-large bge-reranker-v2-gemma HotpotQA 0.02417 0.07949 0.0869 0.21315 0.25214 Climate-FEVER 0.06890 0.23571 0.23307 0.63652 0.42287 Table 4: Average time to re-score 10 (query, doc) pairs on HotpotQA & Climate-FEVER Some notes: mxbai-rerank-base-v1 and Elastic Rerank have very similar running times because they use the same \"backbone\" model, DeBERTa In most datasets monot5-large and bge-reranker-v2-gemma have similar run times even though monot5-large only has 1 / 3 the parameter count. There are two possible contributing factors: For bge-reranker-v2-gemma we used bfloat16 while we kept float precision for monot5-large , and The Gemma architecture is able to better utilize the GPUs. T-shirt sizing The run times for different datasets can vary a lot due to the fact that queries and documents follow different length distributions. In order to establish a common framework we use a \"t-shirt\" approach as follows: We define the \"Small\" size as the time it takes the most efficient model (here MiniLM-L-12-v2 ) to reach 90% of its maximal gain, similar to our proposal in the previous section, We set other sizes in a relative manner, e.g. \"Medium\" and \"Large\" being 3x and 6x times the \"Small\" latency, respectively. The model and depth selection procedure is best understood graphically. 
We create graphs as follows: On the X-axis we plot the latency and on the Y-axis nDCG@10 The data points correspond to increments of 10 in the re-ranking depth, so more efficient models have a higher density of points in latency The vertical lines show the latency thresholds associated with the different \"t-shirt\" sizes For each model we print the maximum \"permitted\" re-ranking depth. This is the largest depth whose latency is smaller than the threshold For each \"t-shirt size\" we simply pick the model and depth which maximizes the nDCG@10. This is the model whose graph has the highest intercept with the corresponding threshold line. The optimal depth can be determined by interpolation. Figure 11 : nDCG@10 as a function of latency for Climate-FEVER Figure 12 : nDCG@10 as a function of latency for DBPedia Figure 13 : nDCG@10 as a function of latency for FiQA Figure 14 : nDCG@10 as a function of latency for HotpotQA Some observations: There are instances where the larger models are not eligible under the \"Small\" threshold like in the case of bge-reranker-v2-gemma and monot5-large on Climate-FEVER . MiniLM-L-12-v2 provides a great example of how a smaller model can take advantage of its efficiency to \"fill the gap\" in terms of accuracy, especially for a low latency constraint. For example, on FiQA , under the \"Small\" threshold, it achieves a better score compared to bge-reranker-v2-gemma and mxbai-rerank-base-v1 even though both models are more effective eventually. This happens because MiniLM-L-12-v2 can process many more documents (80 vs 10,20 respectively) for the same cost. It’s common for less effective models to saturate faster which makes it feasible for \"stronger\" models to surpass them even when employing a small re-ranking depth. For example, on Climate-FEVER under the \"Medium\" budget the bge-reranker-v2-gemma model can reach a maximum depth of 20, which is enough for it to place second ahead of MiniLM-L-12-v2 and mxbai-rerank-base-v1 . The Elastic Rerank model provides the optimal tradeoff between efficiency and effectiveness when considering latency values larger than a minimum threshold. The table below presents a) the maximum permitted depth and b) the relative increase in the nDCG score (compared to BM25) for the three latency constraints applied to 5 datasets for the Elastic Rerank model. T-shirt size Small Medium Large Dataset Depth nDCG increase (%) Depth nDCG increase (%) Depth nDCG increase (%) DBPedia 70 37.6 210 42.43 400 45.7 Climate-FEVER 10 31.82 40 66.72 80 77.25 FiQA 20 44.42 70 73.13 140 80.49 HotpotQA 30 21.31 100 28.28 200 31.41 Natural Questions 30 70.03 80 88.25 180 95.34 Average 32 41.04 100 59.76 200 66.04 Table 5: The maximum permitted depth & associated nDCG relative increase for the Elastic Rerank model in different scenarios We can see that a tighter budget (\"Small\" size scenario) allows only for the re-ranking of a few tens of documents, but that is enough to give a significant uplift (>40%) on the nDCG score. Conclusions In this last section we summarize the main findings and provide some guidance on how to select the optimal re-ranking depth for a given retrieval task. Selecting a threshold Selecting a proper re-ranking depth can have a large effect on the performance of the end-to-end system. Here, we considered some of the key dimensions that can guide this process. We were interested in approaches where a fixed threshold is applied across all queries, i.e. 
there is no variable-length candidate generation on a per query basis as for example in this work . For the re-rankers we tested we found that the majority of the gain is obtained with shallow re-ranking. In particular, on average we could achieve 90% of maximum possible nDCG@10 gain re-ranking only 1/3 the number of results. For our benchmark this translated to an average re-ranking around the top 100 documents when using BM25 as a retriever. However, there is some nuance: the better the first stage retriever the fewer the candidates you need to re-rank, conversely better re-rankers benefit more from re-ranking deeper. There are also failure modes: we see effectiveness both increase to a maximum then decrease and also decrease with any re-ranking for certain models and retrieval tasks. In this context, we found more effective models are significantly less likely to ‘misbehave’ after a certain depth. There is other work that reports similar behavior. Computational budget and non-functional requirements We explored the impact of computational budget on re-ranking depth selection. In particular, we defined a procedure to choose the best re-ranking model and depth subject to a cost constraint. In this context, we found that the new Elastic Rerank model provided excellent effectiveness across a range of budgets for our benchmark. Furthermore, based on these experiments we’d suggest re-ranking the top 30 results from BM25 with the Elastic Rerank model when cost is at a premium. With this choice we were able to achieve around a 40% uplift in nDCG@10 on the QA portion of our benchmark. We also have some qualitative observations: Re-ranking deeper with a more efficient model is often the most cost effective strategy. We found `MiniLM-L12-v2 was consistently a strong contender on a budget, More efficient models usually saturate faster which means that more effective models can quickly \"pick-up\". For example, for DBpedia and HotpotQA Elastic Rerank at depth 50 is better or on par with the performance of MiniLM-L12-v2 at depth 400. Relevance dataset Ideally, model and depth selection is based on relevance judgements for your own corpus. The existence of an evaluation dataset allows you to plot the evolution of retrieval metrics, such as nDCG or recall, allowing you to make an informed decision regarding the optimal threshold under desired computational cost constraints. These datasets are usually constructed as manual annotations from domain experts or through proxy metrics based on past observations, such as Click-through Rate (CTR) on historical search results. In our previous blog we also showed how LLMs can be used to produce automated relevance judgments lists that are highly correlated with human annotations for natural language questions. In the absence of an evaluation dataset, whatever your budget, we’d recommend starting with smaller re- ranking depths as for all the model and task combinations we evaluated this achieved the majority of the gain and also avoided some of the pathologies where quality begins to degrade. In this case you can also use the general guidelines we derived from our benchmark since it covers a broad range of retrieval tasks. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. 
AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Summary The re-rankers Oracle Main patterns \"Pareto\" curve Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Exploring depth in a 'retrieve-and-rerank' pipeline - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-semantic-reranker-part-3","meta_description":"Learn how to select an optimal reranking depth for your model and dataset. We'll recommend reasonable defaults for the different models we tested. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blogs Developer insights and practical how-to articles from our experts to inspire and empower your search experience Articles Series Advanced RAG techniques In this series, we'll discuss and implement techniques that may increase RAG performance. Elasticsearch geospatial search This series covers how to use the new geospatial search features in ES|QL, including how to ingest geospatial data and how to use it in ES|QL queries. Elasticsearch in JavaScript the proper way Learn how to create a production-ready Elasticsearch backend in JavaScript, follow best practices, and run the Elasticsearch Node.js client in Serverless environments. Evaluating search relevance Blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. We will introduce specific tips and techniques to improve your search evaluation processes. GenAI for customer support This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! 
How to ingest data from AWS S3 into Elastic Cloud Learn about different ways you can ingest data from AWS S3 into Elastic Cloud. Improving information retrieval in the Elastic Stack This series explores steps to improve search relevance, benchmarking passage retrieval, ELSER, and hybrid retrieval. Indexing OneLake data into Elasticsearch Learn how to connect to OneLake and index documents into Elasticsearch. Then, take the configuration one step further by developing your own OneLake connector. Integration tests using Elasticsearch This series demonstrates improvements for integration tests using Elasticsearch and advanced techniques to further reduce execution time in Elasticsearch integration tests. Introducing LangChain4j: Building RAG apps in plain Java Introducing LangChain4j (LangChain for Java). Discover how to use it to build your RAG application in plain Java. Jira connector tutorials Learn how to integrate Elasticsearch with Jira using Elastic’s Jira native connector and explore optimization techniques. Semantic reranking & the Elastic Rerank model Introducing the concept of semantic reranking and Elastic Rerank, Elastic's new semantic re-ranker model. The ColPali model series Introducing the ColPali model, its implementation in Elasticsearch, and how to scale late interaction models for large-scale vector search. The Spotify Wrapped series Here's how to create your own Spotify Wrapped in Kibana and dive deep into your data. Using the Elasticsearch Go client for keyword search, vector search & hybrid search This series explains how to use the Elasticsearch Go client for traditional keyword search, vector search and hybrid search. Vector search introduction and implementation This series dives into the intricacies of vector search, how it is implemented in Elasticsearch, and how to run hybrid search queries in Elasticsearch. Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Blogs - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog?tab=series","meta_description":"Blog articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using CrewAI with Elasticsearch Learn how to create an Elasticsearch agent with CrewAI for your agent team and perform market research. Integrations How To JR By: Jeffrey Rengifo On April 8, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. CrewAI is a framework for orchestrating agents that uses role-playing for them to work together on complex tasks. 
If you want to read more about agents and how they work, I recommend you read this article. Image Source: https://github.com/crewAIInc/crewAI CrewAI claims to be faster and simpler than similar frameworks like LangGraph since it does not need as much boilerplate code or additional code to orchestrate agents, like Autogen. Additionally, Langchain tools are compatible with CrewAI, opening many possibilities. CrewAI has a variety of use cases, including research agents, stock market analysis, lead catchers, contract analysis, website generation, travel recommendations, etc. In this article, you´ll create an agent that uses Elasticsearch as a data search tool to collaborate with other agents and conduct market research on our Elasticsearch products. Based on a concept like summer clothes , an expert agent will search in Elasticsearch for the most semantically similar products, while a researcher agent will search online for websites and products. Finally, a writer agent will combine everything into a market analysis report. You can find a Notebook with the complete example here . To get the crew agent functioning, complete the following steps: Steps Install and import packages Prepare data Create Elasticsearch CrewAI tool Configure Agents Configure tasks Install and import packages We import SerperDevTool to search on the internet for websites related to our queries using the Serper API, and WebsiteSearchTool to do a RAG search within the found content. Serper provides 2,500 free queries you can claim here. Prepare data Elasticsearch client Create inference endpoint To enable semantic search capabilities, you need to create an inference endpoint using ELSER: Create mappings Now, we are going to apply the ELSER model into a single semantic_text field to enable the agent to run hybrid queries. Index data We are going to store some data about clothes so we can compare our source with the information the researcher agent can find on the internet. Create Elasticsearch CrewAI tool The CrewAI’s tool decorator simplifies turning regular Python functions into tools that agents can use. Here's how we create an Elasticsearch search tool: Import other needed tools and credentials Now, we instantiate the tools we prepared at the beginning to search on the internet and then do RAG within the found content. You also need an OpenAI API Key for the LLM communication. Configure Agents Now, you need to define the agents: Retriever : able to search in Elasticsearch using the tool created before. Researcher : uses search_tool to search on the internet. Writer : summarizes the info from the other two agents into a Markdown blog file. Configure tasks Now that you have defined the agents and tools, you need to create tasks for each agent. You will specify the different tasks to include content sources so the writer agent can quote them to make sure both the retriever and researcher agent are contributing with information. Now, you only need to instance the crew with all agents and tasks and run them: We can see the result in the new_post.md file: “**Short Report on Fashion Trends and Product Alignment** In this report, we will explore how the current fashion trends for summer 2025 align with the offerings in our store, as evidenced by product listings from [Elasticsearch]. The analysis focuses on five prominent trends and identifies specific products that reflect these aesthetics. **1. 
Romantic Florals with Kitsch Twist** The resurgence of floral patterns, particularly those that are intricate rather than large, embodies a whimsical approach to summer fashion. While our current inventory lacks offerings specifically featuring floral designs, there is an opportunity to curate products that align with this trend, potentially expanding into tops or dresses adorned with delicate floral patterns. [Source: Teen Vogue] **2. High Waisted and Baggy Silhouettes** High-waisted styles are a key trend for summer 2025, emphasizing comfort without sacrificing style. Among our offerings, the following products fit this criterion: - **Baggy Fit Cargo Shorts** ($20.99): These cargo shorts present a relaxed, generous silhouette, complementing the cultural shift towards practical fashion that allows ease of movement. - **Twill Cargo Shorts** ($20.99): These fitted options also embrace the high-waisted trend, providing versatility for various outfits. **3. Bold Colors: Turquoise and Earthy Tones** This summer promises a palette of vibrant turquoise alongside earthy tones. While our current collection does not showcase products that specifically reflect these colors, introducing pieces such as tops, dresses, or accessories in these hues could strategically cater to this emerging aesthetic. [Source: Heuritech] **4. Textured Fabrics** As textured fabrics gain popularity, we recognize an opportunity in our offerings: - **Oversized Lyocell-blend Dress** ($38.99): This dress showcases unique fabric quality with gathered seams and balloon sleeves, making it a textural delight that speaks to the trend of tactile experiences in fashion. - **Twist-Detail Crop Top** ($34.99): Featuring gathered side seams and a twist detail, it embraces the layered, visually engaging designs consumers are seeking. **5. Quiet Luxury** Quiet luxury resonates with those prioritizing quality and sustainability over fast fashion. Our offerings in this category include: - **Relaxed Fit Linen Resort Shirt** ($17.99): This piece’s breathable linen fabric and classic design underline a commitment to sustainable, timeless pieces that exemplify understated elegance. In conclusion, our current product listings from [Elasticsearch] demonstrate alignment with several key summer fashion trends for 2025. There are unique opportunities to further harness these trends by expanding our collection to include playful floral designs and vibrant colors. Additionally, leveraging the existing offerings that emphasize comfort and quality can enhance our customer appeal in the face of evolving consumer trends. We are well positioned to make strategic enhancements to our inventory, ensuring we stay ahead in the fast-evolving fashion landscape.” Conclusion CrewAI simplifies the process of instantiating an agent workflow with role-playing and supports Langchain tools, including custom tools, making their creation easier with abstractions like the tool decorator. This agent crew demonstrates the ability to execute complex tasks that combine local data sources and internet searches. If you want to continue improving this workflow, you could try creating a new agent to write the writer_agent results into Elasticsearch! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. 
EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Install and import packages Prepare data Elasticsearch client Create inference endpoint Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using CrewAI with Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/using-crewai-with-elasticsearch","meta_description":"Learn how to create an Elasticsearch agent with CrewAI for your agent team and perform market research."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Indexing OneLake data into Elasticsearch - Part II Second part of a two-part article to index and search OneLake data into Elastic using a Custom connector. Integrations Ingestion How To GL JR By: Gustavo Llermaly and Jeffrey Rengifo On January 24, 2025 Part of Series Indexing OneLake data into Elasticsearch Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we'll put to use what we learned in part 1 to create a OneLake custom Elasticsearch connector. We have already uploaded some OneLake documents and indexed them into Elasticsearch for search. However, this only works with a one time upload. If we want to have synchronized data, then we need to develop a more complex system. 
Luckily, Elastic has a connectors framework available to develop custom connectors to fit our needs: We'll make now make a OneLake connector based on this article: How to create custom connectors for Elasticsearch Steps Connector bootstrapping Implementing BaseDataSource class Authentication Running the connector Configuring schedule Connector bootstrapping For context, there are two types of Elastic connectors: Elastic managed connector : Fully managed and run in Elastic Cloud Self managed connector : Self-managed. Must be hosted in your infrastructure Custom connectors fall into the “Connector Client” category, so we need to download and deploy the connectors framework. Let's begin by cloning the connectors repository: Now add the dependencies you will use at the end of the requirements/framework.txt file. In this case: With this, the repository is done and we can begin to code. Implementing BaseDataSource class You can find the full working code in this repository. We will go through the core pieces in the onelake.py file. After the imports and class declaration, we must define our __init__ method which will capture the configuration parameters. Then, you can configure the form the UI will show to fill those parameters using the get_default_configuration method which returns a configuration dictionary. Then we configure the methods to download, and extract the content from the OneLake documents. To make our connector visible to the framework, we need to declare it in the connectors/config.py file. For this, we add the following code to sources: Authentication Before testing the connector, we need to get the client_id , tenant_id , and client_secret that we'll use to access the Workspace from the connector. We will use service principals as authentication method. An Azure service principal is an identity created for use with applications, hosted services, and automated tools to access Azure resources. The steps are: Creating an application, and gathering client_id , tenant_id , and client_secret Enabling service principal in your workspace Adding the service principal to your workspace You can follow this tutorial step by step. Ready? Now it's time to test the connector! Running the connector With the connector ready, we can now connect to our Elasticsearch instance. Go to : Search > Content > Connectors > New connector and choose Customized Connector Choose a name to create, and then select Create and attach an index to create a new index with the same name as the connector. You can now run it using Docker or run it from source. In this example, we'll use \"Run from source\". Click on Generate Configuration and paste the content from the box on the file config.yml file at the project's root. On the field service_type you must match the connector's name in connectors/config.py . In this case, replace changeme with onelake . Now you can run the connector with these commands: If the connector was correctly initialized, you should see a message like this in the console: Note: If you get a compatibility error,check your connectors/VERSION file and compare with your Elasticsearch cluster version: Version compatibility with Elasticsearch We recommend keeping the connector version and Elasticsearch version in sync. For this article we are using Elasticsearch and connector version 8.15. If everything went fine, our local connector will communicate with our Elasticsearch cluster and we'll be able to configure it using our OneLake credentials: We'll now index the documents from OneLake. 
To do this, run a Full Content Sync by clicking on Sync > Full Content : Once the sync is over, you should see this in the console: At the Enterprise Search UI, you can click Documents to see the indexed documents: Configure schedule You can schedule recurring content syncs using the UI based on your needs to keep your index updated and in sync with OneLake. To configure scheduled syncs go to Search > Content > Connectors and select your connector. Then click on scheduling : As an alternative, you can use the Update connector scheduling API which allows CRON expressions. Conclusion In this second part, we took our configuration one step further by using the Elastic connectors framework and developing our own to easily communicate with our Elastic Cloud instance. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Connector bootstrapping Implementing BaseDataSource class Authentication Running the connector Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Indexing OneLake data into Elasticsearch - Part II - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ingesting-data-with-onelake-part-ii","meta_description":"Learn to index & search OneLake data into Elastic with a custom connector. 
We’ll show you how to create a OneLake Elasticsearch connector to sync data."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch new semantic_text mapping: Simplifying semantic search Learn how to use the new semantic_text field type and semantic query for simplifying semantic search in Elasticsearch. Vector Database CD MP By: Carlos Delgado and Mike Pellegrini On June 24, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. semantic_text - You know, for semantic search! Do you want to start using semantic search for your data, but focus on your model and results instead of on the technical details? We’ve introduced the semantic_text field type that will take care of the details and infrastructure that you need. Semantic search is a sophisticated technique designed to enhance the relevance of search results by utilizing machine learning models . Unlike traditional keyword-based search, semantic search focuses on understanding the meaning of words and the context in which they are used. This is achieved through the application of machine learning models that provide a deeper semantic understanding of the text. These models generate vector embeddings , which are numeric representations capturing the text meaning. These embeddings are stored alongside your document data, enabling vector search techniques that take into account the word meaning and context instead of pure lexical matches. How to perform semantic search To perform semantic search, you need to go through the following steps: Choose an inference mode l to create embeddings, both for indexing documents and performing queries. Create your index mapping to store the inference results, so they can be efficiently searched afterwards. Setting up indexing so inference results are calculated for new documents added to your index. Automatically handle long text documents , so search can be accurate and cover the entire document. Querying your data to retrieve results. Configuring semantic search from the ground up can be complex. It requires setting up mappings, ingestion pipelines, and queries tailored to your chosen inference model. Each step offers opportunities for fine-tuning and optimization, but also demands careful configuration to ensure all components work together seamlessly. While this offers a great degree of control, it makes using semantic search a detailed and deliberate process, requiring you to configure separate pieces that are all related to each other and to the inference model. semantic_text simplifies this process by focusing on what matters: the inference model. Once you have selected the inference model, semantic_text will make it easy to start using semantic search by providing sensible defaults, so you can focus on your search and not on how to index, generate, or query your embeddings. Let's take a look at each of these steps, and how semantic_text simplifies this setup. Choosing an inference model The inference model will generate embeddings for your documents and queries. Different models have different tradeoffs in terms of: Accuracy and relevance of the results Scalability and performance Language and multilingual support Cost Elasticsearch supports both internal and external inference services: Internal services are deployed in the Elasticsearch cluster. 
You can use already included models like ELSER and E5 , or import external models into the cluster using eland . External services are deployed by model providers. Elasticsearch supports the following: Cohere Hugging Face Mistral OpenAI Azure AI Studio Azure OpenAI Google AI Studio Once you have chosen the inference mode, create an inference endpoint for it. The inference endpoint identifier will be the only configuration detail that you will need to set up semantic_text . Creating your index mapping Elasticsearch will need to index the embeddings generated by the model so they can be efficiently queried later. Before semantic_text, you needed to understand about the two main field types used for storing embeddings information: sparse_vector : It indexes sparse vector embeddings, like the ones generated by ELSER. Each embedding consists of pairs of tokens and weights. There is a small number of tokens generated per embedding. dense_vector : It indexes vectors of numbers, which contains the embedding information. A model produces vectors of a fixed size, called the vector dimension. The field type to use is conditioned by the model you have chosen. If using dense vectors, you will need to configure the field to include the dimension count, the similarity function used to calculate vectors proximity, and storage customizations like quantization or the specific data type used for each element. Now, if you're using semantic_text, you define a semantic_text field mapping by just specifying the inference endpoint identifier for your model: That's it. No need for you to define other mapping options, or to understand which field type you need to use. Setting up indexing Once your index is ready to store the embeddings, it's time to generate them. Before semantic_text , to generate embeddings automatically on document ingestion you needed to set up an ingestion pipeline . Ingestion pipelines are used to automatically enrich or transform documents when ingested into an index, or when explicitly specified as part of the ingestion process. You need to use the inference processor to generate embeddings for your fields. The processor needs to be configured using: The text fields from which to generate the embeddings The output fields where the generated embeddings will be added Specific inference configuration for text embeddings or sparse embeddings, depending on the model type With semantic_text , you simply add documents to your index. semantic_text fields will automatically calculate the embeddings using the specified inference endpoint. This means there's no need to create an inference pipeline to generate the embeddings. Using bulk, index, or update APIs will do that for you automatically: Inference requests in semantic_text fields are also batched. If you have 10 documents in a bulk API request, and each document contains 2 semantic_text fields, then that request will perform a single inference request with 20 texts to your inference service in one go, instead of making 10 separate inference requests of 2 texts each. Automatically handling long text passages Part of the challenge of selecting a model is the number of tokens that the model can generate embeddings for. Models have a limited number of tokens they can process. This is referred to as the model’s context window. If the text you need to work with is longer than the model’s context window, you may truncate the text and use just part of it to generate embeddings. 
This is not ideal as you'll lose information; the resulting embeddings will not capture the full context of the input text. Even if you have a long context window, having a long text means a lot of content will be reduced to a single embedding, making it an inaccurate representation. Also, returning a long text will be difficult for the users to understand, as they will have to scan the text to check it's what they are looking for. Using smaller snippets would be preferable instead. Another option is to use chunking to divide long texts into smaller fragments. These smaller chunks are added to each document to provide a better representation of the complete text. You can then use a nested query to search over all the individual fragments and retrieve the documents that contain the best-scoring chunks. Before semantic_text , chunking was not done out of the box - the inference processor did not support chunking. If you needed to use chunking, you needed to do it before ingesting your documents or use the script processor to perform the chunking in Elasticsearch. Using semantic_text means that chunking will be done on your behalf when indexing. Long documents will be split into 250-word sections with a 100-word overlap so that each section shares 100 words with the previous section. This overlap ensures continuity and prevents vital contextual information in the input text from being lost by a hard break. If the model and inference service support batching the chunked inputs are automatically batched together into as few requests as possible, each optimally sized for the Inference Service. The resulting chunks will be stored in a nested object structure so you can check the text contained in each chunk. Querying your data Now that the documents and their embeddings are indexed in Elasticsearch, it's time to do some queries! Before semantic_text , you needed to use a different query depending on the type of embeddings the model generates (dense or sparse). A sparse vector query is needed to query sparse_vector field types, and either a knn search or a knn query can be used to search dense_vector field types. The query process can be further customized for performance and relevance. For example, sparse vector queries can define token pruning to avoid considering irrelevant tokens. Knn queries can specify the number of candidates to consider and the top k results to be returned from each shard. You don't need to deal with those details when using semantic_text . You use a single query type to search your documents: Just include the field and the query text. There’s no need to decide between sparse vector and knn queries, semantic text does this for you. Compare this with using a specific knn search with all its configuration parameters: Under the hood: How semantic_text works To understand how semantic_text works, you can create a semantic_text index and check what happens when you ingest a document. When the first document is ingested, the inference endpoint calculates the embeddings. When indexed, you will notice changes in the index mapping: Now there is additional information about the model settings. Text embedding models will also include information like the number of dimensions or the similarity function for the model. You can check the document already includes the embedding results: The field does not just contain the input text, but also a structure storing the original text, the model settings, and information for each chunk the input text has been divided into. 
This structure consists of an object with two elements: text : Contains the original input text inference : Inference information added by the inference endpoint, that consists of: inference_id of the inference endpoint model_settings that contain model properties chunks : Nested object that contains an element for each chunk that has been created from the input text. Each chunk contains: The text for the chunk The calculated embeddings for the chunk text Customizing semantic_text semantic_text simplifies semantic search by making default decisions about indexing and querying your data: uses sparse_vector or dense_vector field types depending on the inference model type Automatically defines the number of dimensions and similarity according to the inference results Uses int8_hnsw index type for dense vector field types to leverage scalar quantization . Uses query defaults. No token pruning is applied for sparse_vector queries, nor custom k and num_candidates are set for knn queries. Those are sensible defaults and allow you to quickly and easily start working with semantic search. Over time, you may want to customize your queries and data types to optimize search relevance, index and query performance, and index storage. Query customization There are no customization options - yet - for semantic queries. If you want to customize queries against semantic_text fields, you can perform advanced semantic_text search using explicit knn and sparse vector queries. We're planning to add retrievers support for semantic_text , and adding configuration options to the semantic_text field so they won't be needed at query time. Stay tuned! Data type customization If you need deeper customization for the data indexing, you can use the sparse_vector or dense_vector field types. These field types give you full control over how embeddings are generated, indexed, and queried. You need to create an ingest pipeline with an inference processor to generate the embeddings. This tutorial walks you through the process. What's next with semantic_text ? We're just getting started with semantic_text ! There are quite a few enhancements that we will keep working on, including: Better inference error handling Customize the chunking strategy Hiding embeddings in _source by default, to avoid cluttering the search responses Inner hits support, to retrieve the relevant chunks of information for a query Filtering and retrievers support Kibana support Try it out! semantic_text is available on Elasticsearch Serverless now! It will be available soon on Elasticsearch 8.15 version for Elastic Cloud and on Elasticsearch downloads . If you already have an Elasticsearch serverless cluster, you can see a complete example for testing semantic search using semantic_text in this tutorial , or try it with this notebook . We'd love to hear about your experience with semantic_text ! Let us know what you think in the forums , or open an issue in the GitHub repository . Let's make semantic search easier together! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. 
SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to semantic_text - You know, for semantic search! How to perform semantic search Choosing an inference model Creating your index mapping Setting up indexing Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch new semantic_text mapping: Simplifying semantic search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/semantic-search-simplified-semantic-text","meta_description":"Learn how to use the new semantic_text field type and semantic query for simplifying semantic search in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. Integrations Ingestion How To JR By: Jeffrey Rengifo On March 7, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. BigQuery is a Google platform that allows you to centralize data from their different sources and services into one repository. It also enables you to do data analysis and use GenAI and ML tools. Below are the ways to bring data into BigQuery: Indexing data from all of these sources into Elasticsearch allows you to centralize your data sources for a better observability experience. In this article, you'll learn how to index data from BigQuery into Elasticsearch using Python, enabling you to unify data from different systems for search and analysis. You can use the example from this article in this Google Colab notebook . Steps Prepare BigQuery Configure the BigQuery Python client Index data to Elasticsearch Search data Prepare BigQuery To use BigQuery, you need to access Google Cloud Console and create a project . 
Once done, you'll be redirected to this view: BigQuery allows you to transfer data from Google Drive and Google Cloud Storage, and to upload local files. To upload data to BigQuery you must first create a dataset . Create one and name it \"server-logs\" so we can upload some files. For this article, we'll upload a local dataset that includes different types of articles. Check BigQuery’s official documentation to learn how to upload local files . Dataset The file we will upload to BigQuery has data from a server log with HTTP responses and their descriptions in a ndjson format. The ndjson file includes these fields: ip_address , _timestamp , http_method , endpoint , status_code , response_time and status_code_description . BigQuery will extract data from this file. Then, we'll consolidate it with Python and index it to Elasticsearch. Create a file named logs.ndjson and populate it with the following: We upload this file to the dataset we've just created (shown as \"server_logs\") and use \"logs\" as table name (shown as \"table id\"). Once you're done, your files should look like this: Configure the BigQuery Python client Below, we'll learn how to use the BigQuery Python client and Google Colab to build an app. 1. Dependencies First, we must install the following dependencies: The google-cloud-bigquery dependency has the necessary tools to consume the BigQuery data, elasticsearch allows it to connect to Elastic and index the data, and getpass lets us enter sensitive variables without exposing them in the code. Let's import all the necessary dependencies: We also need to declare other variables and initialize the Elasticsearch client for Python: 2. Authentication To get the necessary credentials to use BigQuery, we'll use auth. Run the command line below and choose the same account you used to create the Google Cloud project: Now, let's see the data in BigQuery: This should be the result you see: With this simple code, we've extracted the data from BigQuery. We've stored it in the logs_data variable and can now use it with Elasticsearch. Index data to Elasticsearch We'll begin by defining the data structure from the Kibana Devtools console : The match_only_text field is a variant of the text field type that saves disk space by not storing the metadata to calculate scores. We use it since logs are usually time-centric, i.e. the date is more important than the match quality in the text field. Queries that use a textfield are compatible with the ones that use a match_only_text field. We'll index the files using the Elasticsearch _bulk api : Search data We can now run queries using the data from the bigquery-logs index. For this example, we'll run a search using the error descriptions from the server in the ( status_code_description field). In addition, we'll sort them by date and get the IP addresses of the errors: This is the result: Conclusion Tools like BigQuery, which help to centralize information, are very useful for data management. In addition to search, using BigQuery with Elasticsearch allows you to leverage the power of ML and data analysis to detect or analyze issues in a simpler and faster way. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. 
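Condensing the BigQuery-to-Elasticsearch steps described in this article into one place, a rough sketch of the Python flow might look like the following. The project id, host, and API key are placeholders; the dataset (server_logs), table (logs), and index (bigquery-logs) names follow the article, and error handling is omitted.

```python
from google.cloud import bigquery
from elasticsearch import Elasticsearch, helpers

# The BigQuery client picks up the credentials from the auth step described above.
bq = bigquery.Client(project="my-gcp-project")       # placeholder project id

# Pull the rows that were uploaded to the server_logs.logs table.
rows = bq.query("SELECT * FROM `my-gcp-project.server_logs.logs`").result()

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders

# Stream the rows into the bigquery-logs index with the bulk helper.
actions = ({"_index": "bigquery-logs", "_source": dict(row)} for row in rows)
ok, errors = helpers.bulk(es, actions)
print(f"Indexed {ok} documents")
```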
EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps Prepare BigQuery Dataset Configure the BigQuery Python client 1. Dependencies Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ingesting data with BigQuery - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/big-query-data-ingestion","meta_description":"Learn how to index and search Google BigQuery data in Elasticsearch using Python."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. Integrations Ingestion How To AK By: Amit Khandelwal On February 19, 2025 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we will cover how to avoid critical performance mistakes, why the Elasticsearch default solution doesn’t cut it, and important implementation considerations. All modern-day websites have autocomplete features (search as you type) on their search bar to improve the user experience (no one wants to type entire search terms…). It’s imperative that the autocomplete be faster than the standard search, as the whole point of autocomplete is to start showing the results while the user is typing. If the latency is high, it will lead to a subpar user experience. Below is an autocomplete search example on the famous question-and-answer site, Quora. 
This is a good example of autocomplete: when searching for “elasticsearch auto”, the following posts begin to show in their search bar: Note that in the search results, there are questions relating to the auto-scaling, auto-tag and autocomplete features of Elasticsearch. Users can further type a few more characters to refine the search results. Various approaches for autocomplete in Elasticsearch / search as you type There are multiple ways to implement the autocomplete feature which broadly fall into four main categories: Search-as-you-type Query time Completion suggester Index time 1. Search as you type It is a data type intended to facilitate the autocomplete queries without prior knowledge of custom analyzer setup. Elasticsearch internally stores the various tokens (edge n-gram, shingles) of the same text, and therefore can be used for both prefix and infix completion. It can be convenient if you are not familiar with the advanced features of Elasticsearch, which the other three approaches require. Not much configuration is required in Search as you type to make it work with simple use cases and code samples.ore details are available in our documentation . 2. Query time Autocomplete can be achieved by changing match queries to prefix queries . While match queries work on token (indexed) to token (search query tokens) match, prefix queries (as their name suggests) match all the tokens starting with search tokens, hence the number of documents (results) matched is high. As explained, prefix query is not an exact token match, rather it’s based on character matches in the string which is very costly and fetches a lot of documents. Elasticsearch internally uses a B+ tree kind of data structure to store its tokens. It’s useful to understand the internals of the data structure used by inverted indices and how different types of queries impact the performance and results. Elasticsearch also introduced Match boolean prefix query in ES 7.2 version. This is a combination of Match and Prefix queries and has the best of both worlds. It’s especially useful when you have multiple search terms. For example, if you have foo bar baz , then instead of running a prefix search on all the search terms (which is costly and produces fewer results), this query would prefix search only on the last term and match previous terms in any order. Doing this improves the speed and relevance of the search results. 3. Completion suggester This is useful if you are providing suggestions for search terms like on e-commerce and hotel search websites. The search bar offers query suggestions, as opposed to the suggestions appearing in the actual search results, and after selecting one of the suggestions provided by the completion suggester, it provides the search results. For the completion suggester to work, suggestions must be indexed as any other field. You can also optionally add a weight field to rank the suggestions. This approach is ideal if you have an external source of autocomplete suggestions, like search analytics. Code samples Index definition Response Indexing suggestions Response Searching Response 4. Index time Sometimes the requirements are just prefix completion or infix completion in autocomplete. It’s not uncommon to see autocomplete implementation using custom-analyzers , which involves indexing the tokens in such a way that it matches the user’s search term. 
If we continue with our example, we are looking at documents that consist of “elasticsearch autocomplete”, “elasticsearch auto-tag”, “elasticsearch auto scaling” and “elasticsearch automatically”. The default analyzer won’t generate any partial tokens for “autocomplete”, “autoscaling” and “automatically”, and searching “auto” wouldn’t yield any results. To overcome the above issue, edge ngram or n-gram tokenizer are used to index tokens in Elasticsearch, as explained in the documentation , and search time analyzer to get the autocomplete results. The above approach uses Match queries, which are fast as they use a string comparison (which uses hashcode), and there are comparatively less exact tokens in the index. Performance consideration Almost all the above approaches work fine on smaller data sets with lighter search loads, but when you have a massive index getting a high number of auto suggest queries, then the SLA and performance of the above queries is essential. The following bullet points should assist you in choosing the approach best suited for your needs: Ngram or edge Ngram tokens increase index size significantly, providing the limits of min and max gram according to application and capacity. Planning would save significant trouble in production. Allowing empty or few character prefix queries can bring up all the documents in an index and has the potential to bring down an entire cluster . It’s always a better idea to do a prefix query only on the term (on a few fields) and limit the minimum characters in prefix queries. This can be now solved by using the boolean Match prefix query as explained above. ES provided “search as you type” data type tokenizes the input text in various formats. As it is an ES-provided solution which can’t address all use-cases, it’s always a better idea to check all the corner cases required for your business use-case. In addition, as mentioned it tokenizes fields in multiple formats which can increase the Elasticsearch index store size. Completion suggests separately indexing the suggestions and doesn’t address the use-case of fetching the search results. Index time approaches are fast as there is less overhead during query time, but they involve more grunt work, like re-indexing, capacity planning and increased disk cost. Query time is easy to implement, but search queries are costly. This is very important to understand as most of the time users need to choose one of them and to understand this trade-off can help with many troubleshooting performance issues. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
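Returning to the query-time options discussed above, here is a minimal sketch of the match_bool_prefix query and of its multi_match equivalent over a search_as_you_type field. The index and field names are illustrative, not taken from a real dataset.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders

# match_bool_prefix: earlier terms must match normally, only the last term is prefix-matched.
resp = es.search(
    index="questions",                       # illustrative index
    query={"match_bool_prefix": {"title": "elasticsearch auto"}},
)

# With a search_as_you_type field, the same behavior targets its generated shingle subfields.
resp = es.search(
    index="questions",
    query={
        "multi_match": {
            "query": "elasticsearch auto",
            "type": "bool_prefix",
            "fields": ["title", "title._2gram", "title._3gram"],
        }
    },
)
print(resp["hits"]["total"])
```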
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Various approaches for autocomplete in Elasticsearch / search as you type 1. Search as you type 2. Query time 3. Completion suggester Code samples Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch autocomplete search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-autocomplete-search","meta_description":"Learn about Elasticsearch autocomplete search and how to handle it with search as you type, query time, completion suggester and index time."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. ML Research Python GC KS By: Gus Carlock and Kirti Sodhi On February 5, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this article, we’ll demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging a sample text dataset, streamlining the workflow within Elastic’s ecosystem. You can follow along to create a simple clustering pipeline with this Jupyter notebook . Clustering prologue The Machine Learning App in Kibana provides a comprehensive suite of advanced capabilities, including anomaly and outlier detection, as well as classification and regression models. It supports the integration of custom models from the scikit-learn library via the eland Python client. While Kibana offers robust machine learning capabilities, it currently does not support clustering analysis in both prebuilt and custom models. Clustering algorithms are crucial for enhancing search relevance by grouping similar queries and for security, where they help identify patterns in data to detect potential threats and anomalies. Elastic provides the flexibility to leverage custom scikit-learn models, such as k-means, for tasks like clustering—for example, grouping news articles by similarity. 
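As a concrete illustration of the kind of model this workflow relies on (the article, as described below, trains k-means with k=5 on L2-normalized OpenAI text-embedding-ada-002 vectors), here is a minimal scikit-learn sketch. The embeddings file name is a placeholder, not something provided by the article.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

# One 1536-dimensional text-embedding-ada-002 vector per newsgroup post (placeholder file).
embeddings = np.load("newsgroup_embeddings.npy")

# L2-normalize so that k-means distances reflect semantic proximity rather than magnitude.
X = normalize(embeddings)

kmeans = KMeans(n_clusters=5, n_init=10, random_state=42).fit(X)

# The cluster centers become the `clusterCenters` parameter of the ingest pipeline.
cluster_centers = kmeans.cluster_centers_.tolist()
print(len(cluster_centers), len(cluster_centers[0]))
```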
While these algorithms aren’t officially supported, you can use the model’s cluster centers as input for the ingest pipeline to integrate these capabilities seamlessly into your Elastic workflow. In the following sections, we’ll guide you through implementing this approach. Dataset overview for clustering workflow For this proof of concept, we utilized the 20 Newsgroups dataset , a popular benchmark for text classification and clustering tasks. This dataset consists of newsgroup posts organized into 20 distinct categories, covering topics such as sports, technology, religion, and science. It is widely available through the scikit-learn library. In our experiments, we focused on a subset of 5 categories: rec.sport.baseball rec.sport.hockey comp.sys.ibm.pc.hardware talk.religion.misc sci.med These categories were chosen to ensure a mix of technical, casual, and diverse topics for effective clustering analysis. Feature extraction and generating text embeddings The text documents were cleaned by removing stop words, punctuation, and irrelevant tokens using scikit-learn’s feature_extraction utility, ensuring that the text vectors captured meaningful patterns. These features were then used to generate text embeddings using OpenAI’s language model “text-embedding-ada-002”. The model, text-embedding-ada-002 stands among the most advanced models for generating dense vector representations of text, capturing the nuanced semantic meaning inherent in textual data. We utilized the Azure OpenAI endpoint to generate the embeddings for our analysis. Instructions to use this endpoint with Elasticsearch can be found at Elasticsearch open inference API adds Azure AI Studio support . The embeddings were normalized before training the k-means clustering model to standardize vector magnitudes. Normalization is a critical preprocessing step for k-means since it calculates clusters based on Euclidean distances. Standardized embeddings eliminate magnitude discrepancies, ensuring that clustering decisions rely purely on semantic proximity, thereby enhancing the accuracy of the clustering results. We trained the k-means model using k=5 to match the dataset’s categories and extracted the cluster centers. These centers served as inputs for Kibana’s ingest pipeline, facilitating real-time clustering of incoming documents. We’ll discuss this further in the next section. Dynamic clustering with ingest pipeline’s script processor After the model is trained in Scikit-learn, an ingest pipeline is used to assign cluster numbers to each record. This ingest pipeline takes three configurable parameters: clusterCenters – a nested list with one list for each cluster center vector. For this blog, they were generated with Scikit-learn. analysisField – the field which contains dense vectorized data. normalize – normalizes the analysisField vectors. Once the ingest pipeline is added to an index or datastream, all new ingested data will be assigned a closest cluster number. The image below illustrates the end-to-end workflow of importing clustering in Kibana. The full ingest pipeline script can be generated using Python, an example is in the “Add clustering ingest pipeline” section of the notebook . We’ll dive into the specifics of the ingest pipeline below. The cluster_centers are then loaded as a nested list of floats, with one list for each cluster center. In the first part of the Painless script, two functions are defined. The first is euclideanDistance , which returns a distance between two arrayLists as a float. 
The second, l2NormalizeArray , scales an arrayLists so that the sum of its squared elements is equal to one. Then the inference step of k-Means is performed. For every cluster center, the distance is taken between a new incoming document vector using the ingest pipeline context (ctx) and the analysisField parameter, which selects the field containing the OpenAI text-ada-002 vector. The closestCluster number is then assigned to the document based on the closest cluster center, that is, the document which has the shortest distance. Additionally, if the normalize parameter is set to true, the L2 norm of the incoming document vector is taken before doing the distance calculation. Then the closestCluster and minDistance value to that cluster are passed back to the document through the ingest pipeline context. There are a few configurable parameters, which are described above but included here for reference. The first is the clusterCenters , a nested array of floats, with one array for each cluster center. The second is the analysisField , the field which contains the text-ada-002 vectors. Lastly, normalize which will L2 normalize the document vector. Note that the normalize parameter should only be set to True if the vectors are also normalized before training the k-Means model. Finally, once the pipeline is configured, assign an ID and put it on the cluster. Clustering results We expect the clustering results to show each category forming a distinct cluster. While baseball and hockey might overlap due to their shared sports context; technical, religious, and medical categories should form separate and clearly defined clusters. When OpenAI text-ada-002 vectors are viewed with the t-SNE dimensionality reduction algorithm, they show that there is clear separation between these clusters, and that the sports topics are close together: Actual newsgroup labels; 2D t-SNE trained on OpenAI text-ada-002 vectors The location of the points indicates clear separation between the groupings, which indicates that the vectorization is capturing the semantic meaning of each article. As a result, the zero-shot classification results are excellent. Even though no labels were provided in training data to the model, with only the number of clusters provided, on in-sample data a k-means model provides greater than 94% accuracy when assigning cluster numbers: Predicted cluster labels; 2D t-SNE trained on OpenAI text-ada-002 vectors Comparing the actual newsgroup labels to the in-sample predicted labels, there is very little difference between the actual newsgroup labels and those predicted by the clustering model. This is represented by the confusion matrix: Zero-shot Classification Confusion Matrix on OpenAI text-ada-002 vectors The diagonal on the confusion matrix represents the in-sample accuracy of each category, the model is predicting the correct label more than 94% of the time for each category. Detecting outliers in the clusters The k-means model can be viewed as an approximation of a Gaussian Mixture Model (GMM) without capturing covariance, where the quantiles of distances from the nearest cluster being an approximation of the distribution quantile. This means that a k-mean model can capture an approximation of the data distribution. With this approach, a large number of clusters can be chosen, in this case 100, and a new model trained. The higher number of clusters, the more flexible the fit of the distribution. 
So in this case, the goal is not to learn the internal groupings of the data, but rather capture the distribution of the data overall. The distance quantiles can be computed with a query. In this case, a model was trained with 100 clusters and the 75th percentile distances were chosen as the cutoff for outliers. Starting with the same graph above showing the t-SNE representation of the actual newsgroup labels: Actual newsgroup labels; 2D t-SNE trained on OpenAI text-ada-002 vectors When adding data in from newsgroups which were not in the training set, the 2D t-SNE representation shows a good fit for the data. Here, orange datapoints are not considered outliers, while those which are dark grey are labelled as outliers: Outlier results for k=100; 2D t-SNE trained on OpenAI text-ada-002 vectors Bringing it all together In this blog, we demonstrated how to integrate custom clustering models into the Elastic Stack. We developed a workflow that imports scikit-learn clustering models, such as k-means, into the Elastic Stack, enabling clustering analysis directly within Kibana. By using the 20 Newsgroups dataset, we demonstrated how to apply this workflow to group similar documents, while also discussing the use of advanced text embedding models such as OpenAI's “text-embedding-ada-002” to create semantic representations essential for efficient clustering. The results section showcased clear cluster separation, indicating that the “text-embedding-ada-002” model captures semantic meaning effectively. The k-means model achieved over 94% accuracy in zero-shot classification, with the confusion matrix showing minimal discrepancies between predicted and actual labels, confirming its strong performance. With this workflow, Elastic users can apply clustering techniques to their own datasets, whether for grouping similar queries in search or detecting unusual patterns for security applications. The solution presented here provides an easy way to integrate advanced clustering functionality into Elastic. We hope this inspires you to explore these capabilities and apply them to your own use cases. What’s next? The clustering results above show that the Painless implementation accurately clusters similar topics, achieving 94% accuracy in performance. Moving forward, our goal is to test the pipeline on a less structured dataset with significantly more noise and a larger number of clusters. This will help evaluate its performance in more challenging scenarios. While k-means has shown decent clustering results, exploring alternatives like Gaussian Mixture Models or Mean Shift for outlier detection might yield better outcomes. These methods could also be implemented using a Painless script or an ingest pipeline. In the future, we think this workflow can be enhanced with ELSER , as we could use ELSER to first retrieve relevant features from the dataset, which would then be used for clustering, further improving the model’s performance and relevance in the analysis. Additionally, we would like to address how to properly set the correct number of clusters, and how to effectively deal with model drift. In the meantime, if you have similar experiments or use cases to share, we’d love to hear about them! Feel free to provide feedback or connect with us through our community Slack channel or discussion forums . Report an issue Related content Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. 
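Returning to the script-processor ingest pipeline described earlier in this article: the full Painless script is not reproduced here, but a much-simplified sketch of the idea (nearest-center assignment by Euclidean distance, without the normalization option) might look like the following, created with the Python client. The pipeline name and field names are illustrative, and the placeholder cluster centers stand in for those produced by the scikit-learn step.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders

# Placeholder centers; in practice use kmeans.cluster_centers_.tolist() from the training step.
cluster_centers = [[0.0] * 1536 for _ in range(5)]

# Simplified Painless: assign each incoming document to its nearest cluster center.
painless = """
def vec = ctx[params.analysisField];
double best = Double.MAX_VALUE;
int bestCluster = -1;
for (int c = 0; c < params.clusterCenters.size(); c++) {
  def center = params.clusterCenters[c];
  double sum = 0.0;
  for (int i = 0; i < center.size(); i++) {
    double d = vec[i] - center[i];
    sum += d * d;
  }
  if (sum < best) { best = sum; bestCluster = c; }
}
ctx['closest_cluster'] = bestCluster;
ctx['min_distance'] = Math.sqrt(best);
"""

es.ingest.put_pipeline(
    id="assign-closest-cluster",
    description="Assign each document to its nearest k-means cluster center (simplified sketch)",
    processors=[{
        "script": {
            "source": painless,
            "params": {
                "clusterCenters": cluster_centers,
                "analysisField": "text_embedding",   # illustrative embedding field name
            },
        }
    }],
)
```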
JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey Jump to Clustering prologue Dataset overview for clustering workflow Feature extraction and generating text embeddings Dynamic clustering with ingest pipeline’s script processor Clustering results Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Implementing clustering workflows in Elastic to enhance search relevance - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-clustering-workflows","meta_description":"Explore clustering workflows and learn how to integrate custom clustering models into the Elastic Stack through an example."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. Vector Database Lucene TV MS By: Thomas Veasey and Mayya Sharipova On April 7, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the past, we discussed some of the challenges of having to search multiple HNSW graphs and how we were able to mitigate them. At that time we alluded to some further improvements we had planned. This post is the culmination of that work. You might ask, why use multiple graphs at all? This is a side effect of an architectural choice in Lucene: immutable segments. As with most architectural choices there are pros and cons. For example, we’ve recently GA’d Serverless Elasticsearch. In this context, we’ve gained very significant benefits from immutable segments including efficient index replication and the ability to decouple index and query compute and autoscale them independently. 
For vector quantization, segment merges give us the opportunity to update parameters to adapt them to data characteristics. Along these lines, we think there are other advantages that having opportunities to measure data characteristics and revisit indexing choices affords. In this post we will discuss the work we’ve been doing to significantly reduce the overhead of building multiple HNSW graphs and in particular to reduce the cost of merging graphs. Background In order to maintain a manageable number of segments Lucene periodically checks to see if it should merge segments. This amounts to checking if the current segment count exceeds a target segment count, which is determined by the base segment size and the merge policy. If the count is exceeded, Lucene merges groups of segments while the constraint is violated. This process has been described in detail elsewhere. Lucene elects to merge similar sized segments because this achieves logarithmic growth in the write amplification. In the case of a vector index, write amplification is the number of times a vector will be inserted into a graph. Lucene will try to merge segments in groups of approximately 10. Consequently, vectors are inserted into a graph roughly $1+\frac{9}{10}\log_{10}\left(\frac{n}{n_0}\right)$ times, where $n$ is the index vector count and $n_0$ is the expected base segment vector count. Because of the logarithmic growth, write amplification is single digits even for huge indices. However, the total time spent merging graphs is linearly proportional to the write amplification. When merging HNSW graphs we already make a small optimization: retaining the graph for the largest segment and inserting vectors from the other segments into it. This is the reason for the 9/10 factor above. Below we show how we are able to do significantly better by using information from all the graphs we are merging. Graph Merging Previously we retained the largest graph and inserted vectors from the others ignoring the graphs that contain them. The key insight we make use of below is that each graph we discard contains important proximity information about the vectors it contains. We would like to use this information to accelerate inserting, at least some, of the vectors. We focus on the problem of inserting a smaller graph $G_s=(V_s,E_s)$ into a larger graph $G_l=(V_l,E_l)$, since this is an atomic operation we can use to build any merge policy. The strategy is to find a subset of vertices $J\subset V_s$ to insert into the large graph. We then use the connectivity of these vertices in the small graph to accelerate inserting the remaining vertices $V_s \setminus J$. In the following, we use $N_s(u)$ and $N_l(u)$ to denote the neighbors of a vertex $u$ in the small and large graph, respectively. Schematically the process is as follows.
MERGE-HNSW Inputs $G_s$ and $G_l$ 1 Find $J \subset V_s$ to insert into $G_l$ using COMPUTE-JOIN-SET 2 Insert each vertex $u \in J$ into $G_l$ 3 for $u \in V_s \setminus J$ do 4 $J_u \leftarrow J \cap N_s(u)$ 5 $E_u \leftarrow \cup_{v \in J_u} N_l(u)$ 6 $W \leftarrow$ FAST-SEARCH-LAYER $(J_u, E_u)$ 7 $neighbors \leftarrow$ SELECT-NEIGHBORS-HEURISTIC $(u, W)$ 8 $J \leftarrow J \cup \{u\}$ We compute the set $J$ using a procedure we discuss below (line 1). Then we insert every vertex in $J$ into the large graph using the standard HNSW insertion procedure (line 2). For each vertex we haven’t inserted we find its neighbors that we have inserted and their neighbors in the large graph (lines 4 and 5). We use a FAST-SEARCH-LAYER procedure seeded with this set (line 6) to find the candidates for the SELECT-NEIGHBORS-HEURISTIC from the HNSW paper (line 7). In effect, we’re replacing SEARCH-LAYER to find the candidate set in the INSERT method (Algorithm 1 from the paper), which is otherwise unchanged. Finally, we add the vertex we just inserted into $J$ (line 8). It's clear that for this to work every vertex in $V_s \setminus J$ must have at least one neighbor in $J$. In fact, we require that for every vertex $u \in V_s \setminus J$ that $|J \cap N_s(u)| \geq k_u$ for some $k_u < M$ […] $Gain(v^*) > 0$ then 20 $C \leftarrow C \cup \{(\text{false}, Gain(v^*), c(v^*), \text{copy rand})\}$ 21 return $J$ We first initialize the state in lines 1-5. In each iteration of the main loop we initially extract the maximum gain vertex (line 8), breaking ties at random. Before making any change, we need to check if the vertex’s gain is stale. In particular, each time we add a vertex into $J$ we affect the gain of other vertices: Since all its neighbors have an additional neighbor in $J$ their gains can change (line 14) If any of its neighbors are now fully covered all their neighbors’ gains can change (lines 14-16) We recompute gains in a lazy fashion, so we only recompute the gain of a vertex if we want to insert it into $J$ (lines 18-20). Since gains only ever decrease we can never miss a vertex we should insert. Note that we simply need to keep track of the total gain of vertices we’ve added to $J$ to determine when to exit. Furthermore, whilst $Gain_{tot} < Gain_{exit}$ […] Trained Models in Kibana. Once it is deployed, you can proceed to the next step. Indexing data You can download the dataset here and then import the data using Kibana. To do this, go to the homepage and click \"Upload data\". Then, upload the file and click Import. Finally, go to the Advanced tab and paste the following mappings: We are going to create an index capable of running semantic and full text queries. The semantic_text field type will take care of data chunking and embedding.
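The mappings pasted in this step aren't reproduced in this text, so as a rough, illustrative sketch only (not the article's exact mapping): an index along these lines would map longDescription as semantic_text and the facet fields as keyword. The inference endpoint id is an assumption, and the note that follows describes a copy_to alternative if you also want plain full-text search on the same content.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders

# Illustrative sketch of a books-blazor mapping; not the article's exact mapping.
es.indices.create(
    index="books-blazor",
    mappings={
        "properties": {
            "name": {"type": "text"},
            "author": {"type": "keyword"},
            "categories": {"type": "keyword"},
            "status": {"type": "keyword"},
            "longDescription": {
                "type": "semantic_text",
                "inference_id": "my-elser-endpoint",  # assumed pre-created ELSER endpoint
            },
        }
    },
)
```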
Note we are indexing longDescription as semantic_text , you can use copy_to if you want to index a field as both semantic_text and text` . Building the app with Blazor & Elasticsearch API key The first thing we need to do is create an API key to authenticate our requests to Elasticsearch. The API key should be read-only and allowed only to query the books-blazor index. You will see something like this: Save the value of the encoded response field as it's needed later. If you are running on Elastic Cloud , you will also need your Cloud ID. (You can find it here ). Creating the Blazor project Start by installing Blazor and creating a sample project following the official instructions . One you have the project created, the folder structure and files should look like this: The template application includes Bootstrap v5.1.0 for styling. Finish the project setup by installing Elasticsearch .NET client: Once you finish this step, your page should look like this: Folders structure Now, we are going to organize our folders as follows: Files explained: Components/Pages/Search.razor: main page containing the search bar, the results, and the filters. Components/Pages/Search.razor.css: page styles. Components/Elasticsearch/SearchBar.razor: search bar component. Components/Elasticsearch/Results.razor: results component. Components/Elasticsearch/Facet.razor: filters component. Components/Svg/GlassIcon.razor: search icon. Components/_Imports.razor: this will import all the components. Models/Book.cs: this will store the book field schema. Models/Response.cs: this will store the response schema, including the search results, facets and total hits. Services/ElasticsearchService.cs: Elasticsearch service. It will handle the connection and queries to Elasticsearch. Initial Configuration Let's start with some clean-up. Delete the files: Components/Pages/Counter.razor Components/Pages/Weather.razor Components/Pages/Home.razor Components/Layout/NavMenu.razor Components/Layout/NavMenu.razor.css Check the /Components/_Imports.razor file. You should have the following imports: Integrating Elastic into the project Now, let’s import the Elasticsearch components: We are going to remove the default sidebar to have more space for our application by removing it from the /Components/Layout/MainLayout.razor file: Now let's enter the Elasticsearch credentials for the user secrets : Using this approach, .Net 8 stores sensitive data in a separate location, outside of the project folder and makes it accessible using the IConfiguration interface. These variables will be available to any .Net project that uses the same user secrets. Then, let's modify the Program.cs file to read the secrets and mount the Elasticsearch client: First, import the necessary libraries: BlazorApp.Services: contains the Elasticsearch service. Elastic.Clients.Elasticsearch: imports the Elasticsearch client .Net 8 library. Elastic.Transport: imports the Elasticsearch transport library, which allows us to use the ApiKey class to authenticate our requests. Second, insert the following code before the var app = builder.Build() line: This code will read the Elasticsearch credentials from the user secrets and create an Elasticsearch client instance. 
After the ElasticSearch client initialization, add the following line to register the Elasticsearch service: The next step will be to build the search logic in the /Services/ElasticsearchService.cs file: First, import the necessary libraries and models: Second, add the class ElasticsearchService , constructor and variables: Configuring search Now, let's build our search logic: BuildFilters will build the filters for the search query using the selected facets by the user. BuildHybridQuery will build a hybrid search query that combines full text and semantic search. Next, add the search method: SearchBooksAsync : will perform the search using the hybrid query and return the results included aggregations for building the facets. FormatFacets : will format the aggregations response into a dictionary. ConvertFacetDictionary : will convert the facet dictionary into a more readable format. The next step is to create the models that will represent the data returned in the hits of the Elasticsearch query that will be printed as the results in our search page. We start by creating the file /Models/Book.cs and adding the following: Then, setting up the Elastic response in the /Models/Response.cs file and adding the following: Configuring a basic UI Next, add the SearchBar component. In the file /Components/Elasticsearch/SearchBar.razor and add the following: This component contains a search bar and a button to perform the search. Blazor provides great flexibility by allowing generating HTML dynamically using C# code within the same file. Afterwards, in the file /Components/Elasticsearch/Results.razor we will build the results component that will display the search results: Finally, we will need to create facets to filter the search results. Note: Facets are filters that allow users to narrow down search results based on specific attributes or categories, such as product type, price range, or brand. These filters are typically presented as clickable options, often in the form of checkboxes, to help users refine their search and find relevant results more easily. In Elasticsearch context, facets are created using aggregations . We set up facets by putting the following code In the file /Components/Elasticsearch/Facet.razor : This component reads from a terms aggregation on the author , categories , and status fields, and then produces a list of filters to send back to Elasticsearch. Now, let's put everything together. In /Components/Pages/Search.razor file: Our page is working! As you can see, the page is functional but lacks styles. Let's add some CSS to make it look more organized and responsive. Let's start replacing the layout styles. In the Components/Layout/MainLayout.razor.css file: Add the styles for the search page in the Components/Pages/Search.razor.css file: Our page starts to look better: Let's give it the final touches: Create the following files: Components/Elasticsearch/Facet.razor.css Components/Elasticsearch/Results.razor.css And add the styles for Facet.razor.css : For Results.razor.css : Final result: To run the application you can use the following command: dotnet watch You did it! Now you can search for books in your Elasticsearch index by using the search bar and filter the results by author, category, and status. Performing full text and semantic search By default our app will perform a hybrid search using both full text and semantic search . 
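As a rough illustration of what the hybrid query can look like at the Elasticsearch level (the article builds it in C# with the .NET client; this sketch uses the raw query DSL through the Python client, and the field names are assumptions based on the article), a bool query can combine a lexical clause with a semantic clause, plus terms aggregations for the facets. Dropping either clause gives the full-text-only or semantic-only variants described next.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders
user_query = "space exploration adventure"

resp = es.search(
    index="books-blazor",
    query={
        "bool": {
            "should": [
                # Lexical, full-text clause.
                {"match": {"name": {"query": user_query}}},
                # Semantic clause against the semantic_text field.
                {"semantic": {"field": "longDescription", "query": user_query}},
            ]
        }
    },
    aggs={
        "authors": {"terms": {"field": "author"}},       # assumes keyword-mapped facet fields
        "categories": {"terms": {"field": "categories"}},
        "status": {"terms": {"field": "status"}},
    },
    size=10,
)
print(resp["hits"]["total"], resp["aggregations"]["authors"]["buckets"])
```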
You can change the search logic by creating two separate methods, one for full text and another for semantic search, and then selecting one method to build the query based on the user's input. Add the following methods to the ElasticsearchService class in the /Services/ElasticsearchService.cs file: Both methods work similarly to the BuildHybridQuery method, but they only perform full text or semantic search. You can modify the SearchBooksAsync method to use the selected search method: You can find the complete application here Conclusion Blazor is an effective framework that allows you to build web applications using C#. Elasticsearch is a powerful search engine that allows you to build search applications. Combining both, you can easily build robust search applications, leveraging the power of ESRE to create a semantic search experience in a short time. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to What is Blazor? Why Blazor? What is ESRE? Configuring ELSER Indexing data Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Building a search app with Blazor and Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/search-app-with-esre-blazor","meta_description":"Learn how to build a search app using Blazor and Elasticsearch, including how to add the search bar, configure hybrid search, build search logic, and more."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Integration tests using Elasticsearch This series demonstrates improvements for integration tests using Elasticsearch and advanced techniques to further reduce execution time in Elasticsearch integration tests. Part1 Java How To October 3, 2024 Testing your Java code with mocks and real Elasticsearch Learn how to write your automated tests for Elasticsearch, using mocks and Testcontainers PP By: Piotr Przybyl Part2 How To November 13, 2024 Faster integration tests with real Elasticsearch Learn how to make your automated integration tests for Elasticsearch faster using various techniques for data initialization and performance improvement. PP By: Piotr Przybyl Part3 How To January 31, 2025 Advanced integration tests with real Elasticsearch Mastering advanced Elasticsearch integration testing: Faster, smarter, and optimized. PP By: Piotr Przybyl Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Integration tests using Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/integration-tests-using-elasticsearch","meta_description":"This series demonstrates improvements for integration tests using Elasticsearch and advanced techniques to further reduce execution time in Elasticsearch integration tests."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Live log and prosper: Elasticsearch newly specialized logsdb index mode Elasticsearch’s latest innovation in log management, logsdb, cuts the storage footprint of log data by up to 65%, enabling observability and security teams to expand visibility without exceeding their budget while keeping all data accessible and searchable. How To MS GK AS By: Mark Settle , George Kobar and Amena Siddiqi On December 12, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
Elasticsearch's new index mode, logsdb, reduces log storage needs by up to 65% Today, we announce the general availability of Elasticsearch's new index mode, logsdb, which reduces the storage footprint of log data by up to 65% compared to recent versions of Elasticsearch without logsdb. This dramatic improvement enables observability and security teams to expand visibility without exceeding their budget while keeping all data immediately accessible for analysis. Logsdb optimizes the ordering of data, eliminates duplication by reconstructing non-stored field values on the fly with synthetic _source , and improves compression with advanced algorithms and codecs, leveraging columnar storage within Elasticsearch for efficient log storage and retrieval. Enhance analytics and reduce costs by improving storage efficiency with logsdb index mode Logs provide critical signals for detecting and remediating observability and security issues — and their utility is increasing as AI advancements ease the analysis of text-based data — so efficient storage and performant access matter more than ever. Unfortunately, the growing log volume generated by infrastructure and applications is driving up costs, forcing compromises that hamper analysis: limit collection, reduce retention, or relegate fresh data to siloed archive tiers. Logsdb directly addresses these challenges. With greater storage efficiency, you can collect more data and avoid the hassle of complicated data filtering. You can retain logs longer to support threat hunting, incident response, and compliance requirements. And because all data is always searchable, you can get fast insights, no matter how large your data set grows. Technical innovation behind logsdb index mode Logsdb index mode dramatically reduces the disk footprint of log data with smart index sorting, synthetic _source, and advanced compression. Implementing it can reduce log storage needs by up to 65%, compared to recent versions of Elasticsearch without logsdb. While logsdb currently uses more CPU during indexing, its efficient storage reduces overall costs for most customers. For customers who need long-term retention, we expect total cost of ownership (TCO) reductions of up to 50%. Smart index sorting improves storage efficiency by up to 30% and reduces query latency on some logging data sets by locating similar data close together. By default, it sorts indices by host.name and @timestamp. If your data has more suitable fields, you can specify them instead. Advanced compression significantly reduces storage requirements for text-heavy data like logs through Zstandard compression (Zstd), delta encoding, run-length encoding, and other smart codecs that are automatically chosen. Doc-values, which are stored in a columnar format optimized for compression and performance, enable efficient storage and retrieval of field values for sorting, aggregations, and scripting. Synthetic _source enables organizations to trim storage needs by another 20-40% by discarding the _source field and fully or partially reconstructing it on demand. While the feature sometimes requires more compute for indexing and retrieval, testing shows that it delivers measurable net efficiency improvements. Synthetic _source is built on nearly two years of production usage with metrics, with numerous enhancements for logs, including support for nearly all field types. Resulting storage savings are propagated through the index lifecycle phases. 
A storage reduction of 65% in the hot tier will result in the same reduction in the warm, cold, and frozen tiers, as well as reduce the footprint for storing snapshots in bucket storage. No visibility compromises: Retain all logs for observability and security Logs are the foundation of visibility into infrastructure and applications, providing the simplest and most essential signal for monitoring and troubleshooting. However, costs are rising as logging volumes grow. This challenge is forcing customers to implement complex filtering and management policies, delete data prematurely, and strand relevant logs in stores that require a day or longer to rehydrate before analysis. Without a complete, easily searchable, and accessible data set, finding and resolving issues is substantially more challenging. Logsdb index mode builds on breakthrough Elasticsearch capabilities like searchable snapshots and Automatic Import to address these pain points for operations and security teams: Reduce costs: Logsdb reduces the storage footprint of logs by up to 65%, enabling organizations to reduce storage expenses while retaining more data. This translates to cost savings across all storage tiers — from hot to frozen — and higher productivity for the observability and security teams who use this data. Preserve valuable data: Logsdb keeps all your log data and improves operational efficiency without relying on extra tools or complicated filters. With features like synthetic _source, preserve the value of data without storing the entire source document. Expand visibility: Logsdb provides efficient access to all data on one platform, without separate silos for observability, security, and historical data. For site reliability engineers (SREs), it accelerates problem resolution by enabling analysis of logs alongside metrics, traces, and business data. Likewise, for security operations center (SOC) teams, it accelerates investigation and remediation by eliminating blind spots. Streamline access to data: Logsdb lets SRE teams efficiently retain actionable data for troubleshooting, trending, and analysis. Similarly, SOC teams can swiftly search all of their data for investigation and threat hunting without incurring exorbitant costs. Logsdb is ready for your environment Elasticsearch logsdb index mode is generally available for Elastic Cloud Hosted and Self-Managed customers starting in version 8.17 and is enabled by default for logs in Elastic Cloud Serverless . Basic logsdb capabilities (including smart index sorting and advanced compression) are available to organizations with Standard, Gold, and Platinum licenses. Complete logsdb capabilities that further reduce storage requirements (including synthetic _source) are available to serverless customers and organizations with an Enterprise license. Elasticsearch logsdb in action Logsdb enables you to keep all your log data and improve operational efficiency without narrowing collection or discarding or siloing data. With capabilities like smart index sorting, advanced compression, and synthetic _source, keep and analyze the data you need within a budget that works for you. Want to experience it for yourself? Try Elastic at no cost . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. 
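To make the setup concrete: outside of Serverless (where it is the default for logs), logsdb is enabled through the index.mode setting, typically in the index template backing a logs data stream. A minimal sketch with the Python client follows; the template and data stream names are illustrative, and the sort override is optional (the default sort is host.name and @timestamp).

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")  # placeholders

es.indices.put_index_template(
    name="logs-myapp-template",            # illustrative template name
    index_patterns=["logs-myapp-*"],
    data_stream={},
    priority=500,
    template={
        "settings": {
            "index.mode": "logsdb",
            # Optional: override the default (host.name, @timestamp) sort with more suitable fields.
            "index.sort.field": ["service.name", "@timestamp"],
            "index.sort.order": ["asc", "desc"],
        },
        "mappings": {
            "properties": {
                "service.name": {"type": "keyword"},   # sort fields must be mapped
                "@timestamp": {"type": "date"},
            }
        },
    },
)

# Documents written to a matching data stream are stored in logsdb mode.
es.index(index="logs-myapp-default", document={
    "@timestamp": "2025-01-01T00:00:00Z",
    "service.name": "checkout",
    "message": "order created",
})
```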
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Elasticsearch's new index mode, logsdb, reduces log storage needs by up to 65% Enhance analytics and reduce costs by improving storage efficiency with logsdb index mode Technical innovation behind logsdb index mode No visibility compromises: Retain all logs for observability and security Logsdb is ready for your environment Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Live log and prosper: Elasticsearch newly specialized logsdb index mode - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-logsdb-index-mode","meta_description":"Dive into Logsdb, Elasticsearch's new index mode. Discover Logsdb's capabilities and advantages, including how it reduces log storage needs by up to 65%."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Evaluating search relevance part 1 - The BEIR benchmark Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes. ML Research Python TP TV By: Thanos Papaoikonomou and Thomas Veasey On July 16, 2024 Part of Series Evaluating search relevance Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This is the first in a series of blog posts discussing how to think about evaluating your own search systems in the context of better understanding the BEIR benchmark. 
We will introduce specific tips and techniques to improve your search evaluation processes in the context of better understanding BEIR. We will also introduce common gotchas which make evaluation less reliable. Finally, we note that LLMs provide a powerful new tool in the search engineers' arsenal and we will show by example how one can use them to help evaluate search. Understanding the BEIR benchmark in search relevance evaluation To improve any system you need to be able to measure how well it is doing. In the context of search BEIR (or equivalently the Retrieval section of the MTEB leaderboard) is considered the “holy grail” for the information retrieval community and there is no surprise in that. It’s a very well-structured benchmark with varied datasets across different tasks. More specifically, the following areas are covered: Argument retrieval (ArguAna, Touche2020) Open-domain QA (HotpotQA, Natural Questions, FiQA) Passage retrieval (MSMARCO) Duplicate question retrieval (Quora, CQADupstack) Fact-checking (FEVER, Climate-FEVER, Scifact) Biomedical information retrieval (TREC-COVID, NFCorpus, BioASQ) Entity retrieval (DBPedia) Citation prediction (SCIDOCS) It provides a single statistic, nDCG@10, related to how well a system matches the most relevant documents for each task example in the top results it returns. For a search system that a human interacts with relevance of top results is critical. However, there are many nuances to evaluating search that a single summary statistic misses. Structure of a BEIR dataset Each benchmark has three artefacts: the corpus or documents to retrieve the queries the relevance judgements for the queries (aka qrels ). Relevance judgments are provided as a score which is zero or greater. Non-zero scores indicate that the document is somewhat related to the query. Dataset Corpus size #Queries in the test set #qrels positively labeled #qrels equal to zero #duplicates in the corpus Arguana 8,674 1,406 1,406 0 96 Climate-FEVER 5,416,593 1,535 4,681 0 0 DBPedia 4,635,922 400 15,286 28,229 0 FEVER 5,416,568 6,666 7,937 0 0 FiQA-2018 57,638 648 1,706 0 0 HotpotQA 5,233,329 7,405 14,810 0 0 Natural Questions 2,681,468 3,452 4,021 0 16,781 NFCorpus 3,633 323 12,334 0 80 Quora 522,931 10,000 15,675 0 1,092 SCIDOCS 25,657 1,000 4,928 25,000 2 Scifact 5,183 300 339 0 0 Touche2020 382,545 49 932 1,982 5,357 TREC-COVID 171,332 50 24,763 41,663 0 MSMARCO 8,841,823 6,980 7,437 0 324 CQADupstack (sum) 457,199 13,145 23,703 0 0 Table 1 : Dataset statistics. The numbers were calculated on the test portion of the datasets ( dev for MSMARCO ). Table 1 presents some statistics for the datasets that comprise the BEIR benchmark such as the number of documents in the corpus, the number of queries in the test dataset and the number of positive/negative (query, doc) pairs in the qrels file. From a quick a look in the data we can immediately infer the following: Most of the datasets do not contain any negative relationships in the qrels file, i.e. zero scores, which would explicitly denote documents as irrelevant to the given query. The average number of document relationships per query ( #qrels / #queries ) varies from 1.0 in the case of ArguAna to 493.5 ( TREC-COVID ) but with a value < 5 for the majority of the cases. Some datasets suffer from duplicate documents in the corpus which in some cases may lead to incorrect evaluation i.e. when a document is considered relevant to a query but its duplicate is not. 
For example, in ArguAna we have identified 96 cases of duplicate doc pairs with only one doc per pair being marked as relevant to a query. By “expanding” the initial qrels list to also include the duplicates we have observed a relative increase of ~1% in the nDCG@10 score on average. Example of duplicate pairs in ArguAna. In the qrels file only the first appears to be relevant (as counter-argument) to query (“test-economy-epiasghbf-pro02a”) When comparing models on the MTEB leaderboard it is tempting to focus on average retrieval quality. This is a good proxy to the overall quality of the model, but it doesn't necessarily tell you how it will perform for you. Since results are reported per data set, it is worth understanding how closely the different data sets relate to your search task and rescore models using only the most relevant ones. If you want to dig deeper, you can additionally check for topic overlap with the various data set corpuses. Stratifying quality measures by topic gives a much finer-grained assessment of their specific strengths and weaknesses. One important note here is that when a document is not marked in the qrels file then by default it is considered irrelevant to the query. We dive a little further into this area and collect some evidence to shed more light on the following question: “How often is an evaluator presented with (query, document) pairs for which there is no ground truth information?\". The reason that this is important is that when only shallow markup is available (and thus not every relevant document is labeled as such) one Information Retrieval system can be judged worse than another just because it “chooses” to surface different relevant (but unmarked) documents. This is a common gotcha in creating high quality evaluation sets, particularly for large datasets. To be feasible manual labelling usually focuses on top results returned by the current system, so potentially misses relevant documents in its blind spots. Therefore, it is usually preferable to focus more resources on fuller mark up of fewer queries than broad shallow markup. Leveraging the BEIR benchmark for search relevance evaluation To initiate our analysis we implement the following scenario (see the notebook ): First, we load the corpus of each dataset into an Elasticsearch index. For each query in the test set we retrieve the top-100 documents with BM25. We rerank, the retrieved documents using a variety of SOTA reranking models. Finally, we report the “judge rate” for the top-10 documents coming from steps 2 (after retrieval) and 3 (after reranking). In other words, we calculate the average percentage of the top-10 documents that have a score in the qrels file. 
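As a rough sketch of that last step, the judge rate can be computed from a qrels mapping and the retrieved document ids per query, as below. The data structures and ids are made up for illustration; this is not the notebook's implementation.

```python
from typing import Dict, List

def judge_rate(qrels: Dict[str, Dict[str, int]],
               top_docs: Dict[str, List[str]],
               k: int = 10) -> float:
    """Average percentage of the top-k retrieved docs that have an entry in qrels."""
    per_query = []
    for query_id, doc_ids in top_docs.items():
        judged = qrels.get(query_id, {})
        top_k = doc_ids[:k]
        if top_k:
            per_query.append(100.0 * sum(d in judged for d in top_k) / len(top_k))
    return sum(per_query) / len(per_query)

# Toy example with made-up ids: one of the three retrieved docs is judged.
qrels = {"q1": {"d1": 1, "d7": 0}}
top_docs = {"q1": ["d1", "d2", "d3"]}
print(judge_rate(qrels, top_docs, k=3))  # 33.33...
```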
The list of reranking of models we used is the following: Cohere's rerank-english-v2.0 and rerank-english-v3.0 BGE-base mxbai-rerank-xsmall-v1 MiniLM-L-6-v2 Retrieval Reranking Dataset BM25 (%) Cohere Rerank v2 (%) Cohere Rerank v3 (%) BGE-base (%) mxbai-rerank-xsmall-v1 (%) MiniLM-L-6-v2 (%) Arguana 7.54 4.87 7.87 4.52 4.53 6.84 Climate-FEVER 5.75 6.24 8.15 9.36 7.79 7.58 DBPedia 61.18 60.78 64.15 63.9 63.5 67.62 FEVER 8.89 9.97 10.08 10.19 9.88 9.88 FiQa-2018 7.02 11.02 10.77 8.43 9.1 9.44 HotpotQA 12.59 14.5 14.76 15.1 14.02 14.42 Natural Questions 5.94 8.84 8.71 8.37 8.14 8.34 NFCorpus 31.67 32.9 33.91 30.63 32.77 32.45 Quora 12.2 10.46 13.04 11.26 12.58 12.78 SCIDOCS 8.62 9.41 9.71 8.04 8.79 8.52 Scifact 9.07 9.57 9.77 9.3 9.1 9.17 Touche2020 38.78 30.41 32.24 33.06 37.96 33.67 TREC-COVID 92.4 98.4 98.2 93.8 99.6 97.4 MSMARCO 3.97 6.00 6.03 6.07 5.47 6.11 CQADupstack (avg.) 5.47 6.32 6.87 5.89 6.22 6.16 Table 2 : Judge rate per (dataset, reranker) pairs calculated on the top-10 retrieved/reranked documents From Table 2 , with the exception of TREC-COVID (>90% coverage), DBPedia (~65%), Touche2020 and nfcorpus (~35%), we see that the majority of the datasets have a labeling rate between 5% and a little more than 10% after retrieval or reranking. This doesn’t mean that all these unmarked documents are relevant but there might be a subset of them -especially those placed in the top positions- that could be positive. With the arrival of general purpose instruction tuned language models, we have a new powerful tool which can potentially automate judging relevance. These methods are typically far too computationally expensive to be used online for search, but here we are concerned with offline evaluation. In the following we use them to explore the evidence that some of the BEIR datasets suffer from shallow markup. In order to further investigate this hypothesis we decided to focus on MSMARCO and select a subset of 100 queries along with the top-5 reranked (with Cohere v2) documents which are currently not marked as relevant. We followed two different paths of evaluation: First, we used a carefully tuned prompt (more on this in a later post) to prime the recently released Phi-3-mini-4k model to predict the relevance (or not) of a document to the query. In parallel, these cases were also manually labeled in order to also assess the agreement rate between the LLM output and human judgment. Overall, we can draw the following two conclusions † \\dag † : The agreement rate between the LLM responses and human judgments was close to 80% which seems good enough as a starting point in that direction. In 57.6% of the cases (based on human judgment) the returned documents were found to be actually relevant to the query. To state this in a different way: For 100 queries we have 107 documents judged to be relevant, but at least 0.576 x 5 x 100 = 288 extra documents which are actually relevant! Here, some examples drawn from the MSMARCO / dev dataset which contain the query, the annotated positive document (from qrels ) and a false negative document due to incomplete markup: Example 1: Example 2: Manually evaluating specific queries like this is a generally useful technique for understanding search quality that complements quantitive measures like nDCG@10. If you have a representative set of queries you always run when you make changes to search, it gives you important qualitative information about how performance changes, which is invisible in the statistics. 
For example, it gives you much more insight into the false results your search returns: it can help you spot obvious howlers in retrieved results, classes of related mistakes, such as misinterpreting domain-specific terminology, and so on. Our result is in agreement with relevant research around MSMARCO evaluation. For example, Arabzadeh et al. follow a similar procedure where they employ crowdsourced workers to make preference judgments: among other things, they show that in many cases the documents returned by the reranking modules are preferred compared to the documents in the MSMARCO qrels file. Another piece of evidence comes from the authors of the RocketQA reranker who report that more than 70% of the reranked documents were found relevant after manual inspection. † \\dag † Update - September 9th: After a careful re-evaluation of the dataset we identified 15 more cases of relevant documents, increasing their total number from 273 to 288 Main takeaways & next steps The pursuit for better ground truth is never-ending as it is very crucial for benchmarking and model comparison. LLMs can assist in some evaluation areas if used with caution and tuned with proper instructions More generally, given that benchmarks will never be perfect, it might be preferable to switch from a pure score comparison to more robust techniques capturing statistically significant differences. The work of Arabzadeh et al. provides a nice of example of this where based on their findings they build 95% confidence intervals indicating significant (or not) differences between the various runs. In the accompanying notebook we provide an implementation of confidence intervals using bootstrapping . From the end-user perspective it’s useful to think about task alignment when reading benchmark results. For example, for an AI engineer who builds a RAG pipeline and knows that the most typical use case involves assembling multiple pieces of information from different sources, then it would be more meaningful to assess the performance of their retrieval model on multi-hop QA datasets like HotpotQA instead of the global average across the whole BEIR benchmark In the next blog post we will dive deeper into the use of Phi-3 as LLM judge and the journey of tuning it to predict relevance. Report an issue Related content Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. 
TT By: Tommaso Teofili Jump to Understanding the BEIR benchmark in search relevance evaluation Structure of a BEIR dataset Leveraging the BEIR benchmark for search relevance evaluation Main takeaways & next steps Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Evaluating search relevance part 1 - The BEIR benchmark - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/evaluating-search-relevance-part-1","meta_description":"Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Improved inference performance with ELSER v2 Learn about the improvements we've made to the inference performance of ELSER v2, achieving a 60% to 120% speed increase over ELSER v1. ML Research TV QH VK By: Thomas Veasey , Quentin Herreros and Valeriy Khakhutskyy On October 17, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It is well known that modern transformer based approaches to information retrieval often come with significantly higher resource costs when compared with traditional statistical approaches, such as BM25. This can make it challenging to apply these techniques in production. At large scale, at least as much attention needs to be paid to the resource usage of any retrieval solution as to its relevance in order to produce something practically useful. In this final two part blog of our series, we discuss some of the work we did for retrieval and inference performance for the release of version 2 of our Elastic Learned Sparse EncodeR model (ELSER), which we introduced in this previous blog post. In 8.11 we are releasing two versions of the model: one portable version which will run on any hardware and one version which is optimized for the x86 family of architectures. We're still making the deployment process easy though, by defaulting to the most appropriate model for your cluster's hardware. In this first part we focus on inference performance. In the second part we discuss the ongoing work we're doing to improve retrieval performance. However, first we briefly review the relevance we achieve for BEIR with ELSER v2. Improved relevance for BEIR with ELSER v2 For this release we extended our training data, including around 390k high quality question and answer pairs to our fine tune dataset, and improved the FLOPS regularizer based on insights we discussed in the past . 
Together these changes gave us a bump in relevance measured with our usual set of BEIR benchmark datasets. We plan to follow up with a full description of our training data set composition and the innovations we have introduced, such as improvements to cross-encoder distillation and the FLOPS regularizer, at a later date. Since this blog post mainly focuses on performance considerations, we simply give the new NDCG@10 for the ELSER v2 model in the table below. NDCG@10 for BEIR data sets for ELSER v1 and v2 (higher is better). The v2 results use the query pruning method described below Quantization in ELSER v2 Model inference in the Elastic Stack is run on CPUs. There are two principal factors which affect the latency of transformer model inference: the memory bandwidth needed to load the model weights and the number of arithmetic operations it needs to perform. ELSER v2 was trained from a BERT base checkpoint. This has just over 100M parameters, which amounts to about 418 MB of storage for the weights using 32 bit floating point precision. For production workloads for our cloud deployments we run inference on Intel Cascade Lake processors. A typical midsize machine would have L1 data, L2 and L3 cache sizes of around 64 KiB, 2 MiB and 33 MiB, respectively. This is clearly much smaller than model weight storage (although the number of weights which are actually used for any given inference is a function of text length). So for a single inference call we get cache misses all the way up to RAM. Halving the weight memory means we halve the memory bandwidth we need to serve an inference call. Modern processors support wide registers which let one perform the same arithmetic operations in parallel on several pieces of data, so-called SIMD instructions. The number of parallel operations one can perform is a function of the size of each piece of data. For example, Intel processors allow one to perform 8 bit integer multiplication in 16 bit wide lanes. This means one gets roughly twice as many operations per cycle for int8 versus float32 multiplication, and this is the dominant compute cost in an inference call. It is therefore clear that if one were able to perform inference using int8 tensors, significant performance improvements would be available. The process of achieving this is called quantization. The basic idea is very simple: clip outliers, scale the resulting numbers into the range 0 to 255 and snap them to the nearest integer. Formally, a floating point number x is transformed using \\left\\lfloor\\frac{255}{u - l}(\\text{clamp}(x, l, u) - l)\\right\\rceil , where l and u are the lower and upper clipping bounds. One might imagine that the accuracy lost in this process would significantly reduce the model accuracy. In practice, large transformer model accuracy is fairly resilient to the errors this process introduces. There is quite a lot of prior art on model quantization. We do not plan to survey the topic in this blog and will focus instead on the approaches we actually used. For background and insights into quantization we recommend these two papers. For ELSER v2 we decided to use dynamic quantization of the linear layers. By default this uses per tensor symmetric quantization of activations and weights. Unpacking this, it rescales values to lie in an interval that is symmetric around zero - which makes the conversion slightly more compute efficient - before snapping. Furthermore, it uses one such interval for each tensor.
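As a toy illustration of a per-tensor symmetric scheme like this (not the actual libtorch implementation), the snippet below snaps a float32 tensor to 8-bit integers using a single scale derived from its largest absolute value:

```python
import numpy as np

def quantize_per_tensor_symmetric(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Snap a float32 tensor to int8 using one symmetric interval for the whole tensor."""
    a = float(np.max(np.abs(x))) or 1.0  # symmetric interval [-a, a]
    scale = a / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_per_tensor_symmetric(weights)
print(np.abs(weights - dequantize(q, scale)).max())  # worst-case rounding error
```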
With dynamic quantization the interval for each activation tensor is computed on-the-fly from their maximum absolute value . Since we want our model to perform well in a zero-shot setting, this has the advantage that we don't suffer from any mismatch in the data used to calibrate the model quantization and the corpus where it is used for retrieval. The maximum absolute weight for each tensor is known in advance, so these can be quantized upfront and stored in int8 format. Furthermore, we note that attention is itself built out of linear layers. Therefore, if the matrix multiplications in linear layers are quantized the majority of the arithmetic operations in the model are performed in int8. Our first attempt at applying dynamic quantization to every linear layer failed: it resulted in up to 20% loss in NDCG@10 for some of our BEIR benchmark data sets. In such cases, it is always worthwhile investigating hybrid quantization schemes. Specifically, one often finds that certain layers introduce disproportionately large errors when converted to int8. Typically, in such cases one performs layer by layer sensitivity analysis and greedily selects the layers to quantize while the model meets accuracy requirements. There are many configurable parameters for quantization which relate to exact details of how intervals are constructed and how they are scoped. We found it was sufficient to choose between three approaches for each linear layer for ELSER v2: Symmetric per tensor quantization, Symmetric per channel quantization and Float32 precision. There are a variety of tools which can allow one to observe tensor characteristics which are likely to create problems for quantization. However, ultimately what one always cares about is the model accuracy on the task it performs. In our case, we wanted to know how well the quantized model preserves the text representation we use for retrieval, specifically, the document scores. To this end, we quantized each layer in isolation and calculated the score MAPE of a diverse collection of query relevant document pairs. Since this had to be done on CPU and separately for every linear layer we limited this set to a few hundred examples. The figure below shows the performance and error characteristics for each layer; each point shows the percentage speed up in inference (x-axis) and the score MAPE (y-axis) as a result of quantizing just one layer. We run two experiments per layer: per tensor and per channel quantization. Relevance scores MAPE for layerwise quantization of ELSER v2 Note that the performance gain is not equal for all layers. The feed forward layers that separate attention blocks use larger intermediate representations so we typically gain more by quantizing their weights. The MLM head computes vocabulary token activations. Its output dimension is the vocabulary size or 30522. This is the outlier on the performance axis; quantizing this layer alone increases throughput by nearly 13%. Regarding accuracy, we see that quantizing the output of the 10 th feed forward module in the attention stack has a dramatic impact and many layers have almost no impact on the scores (< 0.5% MAPE). Interestingly, we also found that the MAPE is larger when quantizing higher feed forward layers. This is consistent with the fact that dropping feed forward layers altogether at the bottom of the attention stack has recently been found to be an effective performance accuracy trade off for BERT. 
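A stripped-down version of such a layer-by-layer sweep might look like the following, using PyTorch's dynamic quantization API. The model handle, the layer names, and the score() helper are hypothetical, and the real analysis also covered per-channel configurations:

```python
import numpy as np
import torch

def score_mape(model_fp32, model_int8, pairs, score) -> float:
    """Mean absolute percentage error of relevance scores after quantization."""
    s_fp32 = np.array([score(model_fp32, q, d) for q, d in pairs])
    s_int8 = np.array([score(model_int8, q, d) for q, d in pairs])
    return float(np.mean(np.abs((s_int8 - s_fp32) / s_fp32)) * 100.0)

def layerwise_sensitivity(model_fp32, linear_layer_names, pairs, score):
    """Quantize one linear layer at a time and record the resulting score MAPE."""
    results = {}
    for name in linear_layer_names:
        model_int8 = torch.quantization.quantize_dynamic(
            model_fp32, {name}, dtype=torch.qint8  # quantize only this submodule
        )
        results[name] = score_mape(model_fp32, model_int8, pairs, score)
    return results
```

Layers with a large MAPE are the ones to leave in float32 or switch to per-channel quantization.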
In the end, we chose to disable quantization for around 20% of layers and use per channel quantization for around 15% of layers. This gave us a 0.1% reduction in average NDCG@10 across the BEIR suite and a 2.5% reduction in the worst case. So what does this yield in terms of performance improvements in the end? Firstly, the model size shrank by a little less than 40%, from 418 MB to 263MB. Secondly, inference sped up by between 40% and 100% depending on the text length. The figure below shows the inference latency on the left axis for the float32 and hybrid int8 model as a function of the input text length. This was calculated from 1000 different texts ranging for around 200 to 2200 characters (which typically translates to around the maximum sequence length of 512 tokens). For the short texts in this set we achieve a latency of around 50 ms or 20 inferences per second single threaded for an Intel Xeon CPU @ 2.80GH. Referring to the right axis, the speed-up for these short texts is a little over 100%. This is important because 200 characters is a long query so we expect similar improvements in query latency. We achieved a little under 50% throughput improvement for the data set as a whole. Speed up per thread from hybrid int8 dynamic quantisation of ELSER v2 using an Intel Xeon CPU Block layout of linear layers in ELSER v2 Another avenue we explored was using the Intel Extension for PyTorch (IPEX) . Currently, we recommend our users run Elasticsearch inference nodes on Intel hardware and it makes sense to optimize the models we deploy to make best use of it. As part of this project we rebuilt our inference process to use the IPEX backend. A nice side effect of this was that ELSER inference with float32 is 18% faster in 8.11 and we see increased throughput advantage from hyperthreading. However, the primary motivation was the latest Intel cores have hardware support for bfloat16 format, which makes better performance accuracy tradeoffs for inference than float32. We wanted to understand how this performs. We saw around 3 times speedup using bfloat16, but only with the latest hardware support; so until this is well enough supported in the cloud environment the use of bfloat16 models is impractical. We instead turned our attention to other features of IPEX. The IPEX library provides several optimizations which can be applied to float32 layers. This is handy because, as discussed, we retain around 20% of the model in float32 precision. Transformers don't afford simple layer folding opportunities, so the principal optimization is blocking of linear layers. Multi-dimensional arrays are usually stored flat to optimize cache use. Furthermore, to get the most out of SIMD instructions one ideally loads memory from contiguous blocks into the wide registers which implement them. The operations performed on the model weights in inference alter their access patterns. For any given compute graph one can in theory work out the weight layout which maximizes performance. The optimal arrangement also depends on the instruction set available and the memory bandwidth; usually this amounts to reordering weights into blocks for specific tensor dimensions. Fortunately, the IPEX library has implemented the optimal strategy for Intel hardware for a variety of layers, including linear layers. The figure below shows the effect of applying optimal block layout for float32 linear layers in ELSER v2. The performance was averaged over 5 runs. 
The effect is small however we verified it is statistically significant (p-value < 0.05). Also, it is consistently slightly larger for longer sequences, so for our representative collection of 1000 texts it translated to a little under 1% increase in throughput. Speed up per thread from IPEX optimize on ELSER v2 using an Intel Xeon CPU Another interesting observation we made is that the performance improvements are larger when using intra-op parallelism . We consistently achieved 2-5% throughput improvement across a range of text lengths using both our VM's allotted physical cores. In the end, we decided not to enable these optimisations. The performance gains we get from them are small and they significantly increase the model memory: our script file increased from 263MB to 505MB. However, IPEX and particularly hardware support for bfloat16 yield significant improvements for inference performance on CPU. This work got us a step closer to enabling this for Elasticsearch inference in the future. Conclusion In this post, we discussed how we were able to achieve between a 60% and 120% speed-up in inference compared to ELSER v1 by upgrading the libtorch backend in 8.11 and optimizing for x86 architecture. This is all while improving zero-shot relevance. Inference performance is the critical factor in the time to index a corpus. It is also an important part of query latency. At the same time, the index performance is equally important for query latency, particularly at large scale. We discuss this in part 2 . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid Retrieval Part 5: Improved inference performance with ELSER v2 Part 6: Optimizing retrieval with ELSER v2 Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! 
- Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Improved relevance for BEIR with ELSER v2 Quantization in ELSER v2 Block layout of linear layers in ELSER v2 Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Improving information retrieval in the Elastic Stack: Improved inference performance with ELSER v2 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-1","meta_description":"Learn about the improvements we've made to the inference performance of ELSER v2, achieving a 60% to 120% speed increase over ELSER v1."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Quickly create RAG apps with Vertex AI Gemini models and Elasticsearch playground Quickly create a RAG app with Vertex AI Gemini models and Elasticsearch playground Generative AI Integrations How To JV By: Jeff Vestal On September 27, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this blog, we will connect Elasticsearch to Google’s Gemini 1.5 chat model using Elastic’s Playground and Vertex AI API. The addition of Gemini models to Playground enables Google Cloud developers to quickly ground LLMs, test retrieval, tune chunking, and ship gen AI search apps to prod with Elastic. You will need an Elasticsearch cluster up and running. We will use a Serverless Project on Elastic Cloud. If you don’t have an account, you can sign up for a free trial . You will also need a Google Cloud account with Vertex AI Enabled. If you don’t have a Google Cloud account, you can sign up for a free trial . Steps to create RAG apps with Vertex AI Gemini models & Playground 1. Configuring Vertex AI First, we will configure a Vertex AI service account, which will allow us to make API calls securely from Elasticsearch to the Gemini model. You can follow the detailed instructions on Google Cloud’s doc page here , but we will cover the main points. Go to the Create Service Account section of the Google Cloud console. There, select the project which has Vertex AI enabled. Next, give your service account a name and optionally, a description. Click “Create and Continue”. Set the access controls for your project. 
For this blog, we used the “Vertex AI User” role, but you need to ensure your access controls are appropriate for your project and account. Click Done. The final setup in Google Cloud is to create an API key for the service account and download it in JSON format. Click “KEYS” in your service account then “ADD KEY” and “Create New”. Ensure you select “json” as the key type then click “CREATE”. The key will be created and automatically downloaded to your computer. We will need this key in the next section. 2. Connect to your LLM from Playground With Google Cloud configured, we can continue configuring the Gemini LLM connection in Elastic’s Playground. This blog assumes you already have data in Elasticsearch you want to use with Playground. If not, follow the Search Labs Blog Playground: Experiment with RAG applications with Elasticsearch in minutes to get started. In Kibana, Select Playground from the side navigation menu. In Serverless, this is under the “Build” heading. When that opens for the first time, you can select “Connect to an LLM”. Select “Google Gemini”: Fill out the form to complete the configuration. Open the JSON credentials file created and downloaded from the previous section, copy the complete JSON, and paste it into the “Credentials JSON” section. Then click “Save” 3. It’s Playground Time! Elastic’s Playground allows you to experiment with RAG context settings and system prompts before integrating into full code. By changing settings while chatting with the model, you can see which settings will provide the optimal responses for your application. Additionally, configure which fields in your Elasticsearch data are searched to add context to your chat completion request. Adding context will help ground the model and provide more accurate responses. This step uses Elastic’s ELSER sparse embeddings model , available built-in, for retrieving context via semantic search, that is passed on to the Gemini model. That’s it (for now) Conversational search is an exciting area where powerful large language models, such as those offered by Google Vertex AI are being used by developers to build new experiences. Playground simplifies the the process of prototyping and tuning, enabling you to ship your apps more quickly. Explore more ideas to build with Elasticsearch and Google Vertex AI, and happy searching! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps to create RAG apps with Vertex AI Gemini models & Playground 1. Configuring Vertex AI 2. Connect to your LLM from Playground 3. It’s Playground Time! That’s it (for now) Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Quickly create RAG apps with Vertex AI Gemini models and Elasticsearch playground - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vertex-ai-elasticsearch-playground-fast-rag-apps","meta_description":"Learn how to quickly build a RAG app with Vertex AI Gemini models and Elasticsearch playground."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing kNN Query: An expert way to do kNN search Explore how the kNN query in Elasticsearch can be used and how it differs from top-level kNN search, including examples. Vector Database How To MS BT By: Mayya Sharipova and Benjamin Trent On December 7, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. kNN search as a top-level section kNN search in Elasticsearch is organized as a top level section of a search request. We have designed it this way so that: It can always return global k nearest neighbors regardless of a number of shards These global k results are combined with a results from other queries to form a hybrid search The global k results are passed to aggregations to form facets. Here is a simplified diagram how kNN search is executed internally (some phases are omitted) : Figure 1: The steps for the top level kNN search are: A user submits a search request The coordinator node sends a kNN search part of the request to data nodes in the DFS phase Each data node runs kNN search and sends back the local top-k results to the coordinator The coordinator merges all local results to form the global top k nearest neighbors. The coordinator sends back the global k nearest neighbors to the data nodes with any additional queries provided Each data node runs additional queries and sends back the local size results to the coordinator The coordinator merges all local results and sends a response to the user We first run kNN search in the DFS phase to obtain the global top k results. These global k results are then passed to other parts of the search request, such as other queries or aggregations. 
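As a minimal client-side example of such a request (the index, field name, and query vector are placeholders, not the mapping used in the examples later in this post):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # placeholder connection details

# Top-level kNN section: returns the global k nearest neighbours,
# regardless of how many shards the index has.
resp = es.search(
    index="my-index",  # hypothetical index with a dense_vector field
    knn={
        "field": "embedding",
        "query_vector": [0.12, -0.45, 0.91],
        "k": 10,
        "num_candidates": 100,
    },
    size=10,
)
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_score"])
```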
Even the execution looks complex, from a user’s perspective this model of running kNN search is simple, as the user can always be sure that kNN search returns the global k results. Introducing kNN query in Elasticsearch With time we realized there is also a need to represent kNN search as a query. Query is a core component of a search request in Elasticsearch, and representing kNN search as a query allows for flexibility to combine it with other queries to address more complex requests. kNN query, unlike the top level kNN search, doesn’t have a k parameter. The number of results (nearest neighbors) returned is defined by the size parameter, as in other queries. Similar to kNN search, the num_candidates parameter defines how many candidates to consider on each shard while executing a kNN search. kNN query is executed differently from the top level kNN search. Here is a simplified diagram that describes how a kNN query is executed internally (some phases are omitted): Figure 2: The steps for query based kNN search are: A user submits a search request The coordinator sends to the data nodes a kNN search query with additional queries provided Each data node runs the query and sends back the local size results to the coordinator node The coordinator node merges all local results and sends a response to the user We run kNN search on a shard to get num_candidates results; these results are passed to other queries and aggregations on a shard to get size results from the shard. As we don’t collect the global k nearest neighbors first, in this model the number of nearest neighbors collected and visible for other queries and aggregations depend on the number of shards. kNN query API examples Let’s look at API examples that demonstrate differences between the top level kNN search and kNN query. We create an index of products and index some documents: kNN query similar to the top level kNN search, has num_candidates and an internal filter parameter that acts as a pre-filter. kNN query can get more diverse results than kNN search for collapsing and aggregations. For the kNN query below, on each shard we execute kNN search to obtain 10 nearest neighbors which are then passed to collapse to get 3 top results. Thus, we will get 3 diverse hits in a response. The top level kNN search first gets the global top 3 results in the DFS phase, and then passes them to collapse in the query phase. We will get only 1 hit in a response, as all the global 3 nearest neighbors happened to be from the same brand. Similarly for aggregations, a kNN query allows us to get 3 distinct buckets, while kNN search only allows 1. Now, let’s look at other examples that show the flexibility of the kNN query. Specifically, how it can be flexibly combined with other queries. kNN can be a part of a boolean query (with a caveat that all external query filters are applied as post-filters for kNN search). We can use a _name parameter for kNN query to enhance results with extra information that tells if the kNN query was a match and its score contribution. kNN can also be a part of complex queries, such as a pinned query. This is useful when we want to display the top nearest results, but also want to promote a selected number of other results. We can even make the kNN query a part of our function_score query. 
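A minimal sketch of that combination is shown below, reusing the client from the earlier example; the index, field names, and weighting are hypothetical rather than the exact request from this post.

```python
resp = es.search(
    index="my-index",  # hypothetical index with a dense_vector "embedding" field
    query={
        "function_score": {
            "query": {
                "knn": {  # kNN as a query clause; size controls how many hits come back
                    "field": "embedding",
                    "query_vector": [0.12, -0.45, 0.91],
                    "num_candidates": 50,
                }
            },
            "functions": [
                {
                    # Boost documents from a preferred brand on top of the vector score.
                    "filter": {"term": {"brand": "acme"}},  # hypothetical field and value
                    "weight": 2.0,
                }
            ],
            "boost_mode": "multiply",
        }
    },
    size=5,
)
```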
This is useful when we need to define custom scores for results returned by kNN query: ​ kNN query being a part of dis_max query is useful when we want to combine results from kNN search and other queries, so that a document’s score comes from the highest ranked clause with a tie breaking increment for any additional clause. ​ kNN search as a query has been introduced with the 8.12 release. Please try it out, and we would appreciate any feedback. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to kNN search as a top-level section Introducing kNN query in Elasticsearch kNN query API examples Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing kNN Query: An expert way to do kNN search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/knn-query-elasticsearch","meta_description":"Explore how the kNN query in Elasticsearch can be used and how it differs from top-level kNN search, including examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus Use Elasticsearch Query Language (ES|QL) to run statistical analysis on demographic data index in Elasticsearch. ES|QL Python BA By: Baha Azarmi On August 20, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch Query Language (ES|QL) is designed for fast, efficient querying of large datasets. 
It has a straightforward syntax which will allow you to write complex queries easily, with a pipe based language, reducing the learning curve. We're going to use ES|QL to run statistical analysis and compare different odds. If you are reading this, you probably want to know how rich you can get before actually reaching the same odds of being hit by a bus. I can't blame you, I want to know too. Let's work out the odds so that we can make sure we win the lottery rather than get in an accident! What we are going to see in this blog is figuring out the probability of being hit by a bus and the probability of achieving wealth. We'll then compare both and understand until what point your chances of getting rich are higher, and when you should consider getting life insurance. So how are we going to do that? This is going to be a mix of magic numbers pulled from different articles online, some synthetics data and the power of ES|QL, the new Elasticsearch Query Language. Let's get started. Data for the ES|QL analysis The magic number The challenge starts here as the dataset is going to be somewhat challenging to find. We are then going to assume for the sake of the example that ChatGPT is always right. Let’s see what we get for the following question: Cough Cough… That sounds about right, this is going to be our magic number. Generating the wealth data Prerequisites Before running any of the scripts below, make sure to install the following packages: Now, there is one more thing we need, a representative dataset with wealth distribution to compute wealth probability. There is definitely some portion of it here and there, but again, for the example we are going to generate a 500K line dataset with the below python script. I am using python 3.11.5 in this example: It should take some time to run depending on your configuration since we are injecting 500K documents here! FYI, after playing with a couple of versions of the script above and the ESQL query on the synthetic data, it was obvious that the net worth generated across the population was not really representative of the real world. So I decided to use a log-normal distribution (np.random.lognormal) for income to reflect a more realistic spread where most people have lower incomes, and fewer people have very high incomes. Net Worth Calculation: Used a combination of random multipliers (np.random.uniform(0.5, 5)) and additional noise (np.random.normal(0, 10000)) to calculate net worth. Added a check to ensure no negative net worth values by using np.maximum(0, net_worths). Not only have we generated 500K documents, but we also used the Elasticsearch python client to bulk ingest all these documents in our deployment. Please note that you will find the endpoint to pass in as hosts Cloud ID in the code above. For the deployment API key, open Kibana, and generate the key in Stack Management / API Keys: The good news is that if you have a real data set, all you will need to do is to change the above code to read your dataset and write documents with the same data mapping. Ok we're getting there! The next step is pouring our wealth distribution. ES|QL wealth analysis Introducing ES|QL: A powerful tool for data analysis The arrival of Elasticsearch Query Language (ES|QL) is very exciting news for our users. It largely simplifies querying, analyzing, and visualizing data stored in Elasticsearch, making it a powerful tool for all data-driven use cases. 
ES|QL comes with a variety of functions and operators, to perform aggregations, statistical analyses, and data transformations. We won’t address them all in this blog post, however our documentation is very detailed and will help you familiarize with the language and the possibilities. To get started with ES|QL today and run the blog post queries, simply start a trial on Elastic Cloud , load the data and run your first ES|QL query. Understanding the wealth distribution with our first query To get familiar with the dataset, head to Discover in Kibana and switch to ES|QL in the dropdown on the left hand side: Let’s fire our first request: As you could expect from our indexing script earlier, we are finding the documents we bulk ingested, notice the simplicity of pulling data from a given dataset with ES|QL where every query starts with the From clause, then your index. In the query above given we have 500K lines, we limited the amount of returned documents to 10. To do this, we are passing the output of the first segment of the query via a pipe to the limit command to only get 10 results. Pretty intuitive, right? Alright, what would be more interesting is to understand the wealth distribution in our dataset, for this we will leverage one of the 30 functions ES|QL provides, namely percentile. This will allow us to understand the relative position of each data point within the distribution of net worth. By calculating the median percentile (50th percentile), we can gauge where an individual’s net worth stands compared to others. Like our first query, we are passing the output of our index to another function, Stats, which combined with the percentile function will output the median net worth: The median is about 54K, which unfortunately is probably optimistic compared to the real world, but we are not going to solve this here. If we go a little further, we can look at the distribution in more granularity by computing more percentiles: With the below output: The data reveals a significant disparity in wealth distribution, with the majority of wealth being concentrated among the richest individuals. Specifically, the top 5% (95th percentile) possess a disproportionately large portion of the total wealth, with a net worth starting at $852,988.26 and increasing dramatically in the higher percentiles. The 99th percentile individuals hold a net worth exceeding $2 million, highlighting the skewed nature of wealth distribution. This indicates that a substantial portion of the population has modest net worth, which is probably what we want for this example. Another way to look at this is to augment the previous query and grouping by age to see if there is, (in our synthetic dataset), a relation between wealth and age: This could be visualized in a Kibana dashboard. Simply: Navigate to Dashboard Add a new ES|QL visualization Copy and paste our query Move the age field to the horizontal axis in the visualization configuration Which will output: The above suggests that the data generator randomized wealth uniformly across the population age, there is no specific trend pattern we can really see. Median Absolute Deviation (MAD) We calculate the median absolute deviation (MAD) to measure the variability of net worth in a robust manner, less influenced by outliers. 
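A query along these lines can be run straight from the Python client (ES|QL support is available in the client from 8.12 onwards). The index and field names below are assumptions matching the synthetic data set described earlier, not the notebook's exact code.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # placeholder connection details

# Median (50th percentile) and median absolute deviation of net worth.
query = """
FROM wealth_distribution
| STATS median_net_worth = MEDIAN(net_worth),
        mad_net_worth = MEDIAN_ABSOLUTE_DEVIATION(net_worth)
"""
resp = es.esql.query(query=query)
print(resp["columns"])
print(resp["values"])
```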
With a median net worth of $53,787.22 and a MAD of $44,205.44, we can infer the typical range of net worth: most individuals' net worth falls within a range of $44,205.44 above and below the median. This gives a typical range of approximately $9,581.78 to $97,992.66. The statistical showdown between Net Worth and Bus Collision Alright, this is the moment to understand how rich we can get, based on our dataset, before getting hit by a bus. To do that, we are going to leverage ES|QL to pull our entire dataset in chunks and load it into a pandas dataframe to build a net worth probability distribution. Finally, we will determine where the ends meet between the net worth and bus collision probabilities. The entire Python notebook is available here . I also recommend you read this blog post which walks you through using ES|QL with pandas dataframes. Helper functions As you can see in the previously mentioned blog post, support for ES|QL was introduced in version 8.12 of the Elasticsearch Python client. Thus, our notebook first defines the functions below: The first function is straightforward and executes an ES|QL query; the second fetches the entire dataset from our index. Notice the trick in there: I am using a counter built into a field in my index to paginate through the data. This is a workaround I am using while our engineering team is working on the support for pagination in ES|QL . Next, knowing that we have 500K documents in our index, we simply call these functions to load the data into a dataframe: Fit Pareto distribution Next, we fit our data to a Pareto distribution, which is often used to model wealth distribution because it reflects the reality that a small percentage of the population controls most of the wealth. By fitting our data to this distribution, we can more accurately represent the probabilities of different net worth levels. We can visualize the Pareto distribution with the code below: Breaking point Finally, with the calculated probability, we determine the target net worth corresponding to the bus hit probability and visualize it (a rough sketch of this computation is included a little further below). Remember, we use the magic number ChatGPT gave us for the probability of getting hit by a bus: Conclusion Based on our synthetic dataset, this chart vividly illustrates that the probability of amassing a net worth of approximately $12.5 million is as rare as the chance of being hit by a bus. For the fun of it, let’s ask ChatGPT what the probability is: Okay… $439 million? I think ChatGPT might be hallucinating again. Report an issue Related content Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo ES|QL Developer Experience April 15, 2025 ES|QL Joins Are Here! Yes, Joins! Elasticsearch 8.18 includes ES|QL’s LOOKUP JOIN command, our first SQL-style JOIN. TP By: Tyler Perkins ES|QL Inside Elastic April 15, 2025 Native joins available in Elasticsearch 8.18 Exploring LOOKUP JOIN, a new ES|QL command available in tech preview in Elasticsearch 8.18. 
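Circling back to the Pareto fit and breaking-point computation described in the article above, here is a minimal, self-contained sketch of the idea using SciPy. The net worth sample is simulated here for illustration (in the article it comes from the dataframe loaded out of Elasticsearch), and the bus-hit probability is a placeholder for the ChatGPT-derived magic number, not its actual value.

```python
# Hypothetical sketch: fit a Pareto distribution to net worth data and find the
# net worth whose exceedance probability matches the bus-hit odds.
import numpy as np
from scipy import stats

# Simulated sample for a self-contained example; in the article this would be the
# "net_worth" column of the dataframe fetched from Elasticsearch in chunks.
rng = np.random.default_rng(0)
net_worth = rng.lognormal(mean=10.5, sigma=1.0, size=500_000)
net_worth = net_worth[net_worth > 0]  # Pareto support is strictly positive

# Fit a Pareto distribution (shape b, location fixed at 0, scale estimated).
b, loc, scale = stats.pareto.fit(net_worth, floc=0)

# Placeholder for the probability of being hit by a bus (the article's magic number).
p_bus = 1e-6  # illustrative value only

# Net worth x such that P[X >= x] equals the bus-hit probability.
target_net_worth = stats.pareto.isf(p_bus, b, loc=loc, scale=scale)
print(f"Reaching ${target_net_worth:,.0f} is roughly as likely as being hit by a bus")
```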
CL By: Costin Leau ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Jump to Data for the ES|QL analysis The magic number Generating the wealth data Prerequisites ES|QL wealth analysis Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"An Elasticsearch Query Language (ES|QL) analysis: Millionaire odds vs. hit by a bus - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-query-language-esql-statistical-analysis","meta_description":"Learn how to use Elasticsearch Query Language (ES|QL) for statistical analysis through a practical example."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog From ES|QL to native Pandas dataframes in Python Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples. ES|QL Python How To QP By: Quentin Pradet On September 5, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Since Elasticsearch 8.15 or with Elasticsearch Serverless, ES|QL responses support the Apache Arrow streaming format . This blog post will show you how to take advantage of it in Python. In an earlier blog post , I demonstrated how to convert ES|QL queries to Pandas dataframes using CSV as an intermediate representation. Unfortunately, CSV requires explicit type declarations, is slow (especially for larger datasets) and does not handle nested arrays and objects. Apache Arrow lifts all these limitations. ES|QL to Pandas dataframes in Python Importing test data First, let's import some test data. As before, we will be using the employees sample data and mappings . The easiest way to load this dataset is to run these two Elasticsearch API requests in the Kibana Console . Converting dataset to a Pandas DataFrame object OK, with that out of the way, let's convert the full employees dataset to a Pandas DataFrame object using the ES|QL Arrow export: Even though this dataset only contains 100 records, we use a LIMIT command to avoid ES|QL warning us about potentially missing records. This prints the following dataframe: OK, so what actually happened here? 
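A minimal sketch of the flow discussed in this article might look like the following. It assumes, as described just below, that the Python client hands back a PyArrow table when format="arrow" is requested; the connection details are placeholders and the exact field list is an illustration based on the employees sample data.

```python
# Hypothetical sketch: ES|QL results returned as Arrow and converted to a Pandas dataframe.
import pandas as pd
from elasticsearch import Elasticsearch

es = Elasticsearch(hosts="https://<your-deployment-endpoint>", api_key="<your-api-key>")

table = es.esql.query(
    query="""
    FROM employees
    | KEEP first_name, last_name, emp_no, languages
    | LIMIT 500
    """,
    format="arrow",  # ask Elasticsearch for the Arrow streaming format
)

# Use the PyArrow backend in Pandas instead of the default NumPy backend.
df = table.to_pandas(types_mapper=pd.ArrowDtype)
print(df.head())
```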
Given format=\"arrow\" , Elasticsearch returns binary Arrow streaming data The Elasticsearch Python client looks at the Content-Type header and creates a PyArrow object Finally, PyArrow's Pandas integration converts the PyArrow object to a Pandas dataframe. Note that the types_mapper=pd.ArrowDtype parameter asks Pandas to use a PyArrow backend instead of a NumPy backend, since the source data is PyArrow. While this backend is not enabled by default for compatibility reasons, it has many advantages : it handles missing values, is faster, more interopable and supports more types. (This is not a zero copy conversion , however.) For this example to work, the Pandas and PyArrow optional dependencies need to be installed. If you want to use another dataframe library such as Polars instead, you don't need Pandas and can directly use polars.from_arrow to create a Polars DataFrame from the PyArrow table returned by the Elasticsearch client. One limitation is that Elasticsearch does not currently handle multi-valued fields, which is why we had to drop the is_rehired , job_positions and salary_change columns. This limitation will be lifted in a future version of Elasticsearch. Anyway, you now have a Pandas dataframe that you can use to analyze your data further. But you can also continue massaging the data using ES|QL, which is particularly useful when queries return more than 10,000 rows, the current maximum number of rows that ES|QL queries can return. More complex queries In the next example, we're counting how many employees are speaking a given language by using STATS ... BY (not unlike GROUP BY in SQL). And then we sort the result with the languages column using SORT : Unlike with CSV, we did not have to specify any types, as Arrow data already includes types. Here's the result: 21 employees speak 5 languages, wow! And 10 employees did not declare any spoken language. The missing value is denoted by , which is consistently used for missing data with the PyArrow backend. If we had used the NumPy backend instead, this column would have been converted to floats and the missing value would have been a confusing NaN , as NumPy integers don't have any sentinel value for missing data . Queries with parameters Finally, suppose that you want to expand the query from the previous section to only consider employees that speak N or more languages, with N being a variable parameter. For this we can use ES|QL's built-in support for parameters , which eliminates the risk of an injection attack associated with manually assembling queries with variable parts: which prints the following: Conclusion As we saw, ES|QL's native Arrow support makes working with Pandas and other DataFrame libraries even nicer than using CSV and it will continue to improve over time, with the multi-value support coming in a future version of Elasticsearch. Additional resources If you want to learn more about ES|QL, the ES|QL documentation is the best place to start. You can also check out this other Python example using Boston Celtics data . To know more about the Python Elasticsearch client itself, you can refer to the documentation , ask a question on Discuss with the language-clients tag or open a new issue if you found a bug or have a feature request. Thank you! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. 
This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL to Pandas dataframes in Python Importing test data Converting dataset to a Pandas DataFrame object More complex queries Queries with parameters Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"From ES|QL to native Pandas dataframes in Python - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-pandas-native-dataframes-python","meta_description":"Learn how to export ES|QL queries as native Pandas dataframes in Python through practical examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch piped query language, ES|QL, now generally available Elasticsearch Query Language (ES|QL) is now GA. Explore ES|QL's capabilities, learn about ES|QL in Kibana and discover future advancements. ES|QL How To CL GK By: Costin Leau and George Kobar On June 5, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Today, we are pleased to announce the general availability of ES|QL (Elasticsearch Query Language), a dynamic language designed from the ground up to transform, enrich, and simplify data investigations. Powered by a new query engine , ES|QL delivers advanced search using simple and familiar query syntax with concurrent processing, enhancing speed and efficiency regardless of the data source and structure. With ES|QL's piped syntax, users can easily chain multiple operations, simplifying complex data investigations and making querying more intuitive and iterative. 
To security and observability users, ES|QL will feel both familiar and innovative for exposing Elasticsearch's advanced search capabilities with an easy-to-use query language. Integrated with Kibana, ES|QL enhances the data visualization and analysis experience enabling users to conduct their entire investigation on one screen, without switching between multiple windows. With continuous development, we aim to establish ES|QL as a versatile language for all Elasticsearch use cases, including retrieval augmented generation (RAG). Integrating RAG with geospatial capabilities and ES|QL will enhance query accuracy from diverse data sources. The combination of ES|QL and the new Search AI Lake architecture provides enhanced scalability, cost efficiency, and simplified management by automatically adjusting resources based on demand. Decoupling compute from storage and index from search improves performance and flexibility, ensuring faster data retrieval and investigations across vast amounts of data. ES|QL will be a differentiator for teams facing increasing observability and security demands. This article will dive into the various benefits and ways you can use ES|QL for your own use cases. Advancements in Elasticsearch For over 14 years, QueryDSL has served as the foundational language in Elasticsearch, delivering search , observability , and security to numerous organizations . As user needs evolved, it became clear that they required more than what QueryDSL alone could provide. They sought a query language that could not only simplify and streamline data investigations but also enhance the querying experience by integrating searching, enrichment, aggregation, and visualization into a singular, efficient interface. They desired advanced search capabilities, including lookups with concurrent processing to handle vast data volumes from varied sources and structures. In response, we developed the Elasticsearch Query Language (ES|QL), drawing inspiration from vectorized query execution and other database technologies. With ES|QL, users can utilize a familiar pipe ('|') syntax to chain operations, allowing for transformative and detailed data analysis. Powered by a robust query engine, ES|QL offers advanced search capabilities with concurrent processing across cores and nodes, enabling users to query across diverse data sources and structures seamlessly. There is no translation or transpilation to Query DSL; each ES|QL query is parsed, analyzed, and validated for semantics and optimized into an execution plan executed in parallel on the relevant nodes holding the data. The target nodes handle the query, making on-the-fly adjustments to the execution plan using the framework provided by ES|QL. The result is lightning-fast queries that you get out of the box. The road to GA Since its introduction in 8.11 , ES|QL has been on a journey of refinement and enhancement. The beta phase allowed our engineering team to gather valuable feedback from the community, enabling us to iterate and address the top needs of our users. Throughout this process, we enhanced ES|QL's capabilities while ensuring stability, performance, and seamless integration into core data exploration and visualization UX and workflows you use daily. Here are some features that brought ES|QL to general availability. Stability and performance We have been busy enhancing the dedicated ES|QL query engine to ensure it maintains robust performance under load, safeguarding the stability of the running node. 
To wit, see below the improvements in grouping in the last 6 months (for more tests and exact details about the underlying change see the dedicated benchmark page). Additionally, we've implemented memory tracking for precise resource management and conducted thorough stress tests, including the rigorous HeapAttack , to ensure that memory usage is carefully monitored during resource-intensive queries. Our circuit breakers are also in place to prevent OutOfMemoryErrors (OOMEs) on nodes with both large and small heap sizes. Visualize data in Kibana Discover in a whole new way with ES|QL ES|QL together with Elastic AI assistant We are excited about bringing generative AI and ES|QL together by first integrating them into the Observability and Security AI assistant, allowing users to input natural language translated into ES|QL commands for an easy, iterative, and smooth workflow, delivering significant improvements in query generation and performance. Users can now use natural language to visualize ES|QL queries, edit them using the inline editing flyout, and seamlessly embed them into dashboards. This enhancement shortens the workflow by allowing in-line visualization editing when creating charts, making it easier for users to manage and save their visualizations directly within the assistant. Create and edit ES|QL charts directly from the Kibana dashboard Streamline your workflow and deliver quick insights into your data by creating and modifying charts built with ES|QL directly from within the Kibana Dashboard . You can also perform inline editing of the ES|QL query while in the chart to adapt to changes in troubleshooting or threat hunting quickly. ES|QL query history It can be frustrating to repeat yourself and equally annoying if you need to rerun a query you executed a few moments ago. Now with ES|QL, you can quickly access recent queries with ES|QL query history. View and re-run your last 20 ES|QL queries directly within Kibana Discover , ES|QL visualizations , Kibana alerts , or Kibana maps for quick and easy access. Hybrid planning and dynamic data reduction For large Elasticsearch deployments, we have been testing ES|QL across hundreds of nodes and up to hundreds of thousands of shards and fields to ensure that query performance remains consistent as the cluster grows and more nodes are added. We have extended ES|QL's ability to perform hybrid planning to better deal with the dynamic nature of the data (whether it’s new fields added or new segments) and exploit the local data patterns particular to each node: After the coordinating node (that receives the ES|QL query and drives its execution) performs global planning based on the global view of the data, it broadcasts the plan to all data nodes that can execute the plan. However, before executing, each node changes the plan locally based on the actual storage statistics individual to each node. A common scenario is early filter evaluation in sparse mappings due to schema evolution. 
We are proactively developing a dynamic data reduction technique for scenarios with large shard sizes that minimize I/O traffic between the coordinator and data nodes, as well as reducing the duration that Lucene readers remain open during queries. This approach, which includes sharing intermediate results, shows great promise in enhancing the efficiency and runtime of queries across multiple shards. Stay tuned for more information about query execution and architecture in future blogs. Async querying Async querying empowers users to run long-running ES|QL queries asynchronously. Clients no longer have to wait idly for results; instead, they can monitor progress and retrieve data once it's ready. By utilizing the wait_for_completion_timeout parameter, users can tailor their experience, choosing whether to wait synchronously or switch to asynchronous mode after a specified timeout. This enhancement not only offers greater flexibility but also optimizes resource management, ensuring a smoother and more efficient querying process for our users Long-running ES|QL queries can be executed asynchronously so the client can monitor the progress and retrieve the results when available instead of blocking for them: Through the wait_for_completion_timeout clients can pick a comfortable timeout to wait for the result (and have synchronous behavior) before switching to an asynchronous one. Improved language and ergonomics We've streamlined the STATS command to offer greater flexibility and simplicity in data analysis. Previously, users had to resort to additional EVAL commands for arbitrary computations alongside aggregations and groupings which required a separate EVAL command: This restriction is no longer necessary as aggregations accept expressions (and themselves can also be combined) directly inside the STATS command, eliminating the need for extra EVALs and column pollution due to temporary fields: Date time units ES|QL now boasts improved support for datetime filtering. Recognizing the common need for date-time arithmetic in filtering tasks, ES|QL now supports abbreviated units, making queries more intuitive and efficient. For example, users can now easily specify date ranges using familiar abbreviations like 'year,' 'month,' and 'week.' This update simplifies query construction, enabling users to express datetime conditions more succinctly and accurately. Implicit data type conversion for string literals To minimize the friction of creating dedicated types (such as dates) from string declarations, ES|QL now performs implicit conversions of string constants to their target type by using the built-in conversion functions: Note that Only constants (or literals) are candidates for conversions, columns are ignored - the user has to use conversion functions for those explicitly. Converting string literals to their numeric equivalent is NOT supported, as these can be directly declared as such; that is “1” + 2 will throw an error, simply declare the expression as 1+2 instead. Native ES|QL clients While ES|QL is universally available through the _query REST endpoint, work is underway for offering rich, opinionated APIs for accessing ES|QL natively in various popular languages. While completing all the items above will take several releases, one can use ES|QL already through the regular Elasticsearch clients , for example, to access ES|QL results as Java or PHP objects and manipulate them as dataframes in Python ; Jupyter users should refer to the dedicated getting started guide notebook . 
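To make the STATS and date-time improvements described above a little more concrete, here is a hedged sketch of what such a query might look like through the Python client. The index and fields are the familiar employees sample data, and the exact query is an illustration rather than one taken from this article.

```python
# Hypothetical illustration of expressions inside STATS and abbreviated date-time units.
from elasticsearch import Elasticsearch

es = Elasticsearch(hosts="https://<your-deployment-endpoint>", api_key="<your-api-key>")

query = """
FROM employees
| WHERE hire_date > NOW() - 10 years
| STATS avg_salary_k = AVG(salary / 1000), max_salary_k = MAX(salary / 1000) BY languages
| SORT avg_salary_k DESC
"""

resp = es.esql.query(query=query)
print(resp["columns"], resp["values"])
```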
Since the initial release as technical preview in 8.11, ES|QL has been making its way through various parts of the Elasticsearch ecosystem. Such as observability where it is used to streamline OTel operations using a specialized AI assistant . And if we had more time, we’d also mention the many other functions introduced, like multi-value scalar fields, geo-spatial analysis (both scalar and aggregate functions) and date time handling. ES|QL in cross-cluster search in technical preview Cross-cluster search in Elasticsearch enables users to query data across multiple Elasticsearch clusters as if it were stored in a single cluster, delivering unified querying, global insights, and many other efficiencies. Now, in technical preview, ES|QL with cross-cluster search capabilities extends its querying power to span across distributed clusters, empowering users to leverage ES|QL for querying and analyzing data regardless of its location all from a single UI. While ES|QL is available as a basic license at no cost, using ES|QL in cross cluster search will require an Enterprise level license. To use ES|QL in cross-cluster search, use the FROM command with the format :, to retrieve data from my-index-000001 on the remote cluster. Looking to the future Search, embeddings and RAG We are thrilled to share an exciting development: leveraging ES|QL for advanced information retrieval, including full-text search and AI/ML-powered exploration. Our team is dedicated to making ES|QL the optimal tool for scoring, hybrid ranking, and integrating with Large Language Models (LLMs) within Elasticsearch. This dedicated command will streamline the retrieval process, enabling users to filter and score results. In the below example, we showcase a comprehensive search scenario, combining range filters, fast queries, and hybrid search techniques. This is a preview of how it might look like, naming TBD (SEARCH or RETRIEVAL): For instance, the query above demonstrates retrieving the top 5 most popular images by rating, featuring the terms 'mountain lake' in their description and resembling a user-defined image vector. Behind the scenes, the engine intelligently manages filters, rearranges queries, and applies reranking strategies, ensuring optimal search performance. This advancement promises to revolutionize information retrieval in Elasticsearch, offering users unparalleled control and efficiency in exploring and discovering relevant content. Timeseries, metrics and O11y Elasticsearch provides a dedicated solution for metrics through the timeseries data streams (TSDS), a powerful concept that can reduce disk storage by up to 70% by using specialized types and routing. We plan on leveraging fully these capabilities in ES|QL - first by introducing a dedicated command: Inline stats - aggregations without data reduction The STATS command in ES|QL is invaluable for summarizing statistics, but it often poses a challenge when users want to aggregate data without losing its original context. For instance, if you wish to display the average category price alongside each individual t-shirt price, traditional aggregation methods can obscure the original data. Enter INLINESTATS: a feature designed to address this issue by performing 'inline' statistics. With INLINESTATS, users can compute statistics within each group and seamlessly integrate the results back into the original dataset, preserving the context of the originating groups. 
This powerful capability enhances the clarity and depth of statistical analysis in ES|QL, empowering users to derive meaningful insights while maintaining the integrity of their data. Get started today The introduction of ES|QL marks a significant stride forward in Elastic's capabilities, offering users a powerful and intuitive tool for data querying and analysis. With its streamlined syntax, robust functionality, and innovative features, ES|QL opens up new avenues for users to unlock insights and derive value from their data. Whether you're a seasoned Elasticsearch user or just getting started, ES|QL invites you to explore, experiment, and experience the power of Elasticsearch Query Language firsthand. Be sure to check out our demo playground full of examples or try on Elastic Cloud . Already have Elasticsearch running? Just upgrade your clusters to 8.14 and give it a try. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Advancements in Elasticsearch The road to GA Stability and performance Visualize data in Kibana Discover in a whole new way with ES|QL ES|QL together with Elastic AI assistant Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch piped query language, ES|QL, now generally available - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-piped-query-language-goes-ga","meta_description":"Elasticsearch Query Language (ES|QL) is now GA. 
Explore ES|QL's capabilities, learn about ES|QL in Kibana and discover future advancements."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ES|QL queries to Java objects Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples. ES|QL Java How To LT By: Laura Trotta On May 2, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. ES|QL overview ES|QL is a new query language introduced by Elasticsearch that combines a simplified syntax with the pipe operator to enable users to intuitively extrapolate and manipulate data. The new version 8.13.0 of the official Java client introduced support for ES|QL queries, with a new API that allows for easy query execution and automatic translation of the results to java objects. How to perform ES|QL queries with the Java client Prerequisites Elasticsearch version >= 8.11.0 Java version >= 17 Ingesting data Before we start querying we need to have some data available: we're going to store this csv file into Elasticsearch by using the BulkIngester utility class available in the Java client. The csv lists books from the Amazon Books Reviews dataset , categorizing them using the following header row: First of all, we have to create the index to map the fields correctly: Then the Java class for the books: We're going to use Jackson's CSV mapper to read the file, so let's configure it: Then we'll read the csv file line by line and optimize the ingestion using the BulkIngester: The indexing will take around 15 seconds, but when it's done we'll have the books index filled with ~80K documents, ready to be queried. ES|QL Now it's time to extract some information from the books data. Let's say we want to find the latest reprints of Asimov's works: Thanks to the ObjectsEsqlAdapter using Book.class as the target, we can ignore what the json result of the ES|QL query would be, and just focus on the more familiar list of books that is automatically returned by the client. For those who are used to SQL queries and the JDBC interface, the client also provides the ResultSetEsqlAdapter , which can be used in the same way and instead returns a java.sql.ResultSet Another example, we now want to find out the top-rated books from Penguin Books: The Java code to retrieve the data stays the same since the result is again a list of books. There are exceptions of course, for example if a query uses the eval command to add a new column, the Java class should be modified to represent the new result. The full code for this article can be found in the official client repository . Feel free to reach out on Discuss for any questions or issues. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to ES|QL overview How to perform ES|QL queries with the Java client Prerequisites Ingesting data ES|QL Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ES|QL queries to Java objects - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/esql-queries-to-java-objects","meta_description":"Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Vector search introduction and implementation This series dives into the intricacies of vector search, how it is implemented in Elasticsearch, and how to run hybrid search queries in Elasticsearch. Part1 Vector Database February 6, 2025 A quick introduction to vector search This article is the first in a series of three that will dive into the intricacies of vector search, also known as semantic search, and how it is implemented in Elasticsearch. VC By: Valentin Crettaz Part2 Vector Database How To February 10, 2025 How to set up vector search in Elasticsearch Learn how to set up vector search and execute k-NN searches in Elasticsearch. VC By: Valentin Crettaz Part3 Vector Database How To February 17, 2025 Elasticsearch hybrid search Learn about hybrid search, the types of hybrid search queries Elasticsearch supports, and how to craft them. VC By: Valentin Crettaz Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Vector search introduction and implementation - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/vector-search-introduction-and-implementation","meta_description":"This series dives into the intricacies of vector search, how it is implemented in Elasticsearch, and how to run hybrid search queries in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Lucene Wrapped 2024 2024 has been another major year for Apache Lucene. In this blog, we’ll explore the key highlights. Lucene CH By: Chris Hegarty On January 3, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Apache Lucene has seen significant activity in 2024, with numerous releases including the first major update in three years, packed with exciting improvements and new features. Let’s explore some of the key highlights. Lucene & the community A project is only as strong as the community that supports it. Despite more than 20 years of development, the Lucene project remains vibrant and thrives thanks to its passionate and active contributors. In 2024, the Lucene project has seen more than 2,000 commits from 98 unique contributors, and almost 800 pull requests. The number of contributors continues to grow, with new committers and PMC members joining the project and helping drive its success. Lucene 10 2024 saw the first major release in almost 3 years - Lucene 10, with more than 2,000 commits from 185 unique contributors. While the development model that Lucene follows allows to deliver many improvements and features in minor releases, a major release affords the opportunity to bring larger features and modernizations. For example, Lucene 10 requires a minimum of Java 21. Bumping the minimum Java version ensures that Lucene can continue to take advantage of improvements that modern Java provides. The primary focus of Lucene 10 is to better utilize the hardware on which it runs. Let's take a quick look at some of the main highlights: More search parallelism - while search execution is already parallelized across segments, we now go further, parallelizing within segments. This decouples on-disk representation from the execution performance, allowing even single segments to benefit from the number of cores on modern systems. Better I/O parallelism - the straightforward synchronous I/O model that Lucene uses has been enhanced with a prefetch stage. This informs the OS that a region of an index file will be needed in the very near future, while not blocking the calling thread. Better CPU and storage efficiency with sparse indexing - Lucene 10 introduces support for sparse indexing, sometimes called primary-key indexing or zone indexing in other data stores. For more information about Lucene 10, check out the dedicated article on Lucene 10. Lucene research and innovation In 2024, Lucene has seen a surge of research and innovation, particularly in the areas of machine learning integration, vector search, and optimization for large-scale datasets, with reference form 10 separate research papers and publications . 
Some of the key research areas and developments include: Vector Search and Embedding Support - Lucene provides a powerful and scalable solution for vector-based search, enabling semantic retrieval at scale. By leveraging Lucene's robust indexing and search infrastructure, users can combine the best of traditional text search with the advanced capabilities of modern vector search, making Lucene a comprehensive solution for a wide range of search and information retrieval tasks. Hybrid Search Models - Research has also delved into hybrid search techniques, where Lucene combines traditional keyword-based search with modern vector-based retrieval. By merging term-based indexes with dense vector representations, Lucene can deliver more accurate and contextually relevant search results, bridging the gap between the precision of traditional search engines and the flexibility of semantic search. The ongoing research efforts in 2024 demonstrate Lucene’s adaptability to the evolving needs of modern search technologies, particularly in the context of AI, semantic search, and big data applications. The project continues to grow as a powerful, flexible, and efficient platform for both traditional and cutting-edge search use cases. 2024 Lucene releases Although not an exact reflection, the sheer volume of releases highlights the ongoing dedication and energy of the community. These updates include major enhancements to vector search performance and efficiency, support for madvise, optimizations for postings list decoding, further speed improvements through SIMD, and much more. Here’s the full list of releases: 10.1.0 (2024-12-20) 9.12.1 (2024-12-13) 10.0.0 (2024-10-14) 9.12.0 (2024-09-28) 8.11.4 (2024-09-24) 9.11.1 (2024-06-27) 9.11.0 (2024-06-06) 9.10.0 (2024-02-20) 8.11.3 (2024-02-08) 9.9.2 (2024-01-29) You can find more information and release notes at the Lucene Core page. Additionally, there are equivalent PyLucene releases. Wrapping up As Lucene matures, it continues to flourish thanks to its dedicated and vibrant community. As we’ve seen, 2024 has been an incredibly productive year, and we now look ahead to the exciting developments that 2025 will bring. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). 
BT By: Benjamin Trent Jump to Lucene & the community Lucene 10 Lucene research and innovation 2024 Lucene releases Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Lucene Wrapped 2024 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/apache-lucene-wrapped-2024","meta_description":"Explore the key highlights, improvements and features of Apache Lucene, including an overview of Lucene 10."} +{"text":"Tutorials Examples Integrations Blogs Start free trial .NET Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Generative AI .NET +3 December 6, 2024 How to use Elasticsearch Vector Store Connector for Microsoft Semantic Kernel for AI Agent development Microsoft Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase. With the release of Semantic Kernel Elasticsearch Vector Store Connector, developers using Semantic Kernel for building AI agents can now plugin Elasticsearch as a scalable enterprise-grade vector store while continuing to use Semantic Kernel abstractions. FB SM By: Florian Bernd and Srikanth Manvi Developer Experience .NET October 15, 2024 NEST lifetime extended & Elastic.Clients.Elasticsearch (v8) Roadmap Announcing the extension of the NEST (v7) lifetime and providing a high level overview of the Elastic.Clients.Elasticsearch (v8) roadmap. FB By: Florian Bernd Vector Database .NET +2 October 9, 2024 Building a search app with Blazor and Elasticsearch Learn how to build a search application using Blazor and Elasticsearch, and how to use the Elasticsearch .NET client for hybrid search. GL By: Gustavo Llermaly .NET How To April 16, 2024 Elasticsearch .NET client evolution: From NEST to Elastic.Clients.Elasticsearch Learn about the evolution of the Elasticsearch .NET client and the transition from NEST to Elastic.Clients.Elasticsearch. FB By: Florian Bernd Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":".NET - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/dot-net-programming","meta_description":".NET articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to set up vector search in Elasticsearch Learn how to set up vector search and execute k-NN searches in Elasticsearch. Vector Database How To VC By: Valentin Crettaz On February 10, 2025 Part of Series Vector search introduction and implementation Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. This article is the second in a series of three that dives into the intricacies of vector search also known as semantic search and how it is implemented in Elasticsearch. The first part was focused on providing a general introduction to the basics of embeddings (aka vectors) and how vector search works under the hood. Armed with all the vector search knowledge learned in the first article, this second part will guide you through the meanders of how to set up vector search and execute k-NN searches in Elasticsearch. In the third part , we will leverage what we learned in the first two parts and build upon that knowledge by delving into how to craft powerful hybrid search queries in Elasticsearch. Some background first Even though Elasticsearch did not support vector search up until version 8.0 with the technical preview of the _knn_search API endpoint, it has been possible to store vectors using the dense_vector field type since the 7.0 release. At that point, vectors were simply stored as binary doc values but not indexed using any of the algorithms that we presented in our first article . Those dense vectors constituted the premises of the upcoming vector search features in Elasticsearch. If you’re interested in diving more into the discussions that led to the current implementation of vector search in Elasticsearch, you can refer to this issue , which details all the hurdles that Elastic had to jump over in order to bring this feature to market. Very briefly, since Elasticsearch already made heavy use of Lucene as their underlying search engine, we also decided to utilize the same technology as our vector engine, and we explained the rationale behind that decision in a very transparent way. With history matters out of the way, let’s now get to work. How to set up k-NN Vector search is available natively in Elasticsearch, and there’s nothing specific to install. We only need to create an index that defines at least one field of type dense_vector , which is where your vector data will be stored and/or indexed. The mapping below shows a dense_vector field called title_vector of dimension 3. Dense vectors stored and indexed in this field will use the dot_product similarity function that we introduced in the first article in this series. 
It is worth noting that up to 8.11 hnsw (i.e., Hierarchical Navigable Small Worlds ) was the only algorithm supported by Apache Lucene for indexing dense vectors. Since then, other algorithms have been added and in the future, Elasticsearch might provide additional methods for indexing and searching dense vectors, but since it fully relies on Apache Lucene that will depend on what unfolds on that front . The table below summarizes all available configuration parameters for the dense_vector field type provided by Elasticsearch: Table 1: The different configuration parameters for dense vectors Parameter Required Description dims Yes (<8.11) No (8.11+) The number of vector dimensions, which can’t exceed 1024 until 8.9.2, 2048 since 8.10.0 and 4096 since 8.11.0. Also, as of 8.11, this parameter is not required anymore and will default to the dimension of the first indexed vector. element_type No The data type of the vector element values. If unspecified, the default type is `float` (4 bytes), `byte` (1 byte) and `bit` are also available. index No Indicates whether to index vectors (if `true`) in a dedicated and optimized data structure or simply store them as binary doc values (if `false`). Until 8.10, the default value was `false` if not specified. As of 8.11, the default value is `true` if not specified. similarity Yes (<8.11) No (8.11+) Until 8.10, this parameter is required if `index` is `true` and defines the vector similarity metric to use for k-NN search. The available metrics are: a) `l2_norm`: L2 distance b) `dot_product`: dot product similarity c) `cosine`: cosine similarity d) `max_inner_product`: maximum inner product similarity. Also note that `dot_product` should be used only if your vectors are already normalized (i.e., they are unit vectors with magnitude 1), otherwise use `cosine` or `max_inner_product`. As of 8.11, if not specified, this parameter defaults to `l2_norm` if element_type is `bit` and to `cosine` otherwise. index_options No Here are the possible values for the `type` parameter depending on the version: a) Up until 8.11, only `hnsw` was supported. b) In 8.12, scalar quantization enabled `int8_hnsw` c) In 8.13, `flat` was added along with its scalar-quantized `int8_flat` sibling d) In 8.15, `int4_hnsw` and `int4_flat` were added e) In 8.18, binary quantization enabled `bbq_hnsw` and `bbq_flat`. You can check the official documentation to learn about their detailed description (https://www.elastic.co/guide/en/elasticsearch/reference/current/dense-vector.html#dense-vector-params) and how each algorithm can be configured. As we can see in the above table, since the version 8.11 the definition of vector fields has been drastically simplified: Regarding the support for scalar quantization added in 8.12, remember we talked about this compression technique in the first part of this series. We won’t dig deeper in this article, but you can learn more about how this was implemented in Lucene in another Elastic Search Labs article . Similarly, we won’t dive into better binary quantization (BBQ) added in 8.18, and we invite you to learn more about that new groundbreaking algorithm in this article . That’s all there is to it! By simply defining and configuring a dense_vector field, we can now index vector data in order to run vector search queries in Elasticsearch using either the knn search option or the knn DSL query (introduced in 8.12). 
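As a concrete sketch of the mapping described above (a 3-dimensional title_vector field indexed with dot_product similarity, plus a price field used later for filtering), the index could be created through the Python client roughly like this; the index name and extra fields are assumptions:

```python
# Hypothetical index creation with a dense_vector field, as described above.
from elasticsearch import Elasticsearch

es = Elasticsearch(hosts="https://<your-deployment-endpoint>", api_key="<your-api-key>")

es.indices.create(
    index="my-vector-index",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "price": {"type": "integer"},
            "title_vector": {
                "type": "dense_vector",
                "dims": 3,
                "index": True,
                "similarity": "dot_product",
            },
        }
    },
)
```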
Elasticsearch supports two different vector search modes: 1) exact search using the script_score query and 2) approximate nearest neighbor search using the knn search option or the knn query (8.12+). We’re going to describe both modes next. Exact search If you recall from the first article of this series, where we reviewed the vector search landscape, an exact vector search simply boils down to performing a linear search, or brute-force search, across the full vector space. Basically, the query vector will be measured against each stored vector in order to find the closest neighbors. In this mode, the vectors do not need to be indexed in an HNSW graph but simply stored as binary doc values, and the similarity computation is run by a custom Painless script. First, we need to define the vector field mapping in a way that the vectors are not indexed, and this can be done by specifying index: false and no similarity metric in the mapping: The advantage of this approach is that vectors do not need to be indexed, which drastically lowers the ingestion time since there’s no need to build the underlying HNSW graph. However, depending on the size of the data set and your hardware, search queries can slow down pretty quickly as your data volume grows, since the more vectors you add, the more time is needed to visit each one of them (i.e., linear search has an O(n) complexity). With the index being created and the data being loaded, we can now run an exact search using the following script_score query: As you can see, the script_score query is composed of two main elements, namely the query and the script . In the above example, the query part specifies a filter (i.e., price >= 100 ), which restrains the document set against which the script will be executed. If no query was specified, it would be equivalent to using a match_all query, in which case the script would be executed against all vectors stored in the index. Depending on the number of vectors, the search latency can increase substantially. Since vectors are not indexed, there’s no built-in algorithm that will measure the similarity of the query vector with the stored ones, this has to be done through a script, and luckily for us, Painless provides most of the similarity functions that we’ve learned so far, such as: l1norm(vector, field) : L1 distance (Manhattan distance) l2norm(vector, field) : L2 distance (Euclidean distance) hamming(vector, field) : Hamming distance cosineSimilarity(vector, field) : Cosine similarity dotProduct(vector, field): Dot product similarity Since we’re writing a script, it is also possible to build our own similarity algorithm . Painless makes this possible by providing access to doc[].vectorValue , which allows iterating over the vector array, and doc[].magnitude , which returns the length of the vector. To sum up, even though exact search doesn’t scale well, it might still be suitable for certain very small use cases, but if you know your data volume will grow over time, you need to consider resorting to k-NN search instead. That’s what we’re going to present next. Approximate k-NN search Most of the time, this is the mode you’re going to pick if you have a substantial amount of data and need to implement vector search using Elasticsearch. Indexing latency is a bit higher since Lucene needs to build the underlying HNSW graph to store and index all vectors. 
It is also a bit more demanding in terms of memory requirements at search time, and the reason it’s called “approximate” is because the accuracy can never be 100% like with exact search. Despite all this, approximate k-NN offers a much lower search latency and allows us to scale to millions, or even billions of vectors, provided your cluster is sized appropriately. Let’s see how this works. First, let’s create a sample index with adequate vector field mapping to index the vector data (i.e., index: true + specific similarity ) and load it with some data: Simple k-NN search After running these two commands, our vector data is now properly indexed in a scalar-quantized HNSW graph and ready to be searched. Up until 8.11, the only way to run a simple k-NN search was by using the knn search option, located at the same level as the query section you’re used to, as shown in the query below: In the above search payload, we can see that there is no query section like for lexical searches, but a knn section instead. We are searching for the two ( k: 2 ) nearest neighboring vectors to the specified query vector. From 8.12 onwards, a new knn search query has been introduced to allow for more advanced hybrid search use cases, which is a topic we are going to handle in our next article. Unless you have the required expertise to combine k-NN queries with other queries, Elastic recommends sticking to the knn search option, which is easier to use. The first thing to note is that the new knn search query doesn’t have a k parameter, instead it uses the size parameter like any other queries. The second notable thing is that the new knn query enables to post-filter k-NN search results by leveraging a bool query that combines one or more filters with the knn search query, as shown in the code below: The above query first retrieves the top 3 documents having the nearest neighboring vectors and then filters out the ones whose price is smaller than 100. It is worth noting that with this kind of post-filtering, you might end up with no results at all if your filters are too aggressive. Also note that this behavior is different from the usual boolean full-text queries, where the filter part is executed first in order to reduce the document set that needs to be scored. If you are interested to learn more about the differences between the knn top-level search option and the new knn search query, you can head over to another great Search Labs article for more details. Let’s now move on and learn more about the num_candidates parameter. The role of num_candidates is to increase or decrease the likelihood of finding the true nearest neighbor candidates. The higher that number, the slower the search but also the more likely the real nearest neighbors will be found. As many as num_candidates vectors will be considered on each shard, and the top k ones will be returned to the coordinator node, which will merge all shard-local results and return the top k vectors from the global results as illustrated in Figure 1, below: Figure 1: Nearest neighbor search accuracy using num_candidates Vectors id4 and id2 are the k local nearest neighbors on the first shard, respectively id5 and id7 on the second shard. After merging and reranking them, the coordinator node returns id4 and id5 as the two global nearest neighbors for the search query. 
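For illustration (again, these are hedged sketches rather than the article's original request bodies, reusing the hypothetical products index and toy vectors from the earlier sketches), the two request styles described above might look like this with the Python client:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# 1) The knn search option: a top-level knn section instead of a query section,
# asking for the k=2 nearest neighbors.
resp = es.search(
    index="products",
    knn={
        "field": "title_vector",
        "query_vector": [0.1, 3.2, 2.1],
        "k": 2,
        "num_candidates": 10,
    },
)

# 2) The knn DSL query (8.12+): no k parameter, size applies instead, and combining it
# with a bool query turns the filter into a post-filter on the k-NN results.
resp = es.search(
    index="products",
    size=3,
    query={
        "bool": {
            "must": {
                "knn": {
                    "field": "title_vector",
                    "query_vector": [0.1, 3.2, 2.1],
                    "num_candidates": 10,
                }
            },
            "filter": {"range": {"price": {"gte": 100}}},
        }
    },
)
```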
Several k-NN searches If you have several vector fields in your index, it is possible to send several k-NN searches as the knn section also accepts an array of queries, as shown below: As we can see, each query can take a different value of k as well as a different boost factor. The boost factor is equivalent to a weight, and the total score will be a weighted average of both scores. Filtered k-NN search Similarly to what we saw with the script_score query earlier, the knn section also accepts the specification of a filter in order to reduce the vector space on which the approximate search should run. For instance, in the k-NN search below, we’re restricting the search to only documents whose price is greater than or equal to 100. Now, you might wonder whether the data set is first filtered by price and then the k-NN search is run on the filtered data set (pre-filtering) or the other way around, i.e., the nearest neighbors are first retrieved and then filtered by price (post-filtering). It’s a bit of both actually. If the filter is too aggressive, the problem with pre-filtering is that k-NN search would have to run on a very small and potentially sparse vector space and would not return very accurate results. Whereas post-filtering would probably weed out a lot of high-quality nearest neighbors. So, even though the knn section filter is considered as a pre-filter, it works during the k-NN search in such a way as to make sure that at least k neighbors can be returned. If you’re interested in the details of how this works, you can check out the following Lucene issue dealing with this matter. Filtered k-NN search with expected similarity In the previous section, we learned that when specifying a filter, we can reduce the search latency, but we also run the risk of drastically reducing the vector space to vectors that are partly or mostly dissimilar to the query vector. In order to alleviate this problem, k-NN search makes it possible to also specify a minimum similarity value that all returned vectors are expected to have. Reusing the previous query, it would look like this: Basically, the way it works is that the vector space will be explored by skipping any vector that either doesn’t match the provided filter or that has a lower similarity than the specified one up until the k nearest neighbors are found. If the algorithm can’t honor at least k results (either because of a too-restrictive filter or an expected similarity that is too low), a brute-force search is attempted instead so that at least k nearest neighbors can be returned. A quick word concerning how to determine that minimum expected similarity. It depends on which similarity metric you’ve chosen in your vector field mapping. If you have picked l2_norm , which is a distance function (i.e., similarity decreases as distance grows), you will want to set the maximum expected distance in your k-NN query, that is, the maximum distance that you consider acceptable. In other words, a vector having a distance between 0 and that maximum expected distance with the query vector will be considered “close” enough to be similar. If you have picked dot_product or cosine instead, which are similarity functions (i.e., similarity decreases as the vector angle gets wider), you will want to set a minimum expected similarity. A vector having a similarity between that minimum expected similarity and 1 with the query vector will be considered “close” enough to be similar. 
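Below is a hedged sketch of these variants with the Python client; field names, vectors, boost values and the 0.975 threshold are illustrative only, and the last request is the "filtered search with expected similarity" shape discussed above:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Several k-NN searches at once: the knn section also accepts an array, one entry per
# vector field, each with its own k and boost (field names are hypothetical).
resp = es.search(
    index="products",
    knn=[
        {"field": "title_vector", "query_vector": [0.1, 3.2, 2.1], "k": 2,
         "num_candidates": 10, "boost": 0.7},
        {"field": "description_vector", "query_vector": [0.5, 0.1, 1.9], "k": 5,
         "num_candidates": 10, "boost": 0.3},
    ],
)

# Filtered k-NN search: the filter inside the knn section is applied while the graph
# is searched, so that at least k neighbors can still be returned.
filtered_knn = {
    "field": "title_vector",
    "query_vector": [0.1, 3.2, 2.1],
    "k": 2,
    "num_candidates": 10,
    "filter": {"range": {"price": {"gte": 100}}},
}
resp = es.search(index="products", knn=filtered_knn)

# Filtered k-NN search with an expected similarity: vectors that do not match the
# filter, or that are less similar to the query vector than `similarity`, are skipped.
# With a cosine mapping this is a minimum similarity; with l2_norm it would instead
# express the maximum acceptable distance.
resp = es.search(index="products", knn={**filtered_knn, "similarity": 0.975})
```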
Applied to the sample filtered query above and the sample data set that we have indexed earlier, Table 1, below, summarizes the cosine similarities between the query vector and each indexed vector. As we can see, vectors 3 and 4 are selected by the filter (price >= 100), but only vector 3 has the minimum expected similarity (i.e., 0.975) to be selected. Table 2: Sample filtered search with expected similarity Vector Cosine similarity Price 1 0.8473 23 2 0.5193 9 3 0.9844 124 4 0.9683 1457 Limitations of k-NN Now that we have reviewed all the capabilities of k-NN searches in Elasticsearch, let’s see the few limitations that you need to be aware of: Up until 8.11, k-NN searches cannot be run on vector fields located inside nested documents. From 8.12 onwards, this limitation has been lifted. However, such nested knn queries do not support the specification of a filter. The search_type is always set to dfs_query_then_fetch , and it is not possible to change it dynamically. The ccs_minimize_roundtrips option is not supported when searching across different clusters with cross-cluster search. This has been mentioned a few times already, but due to the nature of the HNSW algorithm used by Lucene (as well as any other approximate nearest neighbors search algorithms for that matter), “approximate” really means that the k nearest neighbors being returned are not always the true ones. Tuning k-NN As you can imagine, there are quite a few options that you can use in order to optimize the indexing and search performance of k-NN searches. We are not going to review them in this article, but we really urge you to check them out in the official documentation if you are serious about implementing k-NN searches in your Elasticsearch cluster. Beyond k-NN Everything we have seen so far leverages dense vector models (hence the dense_vector field type), in which vectors usually contain essentially non-zero values. Elasticsearch also provides an alternative way of performing semantic search using sparse vector models. Elastic has created a sparse NLP vector model called Elastic Learned Sparse EncodeR , or ELSER for short, which is an out-of-domain (i.e., not trained on a specific domain) sparse vector model that does not require any fine-tuning. It was pre-trained on a vocabulary of approximately 30000 terms, and being a sparse model, it means that vectors have the same number of values, most of which are zero. The way it works is pretty simple. At indexing time, the sparse vectors (term / weight pairs) are generated using the inference ingest processor and stored in fields of type sparse_vector , which is the sparse counterpart to the dense_vector field type. At query time, a specific DSL query also called sparse_vector will replace the original query terms with terms available in the ELSER model vocabulary which are known to be the most similar to them given their weights. We won’t dive deeper into ELSER in this article, but if you’re eager to discover how this works, you can check out this seminal article as well as the official documentation, which explains the topic in great detail. A quick glimpse into some upcoming related topics Elasticsearch also supports combining lexical search and vector search, and that will be the subject of the next and final article of this series. So far, we’ve had to generate the embeddings vectors outside of Elasticsearch and pass them explicitly in all our queries. Would it be possible to just provide the query text and a model would generate the embeddings on the fly? 
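Before answering that question, here is a small, hypothetical sketch of the sparse_vector query mentioned above; the index, field name and inference endpoint ID are placeholders, and this style of query (8.15+) already accepts plain query text rather than a vector:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Semantic search against an ELSER-populated sparse_vector field. The ingest side
# (an inference processor writing term/weight pairs into `content_embedding`) is
# assumed to be already in place; `my-elser-endpoint` is a placeholder.
resp = es.search(
    index="my-index",
    query={
        "sparse_vector": {
            "field": "content_embedding",
            "inference_id": "my-elser-endpoint",
            "query": "How can I index vectors in Elasticsearch?",
        }
    },
)
```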
Well, the good news is that this is possible with Elasticsearch either by leveraging a construct called query_vector_builder (for dense vectors) or using the new semantic_text field type and semantic DSL query (for sparse vectors), and you can learn more about these techniques in this article . Let’s conclude In this article, we delved deeply into Elasticsearch vector search support. We first shared some background on Elastic’s quest to provide accurate vector search and why we decided to use Apache Lucene as our vector indexing and search engine. We then introduced the two main ways to perform vector search in Elasticsearch, namely either by leveraging the script_score query in order to run an exact brute-force search or by resorting to using approximate nearest neighbor search via the knn search option or the knn search query introduced in 8.12. We showed how to run a simple k-NN search and, following up on that, we reviewed all the possible ways of configuring the knn search option and query using filters and expected similarity and how to run multiple k-NN searches at the same time. To wrap up, we listed some of the current limitations of k-NN searches and what to be aware of. We also invited you to check out all the possible options that can be used to optimize your k-NN searches. If you like what you’re reading, make sure to check out the other parts of this series: Part 1: A Quick Introduction to Vector Search Part 3: Hybrid Search Using Elasticsearch Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Some background first How to set up k-NN Exact search Approximate k-NN search Simple k-NN search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to set up vector search in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vector-search-set-up-elasticsearch","meta_description":"Learn how to set up vector search and execute k-NN searches in Elasticsearch.\n"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Speeding Up Multi-graph Vector Search Explore multi-graph vector search in Lucene and discover how sharing information between segment searches enhances search speed. Lucene MS TV By: Mayya Sharipova and Thomas Veasey On March 12, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog explores multi-graph vector search in Lucene and how sharing information among segment searches in multi-graph vector search allowed us to achieve significant search speedups. The previous state of multi-graph vector search in Lucene As we have described before Lucene's and hence Elasticsearch's approximate kNN search is based on searching an HNSW graph for each index segment and combining results from all segments to find the global k nearest neighbors. When it was first introduced a multi-graph search was done sequentially in a single thread, searching one segment after another. This comes with some performance penalty because searching a single graph is sublinear in its size. In Elasticsearch 8.10 we parallelized vector search , allocating up to a thread per segment in kNN vector searches, if there are sufficient available threads in the threadpool. Thanks to this change, we saw query latency drop to half its previous value in our nightly benchmark. Even though we were searching segment graphs in parallel, they were still independent searches, each collecting its own top k results unaware of progress made by other segment searches. We knew from our experience with the lexical search that we could achieve significant search speedups by exchanging information about best results collected so far among segment searches and we thought we could apply the same sort of idea for vector search. Speeding up multi-graph vector search by sharing information between segment searches When graph based indexes such as HNSW search for nearest neighbors to a query vector one can think of their strategy as a combination of exploration and exploitation. In the case of HNSW this is managed by gathering a larger top-n match set than the top-k which it will eventually return. The search traverses every edge whose end vector is competitive with the worst match found so far in the expanded set. This means it explores parts of the graph which it already knows are not competitive and will never be returned. However, it also allows the search to escape local minima and ultimately achieve better recall. By contrast a pure exploitation approach simply seeks to decrease the distance to the kth best match at every iteration and will only traverse edges whose end vectors will be added to the current top-k set. So the size of the expanded match set is a hyperparameter which allows one to trade run time for recall by increasing or decreasing exploration in the proximity graph. 
As we discussed already, Lucene builds multiple graphs for different partitions of the data. Furthermore, at large scale, data must be partitioned and separate graphs built if one wants to scale retrieval horizontally over several machines. Therefore, a generally interesting question is \"how should one adapt this strategy in the case that several graphs are being searched simultaneously for nearest neighbors?\" Recall is significantly higher when one searches graphs independently and combines the top-k sets from each. This makes sense through the lens of exploration vs exploitation: the multi-graph search is exploring many more vectors and so is much less likely to get trapped in local minima for the similarity function. However, it pays a cost to do this in increased run time. Ideally, we would like recall to be more independent of the sharding strategy and search to be faster. There are two factors which impact the efficiency of search on multiple graphs vs a single graph: the edges which are present in the single graph and having multiple independent top-n sets. In general, unless the vectors are partitioned into disjoint regions the neighbors of a vector in each partition graph will only comprise a subset of the true nearest neighbors in the single graph. This means one pays a cost in exploring non-competitive neighbors when searching multiple graphs. Since graphs are built independently, one necessarily has to pay a “structural” cost from having several graphs. However, as we shall see we can mitigate the second cost by intelligently sharing state between searches. Given a shared global top-n set it is natural to ask how we should search portions of graphs that are uncompetitive, specifically, edges whose end vertices that are further than the nth worst global match. If we were searching a single graph these edges would not be traversed. However, we have to bear in mind that the different searches have different entry points and progress at different rates, so if we apply the same condition to multi-graph search it is possible that the search will stop altogether before we visit its closest neighbors to the query. We illustrate this in the figure below. Figure 1 Two graph fragments showing a snapshot of a simultaneous search gathering the top-2 set. In this case if we were to prune edges whose unvisited end vertices are not globally competitive we would never traverse the red dashed edge and fail to find the best matches which are all in Graph 2. To avoid this issue we devised a simple approach that effectively switches between different parameterizations of each local search based on whether it is globally competitive or not. To achieve this, as well as the global queue which is synchronized periodically, we maintain two local queues of the distances to the closest vectors to the query found for the local graph. One has size n and the other has size ⌊ g × n ⌋ \\lfloor g \\times n \\rfloor ⌊ g × n ⌋ . Here, g g g controls the greediness of non-competitive search and is some number less than 1. In effect, g g g is a free parameter we can use to control recall vs the speed up. As the search progresses we check two conditions when deciding whether to traverse an edge: i) would we traverse the edge if we were searching the graph alone, ii) is the end vertex globally competitive or is it locally competitive with the \"greedy\" best match set. 
Formally, if we denote the query vector q, the end vector of the candidate edge v_e, the n-th local best match v_n, the ⌊g × n⌋-th local best match v_g and the n-th global best match v_gb, then this amounts to adding v_e to the search set if d(v_e, q) < d(v_n, q) AND (d(v_e, q) < d(v_g, q) OR d(v_e, q) < d(v_gb, q)). Here, d(⋅,⋅) denotes the index distance metric. Note that this strategy ensures we always continue searching each graph to any local minimum and, depending on the choice of g, we still escape some local minima. Modulo some details around synchronization, initialization and so on, this describes the change to the search. As we show, this simple approach yields very significant improvements in search latency together with recall which is closer to, but still better than, single graph search. Impact on performance Our nightly benchmarks showed up to 60% faster vector search queries that run concurrent with indexing (average query latencies dropped from 54 ms to 32 ms). Figure 2 Query latencies that run concurrently with indexing dropped significantly after upgrading to Lucene 9.10, which contains the new changes. On queries that run outside of indexing we observed modest speedups, mostly because the dataset is not that big, containing 2 million vectors of 96 dims across 2 shards (Figure 3). But still for those benchmarks, we could see a significant decrease in the number of visited vertices in the graph and hence the number of vector operations (Figure 4). Figure 3 Whilst we see small drops in the latencies after the change for queries that run without concurrent indexing, particularly for retrieving the top-100 matches, the number of vector operations (Figure 4) is dramatically reduced. Figure 4 We see very significant decreases in the number of vector operations used to retrieve the top-10 and top-100 matches. The speedups should be clearer for larger indexes with higher dimension vectors: in testing we typically saw between 2× and 3×, which is also consistent with the reduction in the number of vector comparisons we see above. For example, we show below the speedup in vector search operations on the Lucene nightly benchmarks. These use vectors of 768 dimensions. It is worth noting that in the Lucene benchmarks the vector search runs in a single thread sequentially processing one graph after another, but the change positively affects this case as well. This happens because the global top-n set collected after the first graph searches sets up the threshold for subsequent graph searches and allows them to finish earlier if they don't contain competitive candidates. Figure 5 The graph shows that with the change committed on Feb 7th, the number of queries per second increased from 104 queries/sec to 219 queries/sec. Impact on recall The multi-graph search speedups come at the expense of slightly reduced recall. This happens because we may stop exploration of a graph that may still have good matches based on the global matches from other graphs. 
Two notes on the reduced recall: i) From our experimental results we saw that the recall is still higher than the recall of a single graph search, as if all segments were merged together into a single graph (Figure 6). ii) Our new approach achieves better performance for the same recall: it Pareto dominates our old multi-graph search strategy (Figure 7). Figure 6 We can see the recall of kNN search on multiple segments slightly dropped for both top-10 and top-100 matches, but in both cases it is still higher than the recall of kNN search on a single merged segment. Figure 7 The Queries Per Second is better in the candidate (with the current changes) than the baseline (old multi-graph search strategy) for the 10 million documents of the Cohere/wikipedia-22-12-en-embeddings dataset for each equivalent recall. Conclusion In this blog we showed how we achieved significant improvements in Lucene vector search performance while still achieving excellent recall by intelligently sharing information between the different graph searches. The improvement is a part of the Lucene 9.10 release and is a part of the Elasticsearch 8.13 release. We're not done yet with improvements to our handling of multiple graphs in Lucene. As well as further improvements to search, we believe we've found a path to achieve dramatically faster merge times. So stay tuned! Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to The previous state of multi-graph vector search in Lucene Speeding up multi-graph vector search by sharing information between segment searches Impact on performance Impact on recall Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Speeding Up Multi-graph Vector Search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/multi-graph-vector-search","meta_description":"Explore multi-graph vector search in Lucene and discover how sharing information between segment searches enhances search speed."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog GitHub Assistant: Interact with your GitHub repository using RAG and Elasticsearch This blog introduces a GitHub Assistant using RAG with Elasticsearch to enable semantic code queries, providing insights into GitHub repositories, which can be extended to PRs feedback, issues handling, and production readiness reviews. Generative AI Python FS By: Fram Souza On October 23, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. This project allows you to interact directly with a GitHub repository and leverage semantic search to understand the codebase. You'll learn how to ask specific questions about the repository's code and receive meaningful, context-aware responses. You can follow the GitHub code here . Key considerations Quality of data : The output is only as good as the input—ensure your data is clean and well-structured. Chunk size : Proper chunking of data is crucial for optimal performance. Performance evaluation : Regularly assess the performance of your RAG-based application. Components Elasticsearch : Serves as the vector database for efficient storage and retrieval of embeddings. LlamaIndex : A framework for building applications powered by LLM. OpenAI : Used for both the LLM and generating embeddings. Architecture Ingestion The process starts by cloning a GitHub repository locally to the /tmp directory. The SimpleDirectoryReader is then used to load the cloned repository for indexing, the documents are split into chunks based on file type, utilizing CodeSplitter for code files, along with JSON , Markdown , and SentenceSplitters for other formats, see: If you want to add more support language into this code, you can just add a new parser and extension to the parsers_and_extensions list. After parsing the nodes, embeddings are generated using the text-embedding-3-large model and stored in Elasticsearch. The embedding model is declared using the Setting bundle, which a global variable: This is then utilized in the main function as part of the Ingest Pipeline. Since it's a global variable, there's no need to call it again during the ingestion process: The code block above starts by parsing the documents into smaller chunks (nodes) and then initializes a connection to Elasticsearch. The IngestionPipeline is created with the specified Elasticsearch vector store, and the pipeline is executed to process the nodes and store their embeddings in Elasticsearch, while displaying the progress during the process. At this point we should have your data indexed in Elasticsearch with the embeddings generated and stored. 
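The post links to the full source on GitHub; purely as a simplified sketch of such an ingestion step (LlamaIndex with an Elasticsearch vector store and OpenAI embeddings; the repository path, index name and the use of a single SentenceSplitter instead of the project's per-file-type parsers are all simplifications):

```python
import os

from llama_index.core import Settings, SimpleDirectoryReader
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

# Global embedding model, as described above.
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")

# Load the locally cloned repository (placeholder path).
documents = SimpleDirectoryReader("/tmp/cloned-repo", recursive=True).load_data()

vector_store = ElasticsearchStore(
    index_name="github-assistant",  # placeholder index name
    es_cloud_id=os.environ["ELASTIC_CLOUD_ID"],
    es_user=os.environ["ELASTIC_USER"],
    es_password=os.environ["ELASTIC_PASSWORD"],
)

# Chunk the documents, embed the chunks and store them in Elasticsearch.
pipeline = IngestionPipeline(
    transformations=[SentenceSplitter(chunk_size=512), Settings.embed_model],
    vector_store=vector_store,
)
pipeline.run(documents=documents, show_progress=True)
```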
Below is one example of how the document looks like in ESS: Query Once the data is indexed, you can query the Elasticsearch index to ask questions about the codebase. The query.py script allows you to interact with the indexed data and ask questions about the codebase. It retrieves a query input from the user, creates an embedding using the same OpenAIEmbedding model used in the index.py , and sets up a query engine with the VectorStoreIndex loaded from the Elasticsearch vector store. The query engine uses similarity search, retrieving the top 3 most relevant documents based on the query's similarity to the stored embeddings. The results are summarized in a tree-like format using response_mode=\"tree_summarize\" , you can see the code snippet below: Installation 1. Clone the repository : 2. Install required libraries : 3. Set up environment variables : Update the .env file with your Elasticsearch credentials and the target GitHub repository details (eg, GITHUB_TOKEN , GITHUB_OWNER , GITHUB_REPO , GITHUB_BRANCH , ELASTIC_CLOUD_ID , ELASTIC_USER , ELASTIC_PASSWORD , ELASTIC_INDEX ). Here's one example of the .env file: Usage 1. Index your data and create the embeddings by running : 2. Ask questions about your codebase by running : Example: Questions you might want to ask: Give me a detailed description of what are the main functionalities implemented in the code? How does the code handle errors and exceptions? Could you evaluate the test coverage of this codebase and also provide detailed insights into potential enhancements to improve test coverage significantly? Evaluation The evaluation.py code processes documents, generates evaluation questions based on the content, and then evaluates the responses for relevancy ( Whether the response is relevant to the question ) and faithfulness ( Whether the response is faithful to the source content ) using a LLM. Here’s a step-by-step guide on how to use the code: You can run the code without any parameters, but the example above demonstrates how to use the parameters. Here's a breakdown of what each parameter does: Document processing: --num_documents 5 : The script will process a total of 5 documents. --skip_documents 2 : The first 2 documents will be skipped, and the script will start processing from the 3rd document onward. So, it will process documents 3, 4, 5, 6, and 7. Question generation: After loading the documents, the script will generate a list of questions based on the content of these documents. --num_questions 3 : Out of the generated questions, only 3 will be processed. --skip_questions 1 : The script will skip the first question in the list and process questions starting from the 2nd question. --process_last_questions : Instead of processing the first 3 questions after skipping the first one, the script will take the last 3 questions in the list. Now what? Here are a few ways you can utilize this code: Gain insights into a specific GitHub repository by asking questions about the code, such as locating functions or understanding how parts of the code work. Build a multi-agent RAG system that ingests GitHub PRs and issues, enabling automatic responses to issues and feedback on PRs. Combine your logs and metrics with the GitHub code in Elasticsearch to create a Production Readiness Review using RAG, helping assess the maturity of your services. Happy RAG! 
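Circling back to the query step described above, a simplified sketch (same placeholder index and credentials as the ingestion sketch, plus an OpenAI API key for the LLM) might look roughly like this:

```python
import os

from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.vector_stores.elasticsearch import ElasticsearchStore

# The same embedding model used at indexing time must be used for querying.
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-large")

vector_store = ElasticsearchStore(
    index_name="github-assistant",
    es_cloud_id=os.environ["ELASTIC_CLOUD_ID"],
    es_user=os.environ["ELASTIC_USER"],
    es_password=os.environ["ELASTIC_PASSWORD"],
)

# Rebuild an index handle from the existing vector store, retrieve the top 3 most
# similar chunks and summarize them in a tree-like fashion.
index = VectorStoreIndex.from_vector_store(vector_store)
query_engine = index.as_query_engine(similarity_top_k=3, response_mode="tree_summarize")

print(query_engine.query("How does the code handle errors and exceptions?"))
```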
Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Jump to Key considerations Components Architecture Ingestion Query Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"GitHub Assistant: Interact with your GitHub repository using RAG and Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/github-rag-elasticsearch","meta_description":"Explore the GitHub Assistant, which uses RAG & Elasticsearch to enable semantic code queries, PR feedback and insights into GitHub repositories."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing Retrievers - Search All the Things! Learn about Elasticsearch retrievers, including Standard, kNN, text_expansion, and RRF. Discover how to use retrievers with examples. Vector Database Python JV JC By: Jeff Vestal and Jack Conradson On May 28, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. In 8.14, Elastic introduced a new search capability called “Retrievers” in Elasticsearch. Keep reading to discover their simplicity and efficiency and how they can empower you in your search operations. Retrievers are a new abstraction layer added to the Search API in Elasticsearch. They offer the convenience of configuring multi-stage retrieval pipelines within a single _search API call. This architecture simplifies search logic in your application by eliminating the need for multiple Elasticsearch API calls for complex search queries. It also reduces the need for client-side logic, often required to combine results from multiple queries. 
Initial types of Retrievers There are three types of retrievers included in the initial release. Each of these retrievers is designed for a specific purpose, and when combined, they allow for complex search operations. The available types are: Standard - Return the top documents from a traditional query. These allow backward compatibility by supporting existing Query DSL request syntax, allowing you to migrate to the retriever framework at your own pace. kNN - Return the top documents from a kNN search. RRF - Combine and rank multiple first-stage retrievers into a single result set with no or minimal user tuning using the reciprocal rank fusion algorithm. An RRF retriever is a compound retriever whose filter element is propagated to its sub-retrievers. How are Retrievers different, and why are they useful? With traditional queries, the query is part of the overall search API call. Retrievers differ by being designed as standalone entities that can be used in isolation or easily combined. This modular approach allows for more flexibility when designing search strategies. Retrievers are designed to be part of a “retriever tree,” a hierarchical structure that defines search operations by clarifying their sequence and logic. This structure makes complex searches more manageable and easier for developers to understand and allows new functionality to be easily added in the future. Retrievers enable composability, allowing you to build pipelines and integrate different retrieval strategies. This allows for easy testing of varying retrieval combinations. They also provide more control over how documents are scored and filtered. You can, for example, specify a minimum score threshold, apply a complex filter without affecting scoring, and use parameters like terminate_after for performance optimizations. Backward compatibility is maintained with traditional query elements, automatically translating them to the appropriate retriever. Retrievers usage examples Let’s look at some examples of using retrievers. We are using an IMDB sample dataset. You can run the accompying jupyter notebook to ingest IMDB data into a Serverless Search project and run the below examples yourself! The high-level setup is: overview - a short summary of the movie names the movie's name overview_dense - a dense_vector generated from an e5-small model overview_sparse - a sparse vector using Elastic’s ELSER model. Only returning the text version of names and overview using fields and setting _source:false Standard - Search All the Text! kNN - Search all the Dense Vectors! text_expansion - Search all the Sparse Vectors! rrf - Combine All the Things! Current restrictions with Retrievers Retrievers come with certain restrictions users should be aware of. For example, only the query element is allowed when using compound retrievers. This enforces a cleaner separation of concerns and prevents the complexity from overly nested or independent configurations. Additionally, sub-retrievers may not use elements restricted from having a compound retriever as part of the retriever tree. These restrictions enforce performance and composability even with complex retrieval strategies. Retrievers are initially released as a technical preview, so their API may change. Conclusion Retrievers represent a significant step forward with Elasticsearch's retrieval capabilities and user-friendliness. They can be chained in a pipeline fashion where each retriever applies its logic and passes the results to the next item in the chain. 
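As an illustration of the usage examples listed above (the original request bodies are not reproduced here), an RRF retriever combining a lexical and a kNN retriever might look like this; it assumes Elasticsearch 8.14+ with a matching Python client, a hypothetical imdb_movies index, and a placeholder query vector that would normally come from the same e5-small model used at indexing time:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

resp = es.search(
    index="imdb_movies",
    retriever={
        "rrf": {
            "retrievers": [
                # Lexical retriever over the overview text.
                {"standard": {"query": {"match": {"overview": "time travel adventure"}}}},
                # kNN retriever over the dense vector field.
                {
                    "knn": {
                        "field": "overview_dense",
                        "query_vector": [0.1] * 384,  # placeholder embedding
                        "k": 10,
                        "num_candidates": 50,
                    }
                },
            ]
        }
    },
    fields=["names", "overview"],
    source=False,
)
```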
By allowing for more structured, flexible, and efficient search operations, retrievers can significantly enhance the search experience. The following resources provide additional details on Retrievers. Semantic Reranking in Elasticsearch with Retrievers Retrievers API documentation Retrievers - Search Your Data documentation Try the above code yourself! You can run the accompying jupyter notebook to ingest IMDB data into an Elastic Serverless Search project! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Initial types of Retrievers How are Retrievers different, and why are they useful? Retrievers usage examples Standard - Search All the Text! kNN - Search all the Dense Vectors! Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing Retrievers - Search All the Things! - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-retrievers","meta_description":"Learn about Elasticsearch retrievers, including Standard, kNN, text_expansion, and RRF. Discover how to use retrievers with examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series GenAI for customer support This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! Part1 Inside Elastic July 22, 2024 GenAI for Customer Support — Part 2: Building a Knowledge Library This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time! CM By: Cory Mangini Part2 Inside Elastic August 9, 2024 GenAI for Customer Support — Part 3: Designing a chat interface for chatbots... 
for humans This series gives you an inside look at how we’re using generative AI in customer support. Join us as we share our journey in designing a GenAI chatbot interface. IM By: Ian Moersen Part3 Inside Elastic August 22, 2024 GenAI for customer support — Part 4: Tuning RAG search for relevance This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this section on tuning RAG search for relevance. AS By: Antonio Schönmann Part4 Inside Elastic November 8, 2024 GenAI for customer support — Part 5: Observability This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time, focusing in this entry on observability for the Support Assistant. AJ By: Andy James Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"GenAI for customer support - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/genai-for-customer-support","meta_description":"This series gives you an inside look at how we're using generative AI in customer support. Join us as we share our journey in real-time!"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Smokin' fast BBQ with hardware accelerated SIMD instructions How we optimized vector comparisons in BBQ with hardware accelerated SIMD (Single Instruction Multiple Data) instructions. Vector Database Lucene CH By: Chris Hegarty On December 4, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. As we continue on our mission to make Elasticsearch and Apache Lucene the best place to store and search your vector data, we introduced a new approach in BBQ (Better Binary Quantization). BBQ is a huge step forward, bringing significant (32x) efficiencies by compressing the stored vectors, while maintaining high ranking quality and at the same time offering Smokin' Fast performance. You can read about how BBQ quantizes float32 to single bit vectors for storage, outperforms traditional approaches like Product Quantization in indexing speed (20-30x less quantization time) and query speed (2-5x faster queries), see the awesome accuracy and recall BBQ achieves for various datasets, in the BBQ blog. Since high-level performance is already covered in the BBQ blog, here we'll go a bit deeper to get a better understanding of how BBQ achieves such great performance. In particular, we'll look at how vector comparisons are optimized by hardware accelerated SIMD (Single Instruction Multiple Data) instructions. 
These SIMD instructions perform data-level parallelism so that one instruction operates on multiple vector components at a time. We'll see how Elasticsearch and Lucene target specific low-level SIMD instructions, like the AVX's VPOPCNTQ on x64 and NEON's vector instructions on ARM, to speed up vector comparisons. Why do we care so much about comparing vectors? Vector comparisons dominate the execution time within a vector database and are commonly the single most costly factor. We see this in profile traces, whether it be with float32 , int8 , int4 , or other levels of quantization. This is not surprising, since a vector database has to compare vectors a lot! Whether it be during indexing, e.g. building the HNSW graph, or during search as the graph or partition is navigated to find the nearest neighbours. Low-level performance improvements in vector comparisons can be really impactful on overall high-level performance. Elasticsearch and Lucene support a number of vector similarity metrics, like dot product, cosine, and Euclidean, however we'll focus on just dot product, since the others can be derived from that. Even though we have the ability in Elasticsearch to write custom native vector comparators, our preference is to stay in Java-land whenever possible so that Lucene can more easily get the benefits too. Comparing query and stored vectors For our distance comparison function to be fast, we need to simplify what it's doing as much as possible so it translates to a set of SIMD instructions that execute efficiently on the CPU. Since we've quantized the vectors to integer values we can load more of their components into a single register and also avoid the more costly floating-point arithmetic - this is a good start, but we need more. BBQ does asymmetric quantization; the query vector is quantized to int4 , while the stored vectors are even more highly compressed to just single bit values. Since a dot product is the sum of the product of the component values, we can immediately see that the only query component values that can contribute positively to the result are the values where the stored vector is 1 . A further observation is that if we translate the query vector in a way where the respective positional bits (1, 2, 3, 4) from each component value are grouped together, then our dot product reduces to a set of basic bitwise operations; AND + bit count for each component, followed by a shift representing the respective position of the query part, and lastly a sum to get the final result. See the BBQ blog for a more detailed explaination, but the following image visualizes an example query translation. Logically, the dot product reduces to the following, where d is the stored vector, and q1 , q2 , q3 , q4 are the respective positional parts of the translated query vector: For execution purposes, we layout the translated query parts, by increasing respective bit position, contiguously in memory. So we now have our translated query vector, q[] , which is four times the size of that of the stored vector d[] . A scalar java implementation of this dot product looks like this: While semantically correct, this implementation is not overly efficient. Let's move on and see what a more optimal implementation looks like, after which we can compare the runtime performance of each. Where does the performance come from? The implementation we've seen so far is a straightforward naive scalar implementation. 
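The article's naive implementation is a short Java loop that is not reproduced above; as a purely illustrative restatement of the same logic in Python (packed bytes, AND plus bit count per positional query part, then a shift by that part's weight):

```python
def bbq_dot_product(q: bytes, d: bytes) -> int:
    """Naive dot product between a translated int4 query and a 1-bit stored vector.

    d packs the stored vector one bit per dimension (dims / 8 bytes); q holds the
    four positional bit groups of the translated query laid out contiguously, so
    len(q) == 4 * len(d).
    """
    size = len(d)
    assert len(q) == 4 * size
    result = 0
    for part in range(4):
        acc = 0
        for j in range(size):
            # AND the stored bits with this slice of the query, then count the set bits.
            acc += bin(q[part * size + j] & d[j]).count("1")
        # Weight the count by the bit position this query part represents.
        result += acc << part
    return result
```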
To speed things up we rewrite the dot product with the Panama Vector API in order to explicitly target specific SIMD instructions. Here's a simplified snippet of what the code looks like for just one of the query parts - remember we need to do this four times, one for each of the translated int4 query parts. Here we're explicitly targeting AVX, operating on 256 bits per loop iteration. First performing a logical AND between the vq and vd , then a bit count on the result of that, before finally adding it to the sum accumulator. While we're interested in the bit count, we do however interpret the bytes in the vectors as longs, since that simplifies the addition and ensures that we don't risk overflowing the accumulator. A final step is then needed to horizontally reduce the lanes of the accumulator vector to a scalar result, before shifting by the representative query part number. Disassembling this on my Intel Skylake, we can clearly see the body of the loop. The rsi and rdx registers hold the address of the vectors to be compared, from which we load the next 256 bits into ymm4 and ymm5 , respectively. With our values loaded we now perform a bitwise logical AND, vpand , storing the result in ymm4 . Next you can see the population count instruction, vpopcntq , which counts the number of bits set to one. Finally we add 0x20 (32 x 8bits = 256bits) to the loop counter and continue. We're not showing it here for simplicity, but we actually unroll the 4 query parts and perform them all per loop iteration, as this reduces the load of the data vector. We also use independent accumulators for each of the parts, before finally reducing. There are 128 bit variants whenever the vector dimensions do not warrant striding 256 bits at a time, or on ARM where it compiles to sequences of Neon vector instructions AND , CNT , and UADDLP . And, of course, we deal with the tails in a scalar fashion, which thankfully doesn't occur all that often in practice given the dimension sizes that most popular models use. We continue our experiments with AVX 512, but so far it's not proven worthwhile to stride 512 bits at a time over this data layout, given common dimension sizes. How much does SIMD improve things? When we compare the scalar and vectorized dot product implementations, we see from 8x to 30x throughput improvement for a range of popular dimensions from 384 to 1536, respectively. With the optimized dot product we have greatly improved the overall performance such that the vector comparison is no longer the single most dominant factor when searching and indexing with BBQ. For those interested, here are some links to the benchmarks and code . Wrapping up BBQ is a new technique that brings both incredible efficiencies and awesome performance. In this blog we looked at how vector distance comparison is optimized in BBQ by hardware accelerated SIMD instructions. You can read more about the index and search performance, along with accuracy and recall in the BBQ blog. BBQ is released as tech-preview in Elasticsearch 8.16, and in Serverless right now! Along with new techniques like BBQ, we're continuously improving the low-level performance of our vector database. You can read more about what we've already done in these other blogs; FMA , FFM , and SIMD . Also, expect to see more of these low-level performance focused blogs like this one in the future, as we keep improving the performance of Elasticsearch so that it's the best place to store and search your vector data. 
Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Why do we care so much about comparing vectors? Comparing query and stored vectors Where does the performance come from? How much does SIMD improve things? Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Smokin' fast BBQ with hardware accelerated SIMD instructions - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/bbq-vector-comparison-simd-instructions","meta_description":"Explore how Elastic BBQ achieves its performance. Mainly how vector distance comparisons are optimized in BBQ by hardware accelerated SIMD instructions."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). Lucene Vector Database BT By: Benjamin Trent On January 6, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Our Better Binary Quantization (BBQ) indices are now even better(er). Recall improvements across the board (in extreme cases up to 20%) and unlocking the future of quantizing vectors to any bit size. As of Elasticsearch 8.18, BBQ indices are now backed by our state of the art optimized scalar quantization algorithm. 
Scalar quantization history Introduced in Elasticsearch 8.12, scalar quantization was initially a simple min/max quantization scheme. Per lucene segment, we would find the global quantile values for a given confidence interval. These quantiles are then used as the minimum and maximum to quantize all the vectors. While this naive quantization is powerful, it only really works for whole byte quantization. Static confidence intervals mean static quantiles. This is calculated once for all vectors in a given segment and works well for higher bit values. In Elasticsearch 8.15, we added half-byte, or int4, quantization . To achieve this with high recall, we added an optimization step, allowing for the best quantiles to be calculated dynamically. Meaning, no more static confidence intervals. Lucene will calculate the best global upper and lower quantiles for each segment. Achieving 8x reduction in memory utilization over float32 vectors. Dynamically searching for the best quantiles to reduce the vector similarity error. This was done once, globally, over a sample set of the vectors and applied to all. Finally, now in 8.18, we have added locally optimized scalar quantization. It optimizes quantiles per individual vector. Allowing for exceptional recall at any bit size, even single bit quantization. What is Optimized Scalar Quantization? For an in-depth explanation of the math and intuition behind optimized scalar quantization, check out our blog post on Optimized Scalar Quantization . There are three main takeaways from this work: Each vector, is centered on the Apache Lucene segment's centroid. This allows us to make better use of the possible quantized vectors to represent the dataset as a whole. Every vector is individually quantized with a unique set of optimized quantiles. Asymmetric quantization is used allowing for higher recall with the same memory footprint. In short, when quantizing each vector: We center the vector on the centroid Compute a limited number of iterations to find the optimal quantiles. Stopping early if the quantiles are unchanged or the error (loss) increases Pack the resulting quantized vectors Store the packed vector, its quantiles, the sum of its components, and an extra error correction term Here is a step by step view of optimizing 2 bit vectors. After the fourth iteration, we would normally stop the optimization process as the error (loss) increased. The first cell is each individual components error. The second is the distribution of 2 bit quantized vectors. Third is how the overall error is changing. Fourth is current step's quantiles overlayed of the raw vector being quantized. Storage and retrieval of optimized scalar quantization The storage and retrieval of optimized scalar quantization vectors are similar to BBQ. The main difference is the particular values we store. Stored for every binary quantized vector: dims/8 bytes, upper and lower quantiles, an additional correction term, the sum of the quantized components. One piece of nuance is the correction term. For Euclidean distance , we store the squared norm of the centered vector. For dot product we store the dot product between the centroid and the uncentered vector. Performance Enough talk. Here are the results from four datasets. Cohere's 768 dimensioned multi-lingual embeddings. This is a well distributed inner-product dataset. Cohere's 1024 dimensioned multi-lingual embeddings. This embedding model is well optimized for quantization. E5-Small-v2 quantized over the quora dataset. 
This model typically does poorly with binary quantization. GIST-1M dataset. This scientific dataset opens some interesting edge cases for inner-product and quantization. Here are the results for Recall@10|50 Dataset BBQ BBQ with OSQ Improvement Cohere 768 0.933 0.938 0.5% Cohere 1024 0.932 0.945 1.3% E5-Small-v2 0.972 0.975 0.3% GIST-1M 0.740 0.989 24.9% Across the board, we see that BBQ backed by our new optimized scalar quantization improves recall, and dramatically so for the GIST-1M dataset. But, what about indexing times? Surely all this per vector optimizations must add up. The answer is no. Here are the indexing times for the same datasets. Dataset BBQ BBQ with OSQ Difference Cohere 768 368.62s 372.95s +1% Cohere 1024 307.09s 314.08s +2% E5-Small-v2 227.37s 229.83s < +1% GIST-1M 1300.03s* 297.13s -300% Since the quantization methodology works so poorly over GIST-1M when using inner-product, it takes an exceptionally long time to build the HNSW graph as the vector distances are not well distinguished. Conclusion Not only does this new, state of the art quantization methodology improve recall for our BBQ indices, it unlocks future optimizations. We can now quantize vectors to any bit size and we want to explore how to provide 2 bit quantization, striking a balance between memory utilization and recall with no reranking. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Scalar quantization history What is Optimized Scalar Quantization? Storage and retrieval of optimized scalar quantization Performance Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/optimized-scalar-quantization-elasticsearch","meta_description":"Learn about optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ)."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Java Categories All Articles Agent AutoOps Developer Experience Elastic Cloud Hosted Elastic Cloud Serverless Generative AI How To Ingestion Inside Elastic Integrations Lucene ML Research Search Analytics Search Relevance Vector Database Coding Languages Subscribe Integrations Java +1 October 8, 2024 LangChain4j with Elasticsearch as the embedding store LangChain4j (LangChain for Java) has Elasticsearch as an embedding store. Discover how to use it to build your RAG application in plain Java. DP By: David Pilato Java How To October 3, 2024 Testing your Java code with mocks and real Elasticsearch Learn how to write your automated tests for Elasticsearch, using mocks and Testcontainers PP By: Piotr Przybyl Integrations Java +1 September 23, 2024 Introducing LangChain4j to simplify LLM integration into Java applications LangChain4j (LangChain for Java) is a powerful toolset to build your RAG application in plain Java. DP By: David Pilato ES|QL Java +1 May 2, 2024 ES|QL queries to Java objects Learn how to perform ES|QL queries with the Java client. Follow this guide for step-by-step instructions, including examples. LT By: Laura Trotta Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Java - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/category/java-programming","meta_description":"Java articles from Elasticsearch Labs"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Playground: Experiment with RAG applications with Elasticsearch in minutes Learn about Elastic's Playground and how to use it to experiment with RAG applications using Elasticsearch. Vector Database Generative AI Integrations Developer Experience How To JM SC By: Joe McElroy and Serena Chou On June 28, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. 
In this blog, you'll learn about Playground and how to use it to experiment with Retrieval-Augmented Generation (RAG) applications using Elasticsearch. Update: Try the new Playground app in the Elastic demo gallery. What is Playground? Elastic's playground experience is a low-code interface for developers to explore grounding LLMs of their choice with their own private data in minutes. While prototyping conversational search, the ability to rapidly iterate on and experiment with key components of a RAG workflow (for example: hybrid search, or adding reranking ) are important— to get accurate and hallucination-free responses from LLMs. Elasticsearch vector database and the Search AI platform provides developers with a wide range of capabilities such as comprehensive hybrid search, and to use innovation from a growing list of LLM providers. Our approach in our playground experience allows you to use the power of those features, without added complexity. A/B test LLMs and choose different inference providers Playground’s intuitive interface allows you to A/B test different LLMs from model providers (like OpenAI and Anthropic) and refine your retrieval mechanism, to ground answers with your own data indexed into one or more Elasticsearch indices. The playground experience can leverage transformer models directly in Elasticsearch, but is also amplified with the Elasticsearch Open Inference API which integrates with a growing list of inference providers including Cohere and Azure AI Studio . The best context window with retrievers and hybrid search As Elasticsearch developers already know, the best context window is built with hybrid search . Your strategy for architecting towards this outcome requires access to many shapes of vectorized and plain text data, that can be chunked and spread across multiple indices. We’re helping you simplify query construction with newly introduced query retrievers to Search All the Things! With three key retrievers (available now in 8.14 and Elastic Cloud Serverless) hybrid search with scores normalized with RRF is one unified query away. Using retrievers, the playground understands the shape of the selected data and will automatically generate a unified query on your behalf. Store vectorized data and explore a kNN retriever, or add metadata and context to generate a hybrid search query by selecting your data. Coming soon, semantic reranking can easily be incorporated into your generated query for even higher-quality recall. Once you’ve tuned and configured your semantic search to production standards, you’re ready to export the code and either finalize the experience in your application with your Python Elasticsearch language client or LangChain Python integration. Playground is accessible today on Elastic Cloud Serverless and available today in 8.14 on Elastic Cloud . Using the Playground Playground is accessible from within Kibana (the Elasticsearch UI) by navigating to “Playground” from within the side navigation. Connect to your LLM Playground supports chat completion models such as GPT-4o from OpenAI, Azure OpenAI, or Anthropic through Amazon Bedrock. To start, you need to connect to either one of these model providers to bring your LLM of choice. Chat with your data Any data can be used, even BM25-based indices. 
Your data fields can optionally be transformed using text embedding models (like our zero-shot semantic search model ELSER), but this is not a requirement.Getting Started is extremely simple - just select the indices you want to use to ground your answers and start asking questions.In this example, we are going to use a PDF and start with using BM25, with each document representing a page of the PDF. Indexing a PDF document with BM25 with Python First, we install the dependencies. We use the pypdf library to read PDFs and request to retrieve them. Then we read the file, creating an array of pages containing the text. And we then import this into elasticsearch, under the my_pdf_index_bm25 index. Chatting with your data with Playground Once we have connected our LLM with a connector and chosen the index, we can start asking questions about the PDF. The LLM will now easily provide answers to your data. What happens behind the scenes? When we choose an index, we automatically determine the best retrieval method. In this case, BM25 keyword search is only available, so we generate a multi-match type query to perform retrieval. As we only have one field, we defaulted to searching for this. If you have more than one field, you can choose the fields you want to search to improve the retrieval of relevant documents. Asking a question When you ask a question, Playground will perform a retrieval using the query to find relevant documents matching your question. It will then use this as context and provide it with the prompt, grounding the answer that’s returned from your chosen LLM model. We use a particular field from the document for the context. In this example, Playground has chosen the field named “text,” but this can be changed within the “edit context” action. By default, we retrieve up to 3 documents for the context, but you can adjust the number from within the edit context flyout as well. Asking a follow up question Typically, the follow-up question is tied to a previous conversation. With that in mind, we ask the LLM to rewrite the follow-up question using the conversation into a standalone question, which is then used for retrieval. This allows us to retrieve better documents to use as context to help answer the question. Context When documents are found based on your question, we provide these documents to the LLM as context to ground the LLM’s knowledge when answering. We automatically choose a single index field we believe is best, but you can change this field by going to the edit context flyout. Improving retrieval with Semantic Search and Chunking Since our query is in the form of a question, it is important for retrieval to be able to match based on semantic meaning. With BM25 we can only match documents that lexically match our question, so we’ll need to add semantic search too. Sparse Vector Semantic search with ELSER One simple way to start with semantic search is to use Elastic’s ELSER sparse embedding model with our data. Like many models of this size and architecture, ELSER has a typical 512-token limit and requires a design choice of an appropriate chunking strategy to accommodate it. In upcoming versions of Elasticsearch, we’ll chunk by default as part of the vectorization process, but in this version, we’ll follow a strategy to chunk by paragraphs as a starting point. The shape of your data may benefit from other chunking strategies, and we encourage experimentation to improve retrieval. 
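To make the earlier BM25 setup concrete before moving on to semantic chunking, a minimal version of that ingestion might look like the following; the PDF file name and the connection details are placeholders, and this assumes the elasticsearch and pypdf packages are installed.

```python
# Minimal sketch of the BM25 ingestion described earlier: one document per
# PDF page, indexed as plain text into my_pdf_index_bm25. Connection details
# and the file name are placeholders.
from elasticsearch import Elasticsearch, helpers
from pypdf import PdfReader

es = Elasticsearch("http://localhost:9200")        # placeholder cluster

reader = PdfReader("sample.pdf")                   # placeholder PDF
pages = [page.extract_text() for page in reader.pages]

helpers.bulk(
    es,
    (
        {"_index": "my_pdf_index_bm25", "_source": {"text": text, "page_number": i}}
        for i, text in enumerate(pages, start=1)
    ),
)
```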
Chunking and ingesting the PDF with pyPDF and LangChain To simplify the example, we will use LangChain tooling to load and split the pages into passages. LangChain is a popular tool for RAG development that can be integrated and used with the Elasticsearch vector database and semantic reranking capabilities with our updated integration. Creating an ELSER inference endpoint The following REST API calls can be executed to download, deploy, and check the model's running status. You can execute these using Dev Tools within Kibana. Ingesting into Elasticsearch Next we will set up an index and attach a pipeline that will handle the inference for us. Splitting pages into passages and ingesting into Elasticsearch Now that the ELSER model has been deployed, we can start splitting the PDF pages into passages and ingesting them into Elasticsearch. That’s it! We should have passages ingested into Elasticsearch that have been embedded with ELSER. See it in action on Playground Now when selecting the index, we generate an ELSER-based query using the deployment_id for embedding the query string. When asking a question, we now have a semantic search query that is used to retrieve documents that match the semantic meaning of the question. Hybrid Search made simple Enabling the text field can also enable hybrid search. When we retrieve documents, we now search for both keyword matches and semantic meaning and rank the two result sets with the RRF algorithm. Improve the LLM’s answers With Playground, you can adjust your prompt, tweak your retrieval, and create multiple indices (chunking strategy and embedding models) to improve and compare your responses. In the future, we will provide hints on how to get the most out of your index, suggesting methods to optimize your retrieval strategy. System prompt By default, we provide a simple system prompt which you can change within model settings. This is used in conjunction with a wider system prompt. You can change the simple system prompt by just editing it. Optimizing context Good responses rely on great context. Using methods like chunking your content and optimizing your chunking strategy for your data is important. Along with chunking your data, you can improve retrieval by trying out different text embedding models to see what gives you the best results. In the above example, we have used Elastic’s own ELSER model, but the inference service supports a wide number of embedding models that may suit your needs better. Other benefits of optimizing your context include better cost efficiency and speed: cost is calculated based on tokens (input and output). The more relevant documents we can provide, aided by chunking and Elasticsearch's powerful retrieval capabilities, the lower the cost and faster latency will be for your users. If you notice, the input tokens we used in the BM25 example are larger than those in the ELSER example. This is because we effectively chunked our documents and only provided the LLM with the most relevant passages on the page. Final Step! Integrate RAG into your application Once you’re happy with the responses, you can integrate this experience into your application. View code offers example application code for how to do this within your own API. For now, we provide examples with OpenAI or LangChain, but the Elasticsearch query, the system prompt, and the general interaction between the model and Elasticsearch are relatively simple to adapt for your own use. 
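Pulling the semantic pieces above together, a condensed sketch of the ELSER setup might look like this in Python. It reuses the pages list from the BM25 step, uses a semantic_text field as a simpler stand-in for the ingest-pipeline wiring the walkthrough describes, and the endpoint name, index name, and paragraph-splitting heuristic are all placeholders.

```python
# Condensed sketch: create an ELSER endpoint, an index whose `passage` field is
# embedded automatically, and ingest paragraph-level passages from each page.
# Endpoint, index, and field names are placeholders; security/auth is omitted.
import requests
from elasticsearch import Elasticsearch

ES_URL = "http://localhost:9200"                   # placeholder
es = Elasticsearch(ES_URL)

# 1) ELSER sparse-embedding endpoint via the inference API (shown as a raw REST call).
requests.put(
    f"{ES_URL}/_inference/sparse_embedding/my-elser",
    json={"service": "elser", "service_settings": {"num_allocations": 1, "num_threads": 1}},
)

# 2) Index where `passage` is embedded with that endpoint at ingest time.
es.indices.create(
    index="my_pdf_index_elser",
    mappings={"properties": {"passage": {"type": "semantic_text", "inference_id": "my-elser"}}},
)

# 3) Split each page into paragraph passages and index them.
for page_number, page_text in enumerate(pages, start=1):   # `pages` from the BM25 step
    for passage in (p.strip() for p in page_text.split("\n\n")):
        if passage:
            es.index(
                index="my_pdf_index_elser",
                document={"passage": passage, "page_number": page_number},
            )
```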
Conclusion Conversational search experiences can be built with many approaches in mind, and the choices can be paralyzing, especially with the pace of innovation in new reranking and retrieval techniques, both of which apply to RAG applications. With our playground, those choices are simplified and intuitive, even with the vast array of capabilities available to the developer. Our approach is unique in enabling hybrid search as a predominant pillar of the construction immediately, with an intuitive understanding of the shape of the selected and chunked data and amplified access across multiple external providers of LLMs. Build, test, fun with playground Try the Playground demo or head over to Playground docs to get started today! Explore Search Labs on GitHub for new cookbooks and integrations for providers such as Cohere, Anthropic, Azure OpenAI, and more. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to What is Playground? A/B test LLMs and choose different inference providers The best context window with retrievers and hybrid search Using the Playground Connect to your LLM Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Playground: Experiment with RAG applications with Elasticsearch in minutes - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/rag-playground-introduction","meta_description":"Learn about Elastic's Playground and how to use it to experiment with RAG applications using Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Semantic search, leveled up: now with native match, knn and sparse_vector support Semantic text search becomes even more powerful, with native support for match, knn and sparse_vector queries. This allows us to keep the simplicity of the semantic query while offering the flexibility of the Elasticsearch query DSL. Search Relevance Vector Database KD By: Kathleen DeRusso On March 6, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elasticsearch’s semantic query is incredibly powerful, allowing users to perform semantic search over data configured in semantic_text fields. Much of this power lies in simplicity: just set up a semantic_text field with the inference endpoint you want to use, and then ingest content as if indexing content into a regular text field. The inference happens automatically and transparently, making it simple to set up and use a search index with semantic functionality. This ease of use does come with some tradeoffs: we simplified semantic search with semantic_text by making judgments on default behavior that fit the majority of use cases. Unfortunately, this means that some customizations available for traditional vector search queries aren’t present in the semantic query . We didn’t want to add all of these options directly to the semantic query, as that would undermine the simplicity that we strive for. Instead, we expanded the queries that support the semantic_text field, leaving it up to you to choose the best query that meets your needs. Let’s walk through these changes, starting with creating a simple index with a semantic_text field: We made match happen! First and most importantly, the match query will now work with semantic_text fields! This means that you can change your old semantic query: Into a simple match query: We can see the benefits of semantic search here because we’re searching for “song lyrics about love”, none of which appears in the indexed document. This is because of ELSER’s text expansion. But wait, it gets better! If you have multiple indices, and the same field name is semantic_text in one field and perhaps text in the other field, you can still run match queries against these fields. Let’s create another index, with the same field names, but different types ( text instead of semantic_text ). Here’s a simple example to illustrate: Here, searching for “crazy” brings up both the lexical match that has “crazy” in the title, and the semantic lyric “lose my mind.” There are some caveats to keep in mind when using the match functionality with semantic_text : The underlying semantic_text field has a limitation where you can’t use multiple inference IDs on the same field. This limitation extends to match — meaning that if you have two semantic_text fields with the same name, they need to have the same inference ID or you’ll get an error. You can work around this by creating different names and querying them in a boolean query or a compound retriever . 
Depending on what model you use, the scores between lexical (text) matches and semantic matches will likely be very different. In order to get the best ranking of results, we recommend using second stage rerankers such as semantic reranking or RRF . Semantic search using the match query is also available in ES|QL ! Here’s the same example as above, but using ES|QL: Expert-level semantic search with knn and sparse_vector Match is great, but sometimes you want to specify more vector search options than the semantic query supports. Remember, the tradeoff of making the semantic query as simple as it is involved making some decisions on default behavior. This means that if you want to take advantage of some of the more advanced vector search features, perhaps num_candidates or filter from the knn query or token pruning in the sparse_vector query , you won’t be able to do so using the semantic query. In the past, we provided some workarounds to this, but they were convoluted and required knowing the inner workings and architecture of the semantic_text field and constructing a corresponding nested query. If you’re doing that workaround now, it will still work—however, we now support query DSL using knn or sparse_vector queries on semantic_text fields. All about that dense (vector), no trouble Here’s an example script that populates a text_embedding model and queries a semantic_text field using the knn query: The knn query can be modified with extra options to enable more advanced queries against the semantic_text field. Here, we perform the same query but add a pre-filter against the semantic_text field: Keepin’ it sparse (vector), keepin’ it real Similarly, sparse embedding models can be queried more specifically using semantic_text fields as well. Here’s an example script that adds a few more documents and uses the sparse_vector query: The sparse_vector query can be modified with extra options, to enable more advanced queries against the semantic_text field. Here, we perform the same query but add token pruning against a semantic_text field: This example significantly decreases the token frequency ratio required to pruning, which helps us show differences with such a small dataset, though they’re probably more aggressive than you’d want to see in production (remember, token pruning is about pruning irrelevant tokens to improve performance, not drastically change recall or relevance). You can see in this example that the Avril Lavigne song is no longer returned, and the scores have changed due to the pruned tokens. (Note that this is an illustrative example, and we still recommend a rescore adding pruned tokens back into scoring for most use cases). You’ll note that with all of these queries if you’re only querying a semantic_text field, you no longer need to specify the inference ID in knn ’s query_vector_builder or in the sparse_vector query. This is because it will be inferred from the semantic_text field. You can specify it if you want to override with a different (compatible!) inference ID for some reason or if you’re searching combined indices that have both semantic_text and sparse_vector or dense_vector fields though. Try it out yourself We’re keeping the original semantic query simple, but expanding our semantic search capabilities to power more use cases and seamlessly integrate semantic search with existing workflows. These power-ups are native to Elasticsearch and are already available in Serverless. They’ll be available in stack-hosted Elasticsearch starting with version 8.18. 
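As a rough sketch of what these queries can look like from the Python client (the index name, field name, and pruning thresholds are illustrative, not taken from the examples above):

```python
# Illustrative queries against a semantic_text field, using the Python client.
# The index name, field name, and pruning thresholds are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")        # placeholder

# A plain match query now works directly on a semantic_text field.
es.search(
    index="my-semantic-index",
    query={"match": {"content": "song lyrics about love"}},
)

# The sparse_vector query exposes expert options such as token pruning; the
# inference ID can be omitted because it is inferred from the semantic_text field.
es.search(
    index="my-semantic-index",
    query={
        "sparse_vector": {
            "field": "content",
            "query": "song lyrics about love",
            "prune": True,
            "pruning_config": {
                "tokens_freq_ratio_threshold": 1,
                "tokens_weight_threshold": 0.4,
            },
        }
    },
)
```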
Try it out today! Report an issue Related content Vector Database June 24, 2024 Elasticsearch new semantic_text mapping: Simplifying semantic search Learn how to use the new semantic_text field type and semantic query for simplifying semantic search in Elasticsearch. CD MP By: Carlos Delgado and Mike Pellegrini Vector Database July 23, 2024 Introducing the sparse vector query: Searching sparse vectors with inference or precomputed query vectors Learn about the Elasticsearch sparse vector query, how it works, and how to effectively use it. KD By: Kathleen DeRusso Vector Database How To December 7, 2023 Introducing kNN Query: An expert way to do kNN search Explore how the kNN query in Elasticsearch can be used and how it differs from top-level kNN search, including examples. MS BT By: Mayya Sharipova and Benjamin Trent Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Jump to We made match happen! Expert-level semantic search with knn and sparse_vector All about that dense (vector), no trouble Keepin’ it sparse (vector), keepin’ it real Try it out yourself Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Semantic search, leveled up: now with native match, knn and sparse_vector support - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/semantic-search-match-knn-sparse-vector","meta_description":"Explore the changes that leveled up semantic search and learn how to use those features, including native match, knn, and sparse_vector support."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Stateless — your new state of find with Elasticsearch Learn about Elasticsearch stateless and explore the stateless architecture, which brings performance improvements and reduces costs. ML Research LL TB QH By: Leaf Lin , Tim Brooks and Quin Hoxie On October 6, 2022 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. With stateless Elasticsearch, we are investing in building a new fully cloud native architecture to push the boundaries of scale and speed. 
In this blog, we explore where we started, the future of Elasticsearch with the introduction of a stateless architecture and the details of this architecture. Where we started The first version of Elasticsearch was released in 2010 as a distributed scalable search engine allowing users to quickly search for and surface critical insights. Twelve years and over 65,000 commits later, Elasticsearch continues to provide users with battle-tested solutions to a wide variety of search problems. Thanks to the efforts of over 1,500 contributors, including hundreds of full-time Elastic employees, Elasticsearch has constantly evolved to meet the new challenges that arise in the field of search. Early in Elasticsearch's life when data loss concerns were raised, the Elastic team underwent a multiyear effort to rewrite the cluster coordination system to guarantee that acknowledged data is stored safely. When it became clear that managing indices in large clusters was a hassle, the team worked on implementing an extensive ILM solution to automate this work by allowing users to predefine index patterns and lifecycle actions. As users found a need to store significant amounts of metric and time series data, various features such as better compression were added to reduce data size. As the storage cost of searching extensive amounts of cold data grew we invested in creating Searchable Snapshots as a way to search user data directly on low cost object stores. These investments lay the groundwork for the next evolution of Elasticsearch. With the growth of cloud-native services and new orchestration systems, we have decided it is time to evolve Elasticsearch to improve the experience when working with cloud-native systems. We believe that these changes present opportunities for operational, performance, and cost improvements while running Elasticsearch on Elastic Cloud . Where we are going — Adopting a stateless architecture One of the primary challenges when operating or orchestrating Elasticsearch is that it depends on numerous pieces of persistent state, it is therefore a stateful system. The three primary pieces are the translog, index store, and cluster metadata. This state means that storage must be persistent and cannot be lost during a node restart or replacement. The existing Elasticsearch architecture on Elastic Cloud must duplicate indexing across multiple availability zones to provide redundancy in the case of outages. We intend to shift the persistence of this data from local disks to an object store, like AWS S3. By relying on external services for storing this data, we will remove the need for indexing replication, significantly reducing the hardware associated with ingestion. This architecture also provides very high durability guarantees because of the way cloud object stores such as AWS S3, GCP Cloud Storage, and Azure Blob Storage replicate data across availability zones. Offloading index storage into an external service will also allow us to re-architect Elasticsearch by separating indexing and search responsibilities. Instead of having primary and replica instances handling both workloads, we intend to have an indexing tier and a search tier. Separating these workloads will allow them to be scaled independently and hardware selection to be more targeted for the respective use cases. It also helps solve a longstanding challenge where search and indexing load can impact one another. 
After undertaking a multi-month proof-of-concept and experimental phase, we are convinced that these object store services meet the requirements we envision for index storage and cluster metadata. Our testing and benchmarks indicate that these storage services can meet the high indexing needs of the largest clusters we have seen in Elastic Cloud. Additionally, backing the data in the object store reduces indexing costs and allows for simple tuning of the performance of search. In order to search data, Elasticsearch will use the battled-tested Searchable Snapshots model where data is permanently persisted in the cloud-native object store and local disks are used as caches for frequently accessed data. To help differentiate, we describe our existing model as \"node-to-node\" replication. In the hot tier for this model, the primary and replica shards both do the same heavy lifting to handle ingest and serve search requests. These nodes are \"stateful\" in that they rely on their local disks to safely persist the data for the shards they host. Additionally, primary and replica shards are constantly communicating to stay in sync. They do this by replicating the operations performed on the primary shard to the replica shard, which means that the cost of those operations (CPU, mainly) is incurred for each replica specified. The same shards and nodes doing this work for ingest are also serving search requests, so provisioning and scaling must be done with both workloads in mind. Beyond search and ingest, shards in the node-to-node replication model handle other intensive responsibilities, such as merging Lucene segments. While this design has its merits, we saw a lot of opportunity based on what we've learned with customers over the years and the evolution of the broader cloud ecosystem. The new architecture enables many immediate and future improvements, including: You can significantly increase ingest throughput on the same hardware, or to look at it another way, significantly improve efficiency for the same ingest workload. This increase comes from removing the duplication of indexing operations for every replica. The CPU-intensive indexing operations only need to happen once on the indexing tier, which then ships the resulting segments to an object store. From there, the data is ready to be consumed as-is by the search tier. You can separate compute from storage to simplify your cluster topology. Today, Elasticsearch has multiple data tiers (content, hot, warm, cold, and frozen) to match data with hardware profile. Hot tier is for near real-time search and frozen is for less frequently searched data. While these tiers provide value, they also increase complexity. In the new architecture, data tiers will no longer be necessary, simplifying the configuration and operation of Elasticsearch. We are also separating indexing from search, which further reduces complexity and allows us to scale both workloads independently. You can experience improved storage costs on the indexing tier by reducing the amount of data that must be stored on a local disk. Currently, Elasticsearch must store a full shard copy on hot nodes (both primary and replica) for indexing purposes. With the stateless approach of indexing directly to the object store, only a portion of that local data is required. For append only use cases, only certain metadata will need to be stored for indexing. This will significantly reduce the local storage required for indexing. You can lower storage costs associated with search queries. 
By making the Searchable Snapshots model the native mode of searching data, the storage cost associated with search queries will significantly decrease. Depending on the search latency needs for users, Elasticsearch will allow adjustments to increase local caching on frequently requested data. Benchmarking — 75% indexing throughput improvement In order to validate this approach we implemented an extensive proof of concept where data was only indexed on a single node and replication was achieved through cloud object stores. We found that we could achieve a 75% indexing throughput improvement by removing the need to dedicate hardware to indexing replication. Additionally, the CPU cost associated with simply pulling data from the object store was much lower than indexing the data and writing it locally, as is necessary for the hot tier today. This means that search nodes will be able to fully dedicate their CPU to search. These performance tests were performed on a two node cluster against all three major public cloud providers (AWS, GCP, and Azure). We intend to continue to build out larger benchmarks as we pursue a production stateless implementation. Indexing Throughput CPU Usage Stateless for us, savings for you The stateless architecture on Elastic Cloud will allow you to reduce indexing overhead, independently scale ingest and search, simplify data tier management, and accelerate operations, such as scaling or upgrading. This is the first milestone towards a substantial modernization of the Elastic Cloud platform. Become part of our Elasticsearch stateless vision Interested in trying out this solution before everyone else? You can reach out to us on discuss or on our community slack channel . We would love your feedback to help shape the direction of our new architecture. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Where we started Where we are going — Adopting a stateless architecture Benchmarking — 75% indexing throughput improvement Stateless for us, savings for you Become part of our Elasticsearch stateless vision Share Ready to build state of the art search experiences? 
Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Stateless — your new state of find with Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/stateless-your-new-state-of-find-with-elasticsearch","meta_description":"Learn about Elasticsearch stateless and explore the stateless architecture, which brings performance improvements and reduces costs."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Better Binary Quantization (BBQ) vs. Product Quantization Why we chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch. Vector Database Lucene ML Research BT By: Benjamin Trent On November 18, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We have been progressively making vector search with Elasticsearch and Lucene faster and more affordable. Our main focuses have been not only improving the search speeds through SIMD, but also by reducting the cost through scalar quantization. First by 4x and then by 8x. However, this is still not enough. Through techniques like Product Quantization (referred to as PQ), 32x reductions can be achieved without significant costs in recall. We need to achieve higher levels of quantization to provide adequate tradeoffs for speed and cost. One way to achieve this is by focusing on PQ. Another is simply improving on binary quantization. Spoilers: BBQ is 10-50x faster at quantizing vectors than PQ BBQ is 2-4x faster at querying than PQ BBQ achieves the same or better recall than PQ So, what exactly did we test and how did it turn out? What exactly are we going to test? Both PQ and Better Binary Quantization have various pros vs. cons on paper. But we needed a static set of criteria from which to test both. Having an independent \"pros & cons\" list is too qualitative a measurement. Of course things have different benefits, but we want a quantitative set of criteria to aid our decision making. This is following a pattern similar to the decision making matrix explained by Rich Hickey . Our criteria were: Search speed Indexing speed flat Indexing speed with HNSW Merge speed Memory reduction possible Is the algorithm well known and battle tested in production environments? Is coarse grained clustering absolutely necessary? Or, how does this algorithm fair with just one centroid Brute-force oversampling required to achieve 95% recall HNSW indexing still works and can acheive +90% recall with similar reranking to brute-force Obviously, almost all the criteria were measurable, we did have a single qualitative criteria that we thought important to include. 
For future supportability, being a well known algorithm is important and if all other measures were tied, this could be the tipping point in the decision. How did we test it? Lucene and Elasticsearch are both written in Java, consequently we wrote two proof of concepts in Java directly. This way we get an apples-to-apples comparison on performance. Additionally, when doing Product Quantization, we only tested up to 32x reduction in space. While PQ does support further reduction in space by reducing the number of code books, we found that for many models recall quickly became unacceptable. Thus requiring much higher levels of oversampling. Additionally, we did not use Optimized PQ due to the compute constraints required for such a technique. We tested over different datasets and similarity metrics. In particular: e5Small , which only has 384 dimensions and whose vector space is fairly narrow compared to other models. You can see how poorly e5small with naive binary quantization performs in our bit vectors blog . Consequently, we wanted to ensure an evolution of binary quantization could handle such a model. Cohere's v3 model , which has 1024 dimensions and loves being quantized. If a quantization method doesn't work with this one, it probably won't work with any model. Cohere's v2 model , which has 768 dimensions and its impressive performance relies on the non-euclidean vector space of max-inner product. We wanted to ensure that it could handle non-euclidean spaces just as well as Product Quantization. We did testing locally on ARM based macbooks & remotely on larger x86 machines to make sure any performance differences we discovered were repeatable no matter the CPU architecture. Well, what were the results? e5small quora This was a smaller dataset, 522k vectors built using e5small. Its few dimensions and narrow embedding space make it prohibitive to use with naive binary quantization. Since BBQ is an evolution of binary quantization, verifying that it worked with such an adverse model in comparison with PQ was important. Testing on an M1 Max ARM laptop: Algorithm quantization build time (ms) brute-force latency (ms) brute-force recall @ 10:50 hnsw build time (ms) hnsw recall @ 10:100 hnsw latency (ms) BBQ 1041 11 99% 104817 96% 0.25 Product Quantization 59397 20 99% 239660 96% 0.45 CohereV3 This model excels at quantization. We wanted to do a larger number of vectors (30M) in a single coarse grained centroid to ensure our smaller scale results actually translate to higher number of vectors. This testing was on a larger x86 machine in google cloud: Algorithm quantization build time (ms) brute-force latency (ms) brute-force recall @ 10:50 hnsw build time (ms) hnsw recall @ 10:100 hnsw latency (ms) BBQ 998363 1776 98% 40043229 90% 0.6 Product Quantization 13116553 5790 98% N/A N/A N/A When it comes to index and search speed at similar recall, BBQ is a clear winner. Inner-product search and BBQ We have noticed in other experiments that non-euclidean search can be tricky to get correct when quantizing. Additionally, naive binary quantization doesn't care about vector magnitude, vital for inner-product. Well, footnote in hand, we spent a couple of days on the algebra as we needed to adjust the corrective measures applied at the end of the query estimation. Success! 
Algorithm recall 10:10 recall 10:20 recall 10:30 recall 10:40 recall 10:50 recall 10:100 BBQ 71% 87% 93% 95% 96% 99% Product Quantization 65% 84% 90% 93% 95% 98% That wraps it up The complete decision matrix for BBQ vs Product Quantization. We are pretty excited about Better Binary Quantization (BBQ). We have been hard at work kicking the tires and are continually surprised at the quality of results we get with just a single bit of information retained per vector dimension. Look for it coming in an Elasticsearch release near you! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to What exactly are we going to test? How did we test it? Well, what were the results? e5small quora CohereV3 Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Better Binary Quantization (BBQ) vs. Product Quantization - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/bit-vectors-elasticsearch-bbq-vs-pq","meta_description":"Explore why Elastic chose to spend time working on Better Binary Quantization (BBQ) instead of product quantization in Lucene and Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open Inference API support for AlibabaCloud AI Search Discover how to use Elasticsearch vector database with AlibabaCloud AI Search, which offers inference, reranking, and embedding capabilities. 
Integrations Vector Database How To DK W By: Dave Kyle and Weizijun On September 18, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Our latest addition to the Elasticsearch Open Inference API is the integration of AlibabaCloud AI Search. This work enables Elastic users to connect directly with the AlibabaCloud AI platform. Developers building RAG applications using the Elasticsearch vector database can store and use dense and sparse embeddings generated from models hosted on AlibabaCloud AI Search platform with semantic_text. In addition, Elastic users now have integrated access to reranking models for enhanced semantic reranking and the Qwen LLM family. In this blog, we explore how to integrate AlibabaCloud's AI services with Elasticsearch. You'll learn how to set up and use Alibaba's completion, rerank, sparse embedding, and text embedding services within Elasticsearch. The broad set of supported models integrated into inference task types will enhance the relevance of many use cases including RAG. We’re grateful to the Alibaba team for contributing support for these task types to Elasticsearch open inference API! Let’s walk through examples of how to configure and use these services within an Elasticsearch environment. Note Alibaba uses the term service_id instead of model_id . Using a base model in AlibabaCloud AI Search platform This walkthrough assumes you already have an AlibabaCloud Account with access to AlibabaCloud AI Search platform. Next, you’ll need to create a workspace and API key for Inference creation. Creating an inference API endpoint in Elasticsearch In Elasticsearch, create your endpoint by providing the service as “alibabacloud-ai-search”, and the service settings including your workspace, the host, the service id and your api keys to access AlibabaCloud AI Search platform. In our example, we're creating a text embedding endpoint using \"ops-text-embedding-001\" as the service id. You will receive a response from Elasticsearch with the endpoint that was created successfully: Note that there are no additional settings for model creation. Elasticsearch will automatically connect to the AlibabaCloud AI Search platform to test your credentials and the service id, and fill in the number of dimensions and similarity measures for you. Next, let’s test our endpoint to ensure everything is set up correctly. To do this, we’ll call the perform inference API: The API call will return the generated embeddings for the provided input, which will look something like this: You are now ready to start exploring. After you have tried these examples, have a look at some new exciting innovations in Elasticsearch for semantic search use cases: The new semantic_text field simplifies storage and chunking of embeddings - just pick your model and Elastic does the rest! Introduced in 8.14, retrievers allow you to setup multi-stage retrieval pipelines But first, let’s dive into our examples! I. Completion To start, Alibaba Cloud provides several models for chat completion, with service IDs listed in their API documentation . 
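Before walking through each task type step by step, here is a rough sketch of the text embedding endpoint creation and inference test described above. The Elasticsearch URL, API key, host, and workspace values are placeholders, and authentication is omitted for brevity.

```python
# Sketch of creating an AlibabaCloud AI Search text embedding endpoint and
# testing it with the perform inference API. All credentials, the host, and
# the Elasticsearch URL are placeholders.
import requests

ES_URL = "http://localhost:9200"                   # placeholder

requests.put(
    f"{ES_URL}/_inference/text_embedding/alibabacloud_embeddings",
    json={
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": "<alibabacloud-api-key>",
            "service_id": "ops-text-embedding-001",
            "host": "<your-workspace-host>",
            "workspace": "default",
        },
    },
)

# Perform inference to check that the endpoint returns embeddings.
resp = requests.post(
    f"{ES_URL}/_inference/text_embedding/alibabacloud_embeddings",
    json={"input": "Hello from Elasticsearch and AlibabaCloud AI Search"},
)
print(resp.json())
```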
Step 1: Configure the Completion Service First, set up the inference service for text completion: Response Step 2: Issue a Completion Request Using the configured endpoint, send a POST request to generate a completion: Returns Uniquely, for this Elastic Inference API integration with Alibaba, chat history can be included in the inputs, in this example, we’ve included the previous response and added: “What fun things are there?” The response clearly includes the history In future updates, we plan to allow users to explicitly include chat history, improving the ease of usage. II. Rerank Moving on to our next task type, rerank . Reranking helps re-order search results for improved relevance, using Alibaba's powerful models. If you want to read more about this concept, have a look at this blog on Elastic Search Labs . Step 1: Configure the Rerank Service Configure the reranking inference service: Step 2: Issue a Rerank Request Send a POST request to rerank your search query results: The rerank interface does not require a lot of configuration (task_settings), it returns the relevance scores ordered by the most relevant first and the index of the document in the input array. III. Sparse Embedding Alibaba provides a model specifically for sparse embeddings, we will use ops-text-sparse-embedding-001 for our example. Step 1: Configure the Sparse Embedding Service Step 2: Issue a Sparse Embedding query Sparse has task_settings for: input_type - either ingest or search return_token - if true include the token text in the response, else it is a number With return_token==false IV. Text Embedding Alibaba also offers text embedding models for different tasks. Step 1: Configure the Text Embedding Service Embeddings has one task_setting: input_type - either ingest or search Step 2: Issue a Text Embedding Request Send a POST request to generate a text embedding: AI search with Elastic and AlibabaCloud Whether you're using Elasticsearch for implementing hybrid search, semantic reranking, or enhancing RAG use cases with summarization, the connection to AlibabaCloud's AI Services opens up a new world of possibilities for Elasticsearch developers. Thanks again, Alibaba team, for the contribution! To dive deep, try this Jupyter notebook with an end-to-end example of using Inference API with the Alibaba Cloud AI Search. Read Alibaba Cloud's announcement about AI-powered search innovations with Elasticsearch. Users can start using this with Elasticsearch Serverless environments today and in an upcoming version of Elasticsearch. Happy searching! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
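Returning to the rerank section above (its request body was also not captured in this excerpt), the following is a hedged Python sketch of what Step 2 of the rerank walkthrough might look like, assuming an inference endpoint named alibabacloud_rerank was already created with the rerank task type; the query and candidate documents are made up.

import requests

ES_URL = 'https://localhost:9200'    # hypothetical cluster URL
AUTH = ('elastic', 'changeme')

# Rerank a small set of candidate documents against a query.
resp = requests.post(
    f'{ES_URL}/_inference/rerank/alibabacloud_rerank',
    auth=AUTH,
    json={
        'query': 'What is the capital of the USA?',
        'input': [
            'Carson City is the capital city of the American state of Nevada.',
            'Washington, D.C. is the capital of the United States.',
            'Capital punishment has existed in the United States since before it was a country.',
        ],
    },
)
# Relevance scores come back ordered by most relevant first, each with the
# index of the corresponding document in the input array.
print(resp.json())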
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Using a base model in AlibabaCloud AI Search platform Creating an inference API endpoint in Elasticsearch I. Completion Step 1: Configure the Completion Service Step 2: Issue a Completion Request Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open Inference API support for AlibabaCloud AI Search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-alibaba-cloud-inference-api","meta_description":"Integrate AlibabaCloud's AI services with Elasticsearch. Learn how to use Elasticsearch vector database with AlibabaCloud AI Search, which offers inference, reranking, and embedding capabilities."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Indexing OneLake data into Elasticsearch Learn how to connect to OneLake and index documents into Elasticsearch. Then, take the configuration one step further by developing your own OneLake connector. Part1 Integrations Ingestion +1 January 23, 2025 Indexing OneLake data into Elasticsearch - Part 1 Learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. GL By: Gustavo Llermaly Part2 Integrations Ingestion +1 January 24, 2025 Indexing OneLake data into Elasticsearch - Part II Second part of a two-part article to index and search OneLake data into Elastic using a Custom connector. GL JR By: Gustavo Llermaly and Jeffrey Rengifo Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Indexing OneLake data into Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/indexing-onelake-data-into-elasticsearch","meta_description":"Learn how to connect to OneLake and index documents into Elasticsearch. Then, take the configuration one step further by developing your own OneLake connector."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Testing your Java code with mocks and real Elasticsearch Learn how to write your automated tests for Elasticsearch, using mocks and Testcontainers Java How To PP By: Piotr Przybyl On October 3, 2024 Part of Series Integration tests using Elasticsearch Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this post, we will introduce and explain two ways of testing software using Elasticsearch as an external system dependency. We'll cover tests using mocks as well as integration tests, show some practical differences between them, and give some hints on where to go for each style. Good tests for system confidence A good test is a test that increases the confidence of every person involved in the process of creating and maintaining an IT system. Tests aren't meant to be cool, fast, or to artificially increase code coverage. Tests play a vital role in ensuring that: What we want to deliver is going to work in production. The system satisfies the requirements and the contracts. There won't be regressions in the future. Developers (and other involved team members) are confident that what they have created will work. Of course, this doesn't mean that tests can't be cool, fast, or increase code coverage. The faster we can run our test suite, the better. It's just that in the pursuit of reducing the overall duration of the testing suite, we should not sacrifice the reliability, maintainability and confidence the automated tests give us. Good automated tests make the various team members more confident: Developers: they get to confirm that what they're doing works (even before the code they work on leaves their machine). Quality assurance team: they have less to test manually. System operators and SREs: are more relaxed, because the systems are easier to deploy and maintain. Last but not least: the architecture of a system. We love when systems are organized, easy to maintain, and the architecture is clean and serves its purpose. However, sometimes we might see an architecture which sacrifices too much for the excuse known as \"it's more testable this way\". There's nothing wrong with being very testable – only when the system is written primarily to be testable instead of serving the needs justifying its existence, we see a situation when the tail wags the dog. Two kinds of tests: Mocks & dependencies There are many ways the tests can be seen, and thus classified. In this post I'll focus on only one aspect of dividing the tests: using mocks (or stubs, or fakes, or ...) vs. using real dependencies. In our case the dependency is Elasticsearch. Tests using mocks are very fast because they don't need to start any external dependencies and everything happens only in memory. 
Mocking in automated testing is when fake objects are used instead of real ones to test parts of a program without using the actual dependencies. This is the reason they're needed and why they shine in any fast-detection-net tests, e.g. validation of input. There's no need to start a database and make a call to it only to verify that negative numbers in a request aren't allowed, for example. However, introducing mocks has several implications: Not everything and every time can be mocked easily, hence mocks have impact on the architecture of the system (which sometimes is great, sometimes not so much). Tests running on mocks might be fast, but developing such tests can take quite some time because the mocks deeply reflecting the systems they mimic usually aren't given for free. Someone who knows how the system works needs to write the mocks the proper way, and this knowledge can come from practical experience, studying documentation, and so on. Mocks need to be maintained. When your system depends on an external dependency, and you need to upgrade this dependency, someone has to ensure that the mocks mimicking the dependency also get updated with all the changes: breaking, documented and undocumented (which can also have an impact on our system). This becomes especially painful when you want to upgrade a dependency but your (using only mocks) test suite can't give you any confidence that all the tested cases are guaranteed to work. It takes discipline to ensure that the effort goes towards developing and testing the system, not the mocks. For these reasons many people advocate going exactly the opposite direction: never use mocks (or stubs, etc.), but rely solely on real dependencies. This approach works very nicely in demos or when the system is tiny and has only a few test cases generating huge coverage. Such tests can be integration tests (roughly speaking: checking a part of a system against some real dependencies) or end-to-end tests (using all real dependencies at the same time and checking the behavior of the system at all ends, while playing user workflows which define the system as usable and successful). A clear benefit of using this approach is that we also (often unintentionally) verify our assumptions about the dependencies, and how we integrate them against the system we're working on. However, when tests are using only real dependencies, we need to consider the following aspects: Some test scenarios don't need the actual dependency (e.g. to verify the static invariants of a request). Such tests usually aren't run in whole suites at developers' machines, because waiting for feedback would take too much time. They require more resources at CI machines, and it might take more time to tune things to not waste time & resources. It might not be trivial to initialize dependencies with test data. Tests with real dependencies are great for cordoning code before major refactoring, migration or dependency upgrade. They are more likely to be opaque tests, i.e. not being detailed about the internals of the system under test, but taking care of their results. The sweet spot: use both tests Instead of testing your system with just one kind of test, you can rely on both kinds where it makes sense and try to improve your usage of both of them. Run mock-based tests first because they are much faster, and only when all are successful, run slower dependency tests only after. 
Choose mocks for scenarios where external dependencies aren't really needed: when mocking would take too much time the code should be massively altered just for that; rely on external dependencies. There is nothing wrong in testing a piece of code using both approaches, as long as it makes sense. Example of SystemUnderTest For the next sections we're going to use an example which can be found here . It's a tiny demo application written in Java 21, using Maven as the build tool, relying on Elasticsearch client and using Elasticsearch's latest addition, using ES|QL (Elastic's new procedural query language). If Java is not your programming language, you should still be fine to understand the concepts we're going to discuss below and translate them to your stack. It's just using a real code example makes certain things easier to explain. The BookSearcher helps us handle search and analyze data, being books in our case (as demonstrated in one of the previous posts ). It requires Elasticsearch exactly in version 8.15.x as its only dependency (see isCompatibleWithBackend() ), e.g. because we're not sure if our code is forward-compatible, and we're sure it's not backwards compatible. Before upgrading Elasticsearch in production to a newer version, we shall first bump it in the tests to ensure that the behaviour of the System Under Test remains the same. We can use it to search for the number of books published in a given year (see numberOfBooksPublishedInYear ). We might also use it when we need to analyze our data set and find out the 20 most published authors between two given years (see mostPublishedAuthorsInYears ). Test with mocks to start with For creating the mocks used in our tests we're going to use Mockito , a very popular mocking library in Java ecosystem. We might begin with the following, to have mocks reset before each test: As we said before, not everything can be easily tested using mocks. But some things we can (and probably even should). Let's try verifying that the only supported version of Elasticsearch is 8.15.x for now (in the future we might extend the range once we confirm our system is compatible with future versions): We can verify similarly (simply by returning a different minor version), that our BookSearcher is not going to work with 8.16.x yet, because we're not sure if it's going to be compatible with it: Now let's see how we can achieve something similar when testing against a real Elasticsearch. For this we're going to use Testcontainers' Elasticsearch Module , which has only one requirement: it needs access to Docker, because it runs Docker containers for you. From a certain angle Testcontainers is simply a way of operating Docker containers, but instead of doing that in your Docker Desktop (or similar), in your CLI, or scripts, you can express your needs in the programming language you know. This makes fetching images, starting containers, garbage-collecting them after tests, copying files back and forth, executing commands, examining logs, etc. possible directly from your test code. The stub might look like this: In this example we rely on Testcontainers' JUnit integration with @Testcontainers and @Container , meaning we don't need to worry about starting Elasticsearch before our tests and stopping it after. The only thing we need to do is to create the client before each test and close it after each test (to avoid resource leaks, which could impact bigger test suites). 
Annotating a non-static field with @Container means, that a new container will be started for each test, hence we don't have to worry about stale data or resetting the container's state. However, with many tests, this approach might not perform well, so we're going to compare it with alternatives in one of the next posts. Note: By relying on docker.elastic.co (Elastic's official Docker image repository), you avoid exhausting your limits on Docker hub. It is also recommended to use the same version of your dependency in your tests and production environment, to ensure maximum compatibility. We also recommend being precise with selecting the version, for this reason, there is no latest tag for Elasticsearch images. Connecting to Elasticsearch in tests Elasticsearch Java client is capable of connecting to Elasticsearch running in a test container even with security and SSL/TLS enabled (which are default for versions 8.x, that's why we didn't have to specify anything related to security in the container declaration.) Assuming the Elasticsearch you're using in production has also TLS and some security enabled, it is recommended to go for the integration test setup as close to production scenario as possible, and therefore not disabling them in tests. How to obtain data necessary for connection, assuming the container is assigned to field or variable elasticsearch : elasticsearch.getHost() will give you the host on which the container is running (which most of the time will be probably \"localhost\" , but please don't hardcode this as sometimes, depending on your setup, it might be another name, therefore the host should always be obtained dynamically). elasticsearch.getMappedPort(9200) will give the host port you have to use to connect to Elasticsearch running inside the container (because every time you start the container, the outside port is different, so this has to be a dynamic call as well). Unless they were overwritten, the default username and password are \"elastic\" and \"changeme\" respectively. If there was no SSL/TLS certificate specified during the container setup, and the secured connectivity is not disabled (which is the default behaviour from versions 8.x), a self-signed certificate is generated. To trust it (e.g. like cURL can do ) the certificate can be obtained using elasticsearch.caCertAsBytes() (which returns Optional ), or another convenient way is to get SSLContext using createSslContextFromCa() . The overall result might look like this: Another example of creating an instance of ElasticsearchClient can be found in the demo project . Note : For creating client in production environments please refer to the documentation . First integration test Our very first test, verifying that we can create BookSearcher using Elasticsearch version 8.15.x, might look like this: As you can see, we don't need to set up anything else. We don't need to mock the version returned by Elasticsearch, the only thing we need to do is to provide BookSearcher with a client connected to a real instance of Elasticsearch, which has been started for us by Testcontainers. Integration tests care less about the internals Let's do a little experiment: let's assume that we have to stop extracting data from the result set using column indices, but have to rely on column names. So in the method isCompatibleWithBackend instead of we are going to have: When we re-run both tests we'll notice, that the integration test with real Elasticsearch still passes without any issues. 
However, the tests using mocks stopped working, because we mocked calls like rs.getInt(int) , not rs.getInt(String) . To have them passing, we now have to either mock them instead, or mock them both, depending on other use cases we have in our test suite. Integration tests can be a cannon to kill a fly Integration tests are capable of verifying the behaviour of the system, even if external dependencies aren't needed. However, using them this way is usually a waste of execution time and resources. Let's look at the method mostPublishedAuthorsInYears(int minYear, int maxYear) . The first two lines are as follows: The first statement is checking a condition, which doesn't depend on Elasticsearch (or any other external dependency) in any way. Therefore, we don't need to start any containers to merely verify, that if the minYear is greater than maxYear , an exception is thrown. A simple mocking test, which is also fast and not resource-heavy is more than enough to ensure that. After setting up the mocks, we can simply go for: Starting a dependency, instead of mocking, would be wasteful in this test case because there's no chance of making a meaningful call for this dependency. However, to verify the behaviour starting with String query = ... , that the query is written correctly, gives results as expected: the client library is capable of sending proper requests and responses, there are no syntax changes and so it's way easier to use an integration test, e.g.: This way, we can rest assured that when we feed our data to Elasticsearch (in this or any future version we choose to migrate to), our query is going to give us exactly what we expected: the data format didn't change, the query is still valid, and all the middleware (clients, drivers, security, etc.) will to continue to work. We don't have to worry about keeping the mocks up to date, the only change needed to ensure compatibility with e.g. 8.15 would be changing this: The same happens if you decide to e.g. use good old QueryDSL instead of ES|QL: the results you receive from the query (regardless of the language) should still be the same. Use both approaches when needed The case of the method mostPublishedAuthorsInYears illustrates that a single method can be tested using both methods. And perhaps even should be. Using only mocks means we have to maintain the mock and have zero confidence when upgrading our system. Using only integration tests would mean that we're wasting quite a lot of resources, without needing them at all. Let's recap Using both mocking and integration tests with Elasticsearch is possible. Use mocking tests as fast-detection-net and only if they pass successfully, start tests with dependencies (e.g. using ./mvnw test '-Dtest=!TestInt*' && ./mvnw test '-Dtest=TestInt*' or Failsafe and Surefire plugins). Use mocks when testing the behaviour of your system (\"lines of code\") where integration with external dependencies doesn't really matter (or could even be skipped). Use integration tests to verify your assumptions about and integration with external systems. Don't be afraid to test using both approaches – if it makes sense – according to the points above. One could make an observation, that being so strict about the version (in our case 8.15.x ) is too much. Using just the version tag alone could be, but please be aware that in this post it serves as the representation of all other features that might change between the versions. 
In the next installment in the series , we'll look at ways of initialising Elasticsearch running in a test container, with test data sets. Let us know if you built anything based on this blog or if you have questions on our Discuss forums and the community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Good tests for system confidence Two kinds of tests: Mocks & dependencies The sweet spot: use both tests Example of SystemUnderTest Test with mocks to start with Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Testing your Java code with mocks and real Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/tests-with-mocks-and-real-elasticsearch","meta_description":"Explore how to test java using Elasticsearch as an external system dependency. This blog covers integration and mocks testing as well as testcontainers."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Navigating an Elastic vector database An overview of operating a modern Elastic vector database with practical code samples. Vector Database How To JC By: Justin Castilla On September 25, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector databases are quickly becoming the de facto data store for semantic search , a type of search that considers context and meaning of content over traditional keyword search. 
Elastic has consistently provided modern tooling to perform semantic search, and it is important to identify and understand the larger mechanisms required to query a vector database. This article will cover the necessary components to operate a vector database within Elasticsearch. We will explore the underlying technologies to create an efficient system that strikes the best balance between performance and speed. Topics covered include: Current Elastic Full-Text Search (BM25 & TF/IDF); Vectors, Embedding, and necessary Considerations; Choosing an Embedding Model; Indexing a Vector; Vector Query Algorithms. Running the code to create your own vector database To create your own vector database as shown within this article, you’ll need: an Elasticsearch instance optimized for machine learning (8.13 or later), Docker Desktop or a similar Docker container manager, Python 3.8 (or later), and the Elasticsearch Python Client. An associated repository is available here to illustrate the various moving parts necessary for querying a vector database. We will observe code snippets throughout the article from this source. The example repository will create an index of approximately 10,000 objects supplied in a JSON file representing science fiction books reviewed on goodreads.com . Each object will have a vector embedding of text that describes the book. The goal of this article and repository is to demonstrate searching for books based on vector similarity between a provided query string and the embedded book description vectors. Here is a sample book object that we will be using within our code samples. Current Elastic full-text search ( BM25 & TF/IDF ) To understand the scope and benefit of vector databases over traditional full-text search databases, it is worth taking a moment to review the current underlying technology that powers Elasticsearch, BM25, and its predecessor, TF/IDF. TF/IDF TF/IDF is the statistical measure of the frequency and importance of a query term based on how often it appears in an individual document and its rate of occurrence within an entire index of documents. Term Frequency (TF): This is a measure of how often a term occurs within a document. The higher the occurrence of the term within a document, the higher the likelihood that the document will be relevant to the original query. This is measured as a raw count, or a normalized count based on the occurrence of the term divided by the total number of terms in the document. Inverse Document Frequency (IDF): This is a measure of how important a query term is based on the overall frequency of use over all documents in an index. A term that occurs across more documents is considered less important and informative and is thus given less weight in the scoring. This is calculated as the logarithm of the total number of documents divided by the number of documents that contain the query term: IDF = log(N / nₜ), where N = total number of documents and nₜ = number of documents containing the term. Using our example of Science Fiction books, let's break down a few sentences with a focus on the words \"space\" and \"whales\". 
\"The interstellar whales graceully swam through the cosmos onto their next adventure.\" \"The space pirates encountered a human space station on their trip to Venus, ripe for plunder\" \"In space no one can year you scream.\" \" Space is the place to be.\" \"Purrgil were a semi-sentient species of massive whales that lived in deep space , traveling from star system to star system.\" Space The term \"space\" appears in 4 documents (Sentences 2, 3, 4, and 5). Total number of documents (N) = 5. IDF formula for “space”: I D F = log ⁡ ( 5 4 ) ≈ 0.10 IDF=\\log \\left( \\frac{5}{4} \\right)≈0.10\\ I D F = lo g ( 4 5 ​ ) ≈ 0.10 \"space\" is considered less important because it appears frequently across 4 of the 5 sentences. Whales The term \"whales\" appears in 2 documents (Sentences 1 and 5). Total number of documents (N) = 5. IDF formula for “whales”: I D F = log ⁡ ( 5 2 ) ≈ 0.40 IDF=\\log \\left( \\frac{5}{2} \\right)≈0.40\\ I D F = lo g ( 2 5 ​ ) ≈ 0.40 \"Whales\" is considered relatively important because it appears in only 2 of 5 sentences. \"Whales\" is assigned a higher IDF value because it appears in fewer documents, making it more distinctive and relevant to the context of those documents. Conversely, \"space\" is more common across the documents, leading to a lower IDF score, thus indicating that it is less useful for distinguishing between the documents. In the context of our codebase's 10,000 book ojbjects, the term \"space\" occurs 1,319 times, while the term \"whale\" appears 6 times total. It is understandable that a search for \"space whales\" would first prioritize the occurence of \"whales\" as more important than \"space.\" While still considered a powerful search algorithm in its own right, TF/IDF fails to prevent search bias on longer documents that may have a proportionately larger amount of term occurrences compared to a smaller document. BM25 BM25 (Best Match 25) is an enhanced ranking function using IDF components to create a score outlining the relevance of a document over others within an index. Term Frequency Saturation: It has been noted that at a certain point, having a high occurrence of a query term in a document does not significantly increase its relevance compared to other documents with a similarly high count. BM25 introduces a saturation parameter for term frequency, which reduces the impact of the term frequency logarithmically as the count increases. This prevents very large documents with high term occurrences from disproportionately affecting the relevance score, ensuring that all scores level off after a certain point. Enhanced Inverse Document Frequency (IDF): In cases where a query term appears in all documents or has a very low occurrence, the IDF score might become zero. To avoid this, BM25 adds a value of 0.5 to the calculation, ensuring that the IDF scores do not reach zero or become too low. Average Document Length : This is a consideration of the actual length of the document. It's understood that longer documents may naturally have more occurrences of a term compared to shorter documents, which doesn't necessarily mean they are more relevant. This adjustment compensates for document length to avoid a bias towards longer documents simply due to their higher term frequencies. BM25 is an excellent tool for efficient search and retrieval of exact-matches with text. Since 2016 it has been the default search algorithm for Lucene (versions 6.0 and higher), the underlying low-level search engine used by Elasticsearch. 
It should be noted that BM25 cannot provide semantic search the way vectors can; semantic search brings an understanding of the context and meaning of the query rather than relying on pure keyword matching. It should also be noted that vocabulary mismatch may cause issues with receiving proper results. As an example, a user-submitted query using the term \"cosmos\" may not retrieve the intended results if the words don't exactly match, such as documents containing the term \"space.\" It is understood that \"cosmos\" is another term for \"space\", but this isn't explicitly known or checked for with the default BM25 algorithm. Knowing when to choose traditional keyword search over semantic search is crucial to ensure efficient use of computational resources. Running the code: A full-text search (BM25) Further Reading: The BM25 Algorithm and its variables BM25: A Guide to Modern Information Retrieval BM25 The Next Generation of Lucene Relevance Vectors & vector databases Vector databases essentially store and index mathematical representations (vectors) of documents for similarity search. Vectorization of data allows for a normalization of complex and nuanced documents (text, images, audio, video, etc.) into a format that computers may compare with other vectors with consistent similarity results. It is important to understand the many mechanisms in place to provide a production-ready solution. What is a vector? A vector is a representation of data projected into the mathematical realm as an array of numbers. With numbers instead of words, comparisons are very efficient for computers and thus offer a considerable performance boost. Nearly every conceivable data type (text, images, audio, video, etc.) used in computing may be converted into vector representations. Images are broken down to the pixel, and visual patterns such as textures, gradients, corners, and transparencies are captured as numeric representations. Words, words in specific phrases, and entire sentences are also analyzed, assigned various sentiment, contextual, and synonym values, and converted to arrays of floating points. It is within these multidimensional matrices that systems are able to discern numeric similarities in certain portions of the vector to find similarly colored inventory in commerce sites, answers to coding questions on Elastic.co, or recognize the voice of a Nobel Prize winner. Each data type benefits from a dedicated Vector Embedding Model, which can best identify and store the various characteristics of that particular type. A text embedding model excels at understanding common phrases and nuanced alliteration, while completely failing to recognize the emotions displayed on a posing figure in an image. Above we can see the embedding model receiving three different strings as input and producing three distinct vectors (arrays of floats) as output. Embedding models When converting data, in this case text, to a vector, an embedding model is used. An embedding model is a pre-trained machine-learning instance that converts text (words, phrases, and sentences) into numerical representations. These representations become multidimensional arrays of floats, with each dimension representing a different characteristic of the original text, such as sentiment, context, and syntactics. These various representations allow for comparison with other vectors to find similar documents and text fragments. 
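The snippet for the Running the code: A full-text search (BM25) section above was not captured in this excerpt. A minimal sketch using the Elasticsearch Python client might look like the following; the cluster URL, credentials, and the title field are assumptions, while the books index and book_description field come from the article.

from elasticsearch import Elasticsearch

# Hypothetical connection details; use your own cluster URL and credentials.
client = Elasticsearch('https://localhost:9200', basic_auth=('elastic', 'changeme'))

# A classic BM25-scored full-text query against the book descriptions.
response = client.search(
    index='books',
    query={'match': {'book_description': 'space whales'}},
    size=5,
)
for hit in response['hits']['hits']:
    # 'title' is a hypothetical field name used for display only.
    print(hit['_score'], hit['_source'].get('title'))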
Different embedding models have been developed that provide various benefits; some are extremely hardware efficient and can be run with less computational power. Some have a greater “understanding” of the context and content of the index they store into and can answer questions, perform text summarization, and lead a threaded conversation. Some focus on striking an acceptable balance between performance, speed, and efficiency. Choosing an embedding model We will briefly describe three models for text embeddings and their various properties. Word2Vec : Efficient and simple to train, this model provides good quality embeddings for words based on their context as well as semantic meanings. Word2Vec is best used in applications with semantic similarity needs, sentiment analysis, or when computational resources are limited. GloVe : Similar to Word2Vec in many respects, GloVe builds a global awareness across an entire index. By analyzing word co-occurrences throughout the entire text, GloVe captures both the frequency and context of words, resulting in embeddings that reflect the overall relationships and meanings of words in the totality of the stored vectors. BERT : Unlike Word2Vec and GloVe, BERT (Bidirectional Encoder Representations from Transformers) looks at the text before and after each word to establish a local context and semantic meaning. By pre-training this model on a large body of text, this model excels at facilitating question and answer tasks as well as sentiment analysis. Running the code: Creating an ingest pipeline to embed vectors For the coding example, a smaller, simpler version of BERT was chosen called sentence-transformers__msmarco-minilm-l-12-v3 . It is considered a MiniLM, which is more efficient than normal sized models yet still retains the performance needed for vector similarity. This is a good model choice for a non-production tutorial to get the code running quickly with no fine tuning necessary. More information about the model is available here . Below we are creating an ingest pipeline for our index books . This means that all book objects that are created and stored in the Elasticsearch index will automatically have their book_description field converted to a vector embedding named description_embedding . This reduces the codebase necessary to create new book objects on the client side. If there is a failure, the documents will be stored in the failure-books index and an error message will be included in the documents. This allows you to view any errors that may have caused the failed embedding, and gives you the ability to re-index the failed documents with updated code, ensuring no documents are lost or left behind. Note: Since the workload of embedding the vectors is passed on to Elastic via this Inference Ingest Pipeline, a larger tier of available CPU and RAM in this Elasticsearch cloud instance may be desirable to allow for quick embedding and indexing of new and updated documents. If the workload of embedding is left to a local application codebase, consideration should also be given to the necessary compute hardware during times of high throughput. The provided codebase for this article includes an option to embed the book_description locally to allow for comparison of compute pressure. This snippet creates an ingest pipeline named text-embedding which contains an inference processor. 
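The pipeline snippet itself was not captured in this excerpt; below is a hedged sketch of such a pipeline using the Elasticsearch Python client. The model_id, input field, output field, and failure index follow the description above, while the exact inference processor options (for example input_output) may differ between Elasticsearch versions.

from elasticsearch import Elasticsearch

client = Elasticsearch('https://localhost:9200', basic_auth=('elastic', 'changeme'))

# Inference processor: embed book_description into description_embedding using the MiniLM model.
client.ingest.put_pipeline(
    id='text-embedding',
    processors=[
        {
            'inference': {
                'model_id': 'sentence-transformers__msmarco-minilm-l-12-v3',
                'input_output': [
                    {
                        'input_field': 'book_description',
                        'output_field': 'description_embedding',
                    }
                ],
            }
        }
    ],
    on_failure=[
        # Route failed documents to the failure-books index and record the error
        # message on the document (the field name is chosen for this sketch).
        {'set': {'field': '_index', 'value': 'failure-books'}},
        {'set': {'field': 'ingest.failure_reason', 'value': '{{_ingest.on_failure_message}}'}},
    ],
)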
The processor uses the sentence-transformers__msmarco-minilm-l-12-v3 model to copy and convert the book_description text to a vector embedding and stores it under the description_embedding property. Indexing vectors The method in which vectors are indexed has a significant impact on the performance and accuracy of search results. Indexing vectors entails storing them in specialized data structures designed to ensure efficient similarity search, speedy vector distance calculations, and ultimately vector retrievals as results. How you decide to store your vectors should be based on your unique data needs. It should also be noted that Elasticsearch uses the term index as a verb acted upon documents to mean adding a document to an index. Care should be taken so as to not confuse the two. There are two general methods of indexing documents: KNN, and ANN. It is important to make the distinction between the two and the tradeoffs when selecting one or the other. KNN Given k=4, four of the nearest vectors to the query vector (yellow) are selected and returned. KNN (K-Nearest Neighbors) will provide an exact result of the K closest neighbor vectors based on a provided distance metric. As with most things that return exact results, the tradeoff is speed, which must be sacrificed for accuracy. The KNN method collates every distance from a target point to all other existing points in a dimension of a vector. The distances are then sorted and the closest K are returned. In the diagram above, a k of 4 is requested and the four nearest vectors are returned ANN ANN (Approximate Nearest Neighbors) will provide an approximation of the nearest neighbors based on an established index of vectors that have had their dimensions lowered for easier and faster processing. The tradeoff is a sped-up seeking phase where traversal is aided by an index, which could be thought of as a predefined map of all the vectors. ANN is preferred over KNN in semantic search when speed, scalability, and resource efficiency are considered higher priority over exact precision. This makes ANN a practical choice for large-scale and real-time applications where fast, approximate matches are acceptable. Much like a Probablistic Data Structure, a slight hit to accuracy has been accepted for the sake of speed and size. As an example, think of various vector points being stored in shelves in a grocery store and you are given a map when you enter the facility. A premade map would allow you to find where you want to go vs traversing every single aisle from entrance to exit until you finally reach your intended point (KNN). HNSW HNSW (Hierarchical Navigable Small World) is the default ANN method with Elastic Vector Databases. It utilizes graph-based relationships (nodes and vertices) to efficiently traverse an index to find the nearest neighbors. In this case, the nodes are data points and the edges are connections to nearby neighbors. HNSW consists of many layers of graphs that have closer and closer distances to each other. With HNSW, each subsequent traversal to a higher layer exposes closer nodes, or vectors, until the desired vector(s) are found. This is similar to a traditional skip list data structure, where the higher layers have elements that are more far apart from each other, or “coarse” granularity, whereas the lower layers have elements that are closer to each other, or “finer” granularity. 
The traversal from a higher to a lower plane of nodes and vertices means that not all vectors need to be searched and compared, only the nearest neighbors. This ensures that high dimensional vectors can be indexed and searched quickly and efficiently. HNSW is ideally used for real-time semantic search and recommendation systems that have high-dimensional vectors yet can accept approximate results over exact nearest neighbors. HNSW can be considered extraneous with smaller data sets, lower-dimensional vectors, or when memory size constraints are a factor. Due to the complexity of the graph structure, HNSW may not be a good fit for a highly dynamic data store where inserts, updates, and deletions occur at high frequency. This is due to the overhead of maintaining the graph connections throughout the many different layers required. Like all facets of implementing a vector database, a balance must be struck between available resources, latency tolerance, and desired performance. Running the code: Creating an index Elasticsearch provides an indices.create() method documented here which creates an index based on a given index name and a mapping of the expected data types within the documents indexed. This allows for faster, more efficient indexing and retrieval of documents based on numerical ranges, full text, and keyword searches as well as semantic search. Note that the description_embedding property is not included - that will be created automatically by the ingest pipeline defined above when inserting the book objects. Now that the index books has been created, we can populate Elasticsearch with book objects to use for semantic search. Let’s start by storing a single book object in Elasticsearch. Here we are running the bulk insertion method which greatly reduces the time necessary to index our starting library of 10,000+ books. The bulk method is recommended when indexing numerous objects. More information can be found here . Note that in both the single indexing and bulk indexing methods we are including the pipeline=’text-embedding’ argument to have Elasticsearch trigger our inference processor defined above every time a new book object is added. Below is the post-index document of a sample book that has been indexed into the books vector database. There are now two new fields: model_id : the embedding model that was used to create our vectors. description_embedding : The vector embedding created from our book_description field. It has been truncated in this article for space, as there are 384 total float values in the array (the dimensions of our specific model chosen.) Vector search algorithms With a vector database full of embedded vectors, semantic search and vector querying may now take place. Each index method and embedding model excels with the utilization of different algorithms to return optimized results. We will cover the most commonly used methods and discuss their specific applications. 
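Before moving on to the distance metrics, here is a hedged sketch of the index creation and bulk indexing steps described in the Running the code: Creating an index section above, using the Elasticsearch Python client. The non-vector field names and the books.json filename are assumptions; the books index, the text-embedding pipeline name, and the omission of description_embedding follow the article.

from elasticsearch import Elasticsearch, helpers
import json

client = Elasticsearch('https://localhost:9200', basic_auth=('elastic', 'changeme'))

# Explicit mappings for the book fields; these field names are illustrative only.
# description_embedding is deliberately omitted - the text-embedding pipeline adds it.
client.indices.create(
    index='books',
    mappings={
        'properties': {
            'title': {'type': 'text'},
            'author': {'type': 'text'},
            'publication_year': {'type': 'integer'},
            'book_description': {'type': 'text'},
        }
    },
)

# Bulk index the book objects, routing every document through the inference pipeline.
with open('books.json') as f:
    books = json.load(f)

actions = [{'_index': 'books', '_source': book} for book in books]
helpers.bulk(client, actions, pipeline='text-embedding')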
Cosine Similarity : Cosine Similarity, the default algorithm for Elasticsearch's vector search, measures the cosine of the angle between two vectors in a space. This distance metric is ideal for semantic search and recommendation systems. ANN indexes such as HNSW are optimized to be used with cosine similarity specifically. The smaller the angle, the more similar the vectors are, and thus the nearer the neighbor. The larger the angle, the less related they are, with 90°, or perpendicularity, being unrelated altogether. Any value over 90° is considered treading into the “opposite” of what a given vector contains. Above are three pairs of vectors that display different directions. These directions illustrate how the pairs can be similar, dissimilar, or completely opposite from each other. One caveat of Cosine Similarity is the curse of dimensionality. This happens when the distance between a vector pairing begins to approach the average value between other vector pairings, as the space the vectors occupy becomes vaster and vaster. The distances between vectors grow farther and farther apart the more features or data points exist. This occurs in very high dimension vectors - care should be taken to evaluate different distance metrics to meet your needs. Dot Product: Dot Product receives two vectors as input and sums the products of their individual components. Dot Product = A₁⋅B₁ + A₂⋅B₂ + ... + Aₙ⋅Bₙ As an example, if vector A contains the components [1,3,5] and vector B contains the components [4,9,1], the resulting computation would be as follows: A = [1,3,5], B = [4,9,1], (1⋅4) + (3⋅9) + (5⋅1) = 36 A higher sum value represents a closer similarity between the given vectors. If the value is zero or near zero, the vectors are perpendicular to each other, which makes them considered unrelated. A negative value means that the vectors are opposite each other. Euclidean (L2) : Euclidean Distance can be imagined as an n-dimensional extension of the Pythagorean Theorem (a² + b² = c²) , which is used to find the hypotenuse of a right triangle. For each component in a vector A, we determine the distance from its corresponding component in vector B. This can be achieved with the absolute value of one component’s value subtracted from the other. We then square the difference and add it to the next squared component difference, until every distance between every component in the two vectors has been determined, squared, and summed to create one final value. We then find the square root of that value to reach our Euclidean Distance between the two vectors A and B. 
Euclidean Distance = √((A₁ − B₁)² + (A₂ − B₂)² + ... + (Aₙ − Bₙ)²) As an example, we have 2 vectors A = [3, 4, 5] and B = [6, 7, 8]. Our computation would be as follows: √((3 − 6)² + (4 − 7)² + (5 − 8)²) = √(9 + 9 + 9) = √27 ≈ 5.196 Between the two vectors A and B, the distance is approximately 5.196. Euclidean, much like Cosine, suffers from the curse of dimensionality, with most vector distances becoming homogeneously similar at higher dimensionalities. For this reason Euclidean is recommended for lower-dimensionality vectors. Manhattan (L1): Manhattan distance, similar to Euclidean distance, sums the distances between the components of two corresponding vectors. Instead of finding the exact direct distance between two points as a line, Manhattan can be thought of as using a grid system, much like the block layout in the city of Manhattan, New York. As an example, if a person walks 3 blocks north, then 4 blocks east to reach their destination, they will have traveled a total distance of 7 blocks. This can be generalized as: Manhattan Distance = Sum(|A₁ − B₁| + |A₂ − B₂| + ... + |Aₙ − Bₙ|) In our numbered example, we can establish our origin as [0,0] and our destination as [3,4]. Therefore this computation would apply: Sum(|0 − 3| + |0 − 4|) = Sum(3 + 4) = 7 Unlike Euclidean and Cosine, Manhattan scales well to higher-dimensional vectors and is a great candidate for feature-rich vectors. Changing similarity algorithms To set a specific similarity algorithm for a vector field type in Elasticsearch, use the similarity field in the index mappings object. Elasticsearch allows you to define the similarity as dot_product or l2_norm (Euclidean). With no similarity field definition, Elasticsearch defaults to cosine . Here we are choosing l2_norm as our similarity metric for our description_embedding field: Putting it all together: Utilizing a vector database Now that we have a fundamental understanding of how to create a vector, and the methods to compare vector similarity, we need to understand the sequence of events required to successfully utilize our vector database. We shall assume now that we have a database full of vector embeddings representing data. All of the vectors have been created using the same model. We receive raw query data. This query data text is embedded using the same model we used previously. This gives us a resulting query vector that will have the same dimensions and features as the vectors existing in our database. We run a similarity algorithm between our query vector and the index of vectors to find the vectors with the highest degree of similarity based on our chosen distance metric and indexing method. We receive our results based on their similarity score. Each vector returned should also have the original unembedded data as well as any information pertinent to the dataset subject matter. 
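As a quick sanity check (not part of the original article), the worked dot product, Euclidean, and Manhattan examples above can be reproduced with a few lines of Python using numpy:

import numpy as np

# Dot product example from above: A = [1,3,5], B = [4,9,1]
A, B = np.array([1, 3, 5]), np.array([4, 9, 1])
print(np.dot(A, B))                              # 36

# Euclidean (L2) example from above: A = [3,4,5], B = [6,7,8]
A, B = np.array([3, 4, 5]), np.array([6, 7, 8])
print(np.linalg.norm(A - B))                     # ~5.196

# Manhattan (L1) example from above: origin [0,0] to destination [3,4]
origin, destination = np.array([0, 0]), np.array([3, 4])
print(np.sum(np.abs(origin - destination)))      # 7

# Cosine similarity of the L2 example vectors: dot product over the product of magnitudes.
print(np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B)))   # ~0.996, nearly parallel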
In the code sample below, we execute the search command with a knn argument that contains what field to compare ( description_embedding ) and the original query string along with which model to use to embed the query. The search method converts the query to a vector and runs the similarity algorithm. As a response, we receive a payload back from the Elastic cloud containing an array of book objects that have been sorted by similarity score, with 0 being the least relevant, and 1 being a perfect match. Here is a truncated version of the response: Conclusion Operating a vector database within Elasticsearch opens up new possibilities for efficiently managing and querying complex datasets, far beyond what traditional full-text search methods like BM25 or TF/IDF can offer. By selecting and testing vector embedding models and similarity algorithms against your specific use case, you can enable sophisticated semantic search functionality that understands the nuances of your data, be it text, images, or other multimedia. This is critical in applications that require precise and context-aware search results, such as recommendation systems, natural language processing, and image recognition. In the process of building a vector database around the volume of book objects in our repository, hopefully you will see the utility of searching through the individual book descriptions with natural human language. This provides an opportunity to speak to a librarian or book store clerk for our data. By providing contextual understanding to your query input and matching it with the existing documents that have already been processed and vectorized, the power of semantic search providing the ideal use case. RAG (Retrieval Augmented Generation) is the process of using a transformer model, such as ChatGPT, that has been granted access to your curated documents to generate natural language answers to natural language queries. This provides an enhanced user experience and can handle complex queries. Conversely, thought should also be given before and after implementing a vector database as to whether or not semantic search is necessary for your specific use case. Well-crafted queries in a traditional full-text query ecosystem may return the same or better results with lower computational overhead. Care must be given to evaluate the complexity and anticipated scale of your data before opting for a vector database, as smaller or simpler datasets might not remarkably benefit from the addition of vector embeddings. Oftentimes fine-tuning indexing strategies and implementing ranking models within a traditional search framework can provide more efficient performance without the need for machine learning enhancements. As vector databases and the supporting technologies continue to evolve, staying informed about the latest developments, such as generative AI integrations and further tuning techniques like tokenization and quantization, will be crucial. These advancements will not only enhance the performance and scalability of your vector database but also ensure that it remains adaptable to the growing demands of modern applications. With the right tools and knowledge, you can fully harness the power of Elasticsearch's vector capabilities to deliver cutting-edge solutions to your daily tasks. 
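For reference, the knn search call described at the start of the section above was not captured in this excerpt. A hedged sketch using the Elasticsearch Python client might look like the following; the query text and the title field are made up, while the description_embedding field and the model_id follow the article, and options such as k and num_candidates may need tuning for your data.

from elasticsearch import Elasticsearch

client = Elasticsearch('https://localhost:9200', basic_auth=('elastic', 'changeme'))

# Let Elasticsearch embed the query text with the same model used at ingest time,
# then run an approximate kNN search against the stored description embeddings.
response = client.search(
    index='books',
    knn={
        'field': 'description_embedding',
        'k': 5,
        'num_candidates': 50,
        'query_vector_builder': {
            'text_embedding': {
                'model_id': 'sentence-transformers__msmarco-minilm-l-12-v3',
                'model_text': 'lonely robot finds friendship among the stars',
            }
        },
    },
)
for hit in response['hits']['hits']:
    print(round(hit['_score'], 3), hit['_source'].get('title'))

Because the query string is embedded with the same model as the stored descriptions, the returned hits are ordered by vector similarity rather than keyword overlap, which is the behavior the article describes.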
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Running the code to create your own vector database Current Elastic full-text search ( BM25 & TF/IDF ) TF/IDF Space Whales Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Navigating an Elastic vector database - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-vector-database-practical-example","meta_description":"A tutorial on building, managing and operating a Elastic vector database with practical code samples. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search This series explains how to use the Elasticsearch Go client for traditional keyword search, vector search and hybrid search. Part1 Go How To October 31, 2023 Perform text queries with the Elasticsearch Go client Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example. CR LS By: Carly Richmond and Laurent Saint-Félix Part2 Vector Database How To November 1, 2023 Perform vector search in Elasticsearch with the Elasticsearch Go client Learn how to perform vector search in Elasticsearch using the Elasticsearch Go client through a practical example. CR LS By: Carly Richmond and Laurent Saint-Félix Part3 Vector Database How To November 2, 2023 Using hybrid search for gopher hunting with Elasticsearch and Go Learn how to achieve hybrid search by combining keyword and vector search using Elasticsearch and the Elasticsearch Go client. 
CR LS By: Carly Richmond and Laurent Saint-Félix Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using the Elasticsearch Go client for keyword search, vector search & hybrid search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/elasticsearch-go-client-for-keyword-vector-and-hybrid-search","meta_description":"This series explains how to use the Elasticsearch Go client for traditional keyword search, vector search and hybrid search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using Cohere embeddings with Elastic-built search experiences Elasticsearch now supports Cohere embeddings! This blog explains how to use Cohere embeddings with Elastic-built search experiences. Integrations Vector Database How To SC JB DK By: Serena Chou , Jonathan Buttner and Dave Kyle On April 11, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Elasticsearch open inference API adds support for Cohere Embeddings We're pleased to announce that Elasticsearch now supports Cohere embeddings! Releasing this capability has been a great journey of collaboration with the Cohere team, with more to come. Cohere is an exciting innovator in the generative AI space and we're proud to enable developers to use Cohere's incredible text embeddings with Elasticsearch as the vector database, to build semantic search use cases. This blog goes over Elastic's approach to shipping and explains how to use Cohere embeddings with Elastic-built search experiences. Elastic's approach to shipping: frequent, production ready iterations Before we dive in, if you're new to Elastic (welcome!), we've always believed in investing in our technology of choice (Apache Lucene) and ensuring contributions can be used as production grade capabilities, in the fastest release mode we can provide. Let's dig into what we've built so far, and what we will be able to deliver soon: In August 2023 we discussed our contribution to Lucene to enable maximum inner product and enable Cohere embeddings to be first class citizens of the Elastic Stack. This was contributed first into Lucene and released in the Elasticsearch 8.11 version . In that same release we also introduced the tech preview of our /_inference API endpoint which supported embeddings from models managed in Elasticsearch, but quickly in the following release, we established a pattern of integration with third party model providers such as Hugging Face and OpenAI . 
Cohere embeddings support is already available to customers participating in the preview of our stateless offering on Elastic Cloud and soon will be available in an upcoming Elasticsearch release for all. You'll need a Cohere account, and some working knowledge of the Cohere Embed endpoint . You have a choice of available models, but if you're just trying this out for the first time we recommend using the model embed-english-v3.0 or if you're looking for a multilingual variant try embed-multilingual-v3.0 with dimension size 1024. In Kibana , you'll have access to a console for you to input these next steps in Elasticsearch even without an IDE set up. When you choose to run this command in the console you should see a corresponding 200 for the creation of your named Cohere inference service. In this configuration we've specified that the embedding_type is byte which will be the equivalent to asking Cohere to return signed int8 embeddings. This is only a valid configuration if you're choosing to use a v3 model. You'll want to set up the mappings in the index to prepare for the storage of your embeddings that you will soon retrieve from Cohere. Elasticsearch vector database for Cohere embeddings In the definition of the mapping you will find an excellent example of another contribution made by the Elastic team to Lucene, the ability to use Scalar Quantization . Just for fun, we've posted the command you would see in our Getting Started experience that ingests a simple book catalog. At this point you have your books content in an Elasticsearch index and now you need to enable Cohere to generate embeddings on the documents! To accomplish this step, you'll be setting up an ingest pipeline which utilizes our inference processor to make the call to the inference service you defined in the first PUT request. If you weren't ingesting something as simple as this books catalog, you might be wondering how you'd handle token limits for the selected model. If you needed to, you could quickly amend your created ingest pipeline to chunk large documents , or use additional transformation tools to handle your chunking prior to first ingest. If you're looking for additional tools to help figure out your chunking strategy, look no further than these notebooks in Search Labs . Fun fact, in the near future, this step will be made completely optional for Elasticsearch developers. As was mentioned at the beginning of this blog, this integration we're showing you today is a firm foundation for many more changes to come. One of which will be a drastic simplification of this step, where you won't have to worry about chunking at all, nor the construction and design of an ingest pipeline. Elastic will handle those steps for you with great defaults! You're set up with your destination index, and the ingest pipeline, now it's time to reindex to force the documents through the step. Elastic kNN search for Cohere vector embeddings Now you're ready to issue your first vector search with Cohere embeddings. It's as easy as that. If you have already achieved a good level of understanding of vector search, we highly recommend you read this blog on running kNN as a query - which unlocks expert mode! This integration with Cohere is offered in Serverless and in Elasticsearch 8.13 . Happy Searching, and big thanks again to the Cohere team for their collaboration on this project! 
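The console commands walked through above are not reproduced in this extract. As a rough sketch only, the Cohere inference endpoint, the quantized vector mapping, and the inference ingest pipeline could be created along these lines with the Python client. The endpoint name cohere_embeddings, the index names, and the API keys are placeholders, and the exact client helpers and processor options may differ between Elasticsearch versions.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://your-deployment.es.example", api_key="...")  # placeholders

# 1) Inference endpoint backed by Cohere (embed-english-v3.0, byte embeddings).
#    Issued as a raw REST call to avoid depending on a specific client helper.
es.perform_request(
    "PUT",
    "/_inference/text_embedding/cohere_embeddings",
    headers={"accept": "application/json", "content-type": "application/json"},
    body={
        "service": "cohere",
        "service_settings": {
            "api_key": "<COHERE_API_KEY>",
            "model_id": "embed-english-v3.0",
            "embedding_type": "byte",
        },
    },
)

# 2) Destination index: a byte dense_vector sized for embed-english-v3.0 (1024 dims).
es.indices.create(
    index="cohere-embeddings",
    mappings={
        "properties": {
            "name": {"type": "text"},
            "text_embedding": {
                "type": "dense_vector",
                "dims": 1024,
                "element_type": "byte",
                "index": True,
                "similarity": "dot_product",
            },
        }
    },
)

# 3) Ingest pipeline whose inference processor calls the endpoint for each document.
es.ingest.put_pipeline(
    id="cohere_embeddings_pipeline",
    processors=[
        {
            "inference": {
                "model_id": "cohere_embeddings",
                "input_output": {"input_field": "name", "output_field": "text_embedding"},
            }
        }
    ],
)

# 4) Reindex the raw book catalog through the pipeline (source index is a placeholder).
es.reindex(
    source={"index": "books"},
    dest={"index": "cohere-embeddings", "pipeline": "cohere_embeddings_pipeline"},
)
```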
Looking to use more of Cohere's capabilities?: Read about our support for Cohere's Rerank 3 model Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Elasticsearch open inference API adds support for Cohere Embeddings Elastic's approach to shipping: frequent, production ready iterations Elasticsearch vector database for Cohere embeddings Elastic kNN search for Cohere vector embeddings Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using Cohere embeddings with Elastic-built search experiences - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-cohere-embeddings-support","meta_description":"Elasticsearch now supports Cohere embeddings! This blog explains how to use Cohere embeddings with Elastic-built search experiences."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog / Series Jira connector tutorials Learn how to integrate Elasticsearch with Jira using Elastic’s Jira native connector and explore optimization techniques. Part1 Integrations Ingestion +1 January 15, 2025 Elastic Jira connector tutorial part I Reviewing a use case for the Elastic Jira connector. We'll be indexing our Jira content into Elasticsearch to create a unified data source and do search with Document Level Security. GL By: Gustavo Llermaly Part2 Integrations Ingestion +1 January 16, 2025 Elastic Jira connector tutorial part II: Optimization tips After connecting Jira to Elasticsearch, we'll now review best practices to escalate this deployment. 
GL By: Gustavo Llermaly Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Jira connector tutorials - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/series/jira-connector-tutorials","meta_description":"Learn how to integrate Elasticsearch with Jira using Elastic’s Jira native connector and explore optimization techniques."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models. Generative AI GC QH TV By: Grégoire Corbière , Quentin Herreros and Thomas Veasey On July 13, 2023 Part of Series Improving information retrieval in the Elastic Stack Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In a previous blog post , we discussed common approaches to information retrieval and introduced the concepts of models and training stages. Here, we will examine benchmark solutions to compare various methods in a fair manner. Note that the task of benchmarking is not straightforward and can lead to misperceptions about how models perform in real-world scenarios. Historically, comparisons between BM25 and learned retrieval models have been based on limited data sets, or even only on the training data set of these dense models: MSMARCO, which may not provide an accurate representation of the models' performance on your data. Despite this approach being useful for demonstrating how well a dense model performs against BM25 in a specific domain, it does not capture one of BM25's key strengths: its ability to perform well in many domains without the need for supervised fine-tuning. Therefore, it may be considered unfair to compare these two methods using such a specific data set. The BEIR paper (\" BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models ,\" by Takhur et al. 2021) offers to address the issue of evaluating information retrieval methods in a generic setting. The paper proposes a framework using 18 publicly available data sets from a diverse range of topics to benchmark state-of-the-art retrieval systems. In this post, we use a subcollection of those data sets to benchmark BM25 against two dense models that have been specifically trained for retrieval. 
Then we will illustrate the potential gain achievable using fine-tuning strategies with one of those dense models. We plan to return to this benchmark in our next blog post, since it forms the basis of the testing we have done to enhance Elasticsearch relevance using language models in a zero-shot setting. BEIR data sets Performance can vary greatly between retrieval methods, depending on the type of query, document size, or topic. In order to assess the diversity of data sets and to identify potential blind spots in our benchmarks, a classification algorithm trained to recognize natural questions was used to understand queries typology. The results are summarized in Table 1. In our benchmarks, we choose not to include MSMARCO to solely emphasize performance in unfamiliar settings. Evaluating a model in a setting that is different from its training data is valuable when the nature of your use case data is unknown or resource constraints prevent adapting the model specifically. Search relevance metrics Selecting the appropriate metric is crucial in evaluating a model's ranking ability accurately. Of the various metrics available, three are commonly utilized for search relevance: Mean Reciprocal Rank (MRR) is the most straightforward metric. While it is easy to calculate, it only considers the first relevant item in the results list and ignores the possibility that a single query could have multiple relevant documents. In some instances, MRR may suffice, but it is often not precise enough. Mean Average Precision (MAP) excels in ranking lists and works well for binary relevance ratings (a document is either relevant or non-relevant). However, in data sets with fine-grained ratings, MAP is not able to distinguish between a highly relevant document and a moderately relevant document. Also, it is only appropriate if the list is reordered since it is not sensitive to order; a search engineer will prefer that the relevant documents appear first. Normalized Discounted Cumulative Gain (NDCG) is the most complete metric as it can handle multiple relevant documents and fine-grained document ratings. This is the metric we will examine in this blog and future ones. All of these metrics are applied to a fixed-sized list of retrieved documents. The list size can vary depending on the task at hand. For example, a preliminary retrieval before a reranking task might consider the top 1000 retrieved documents, while a single-stage retrieval might use a smaller list size to mimic a user's search engine behavior. We have chosen to fix the list size to the top 10 documents, which aligns with our use case. BM25 and dense models out-of-domain In our previous blog post , we noted that dense models, due to their training design, are optimized for specific data sets. While they have been shown to perform well on this particular data set, in this section we explore if they maintain their performance when used out-of-domain. To do this, we compare the performance of two state-of-the-art dense retrievers ( msmarco-distilbert-base-tas-b and msmarco-roberta-base-ance-fristp ) with BM25 in Elasticsearch using the default settings and English analyzer. Those two dense models both outperform BM25 on MSMARCO (as seen in the BEIR paper ), as they are trained specifically on this data set. However, they are usually worse out-of-domain. In other words, if a model is not well adapted to your specific data, it’s very likely that using kNN and dense models would degrade your retrieval performance in comparison to BM25. 
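Since NDCG is the metric used throughout these benchmarks, here is a small, self-contained sketch of how NDCG@10 can be computed for one query. The relevance grades and ranked list are made up for illustration, not taken from BEIR, and the linear-gain form of DCG is used.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain: each grade is discounted by log2 of its rank."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=10):
    """DCG normalized by the DCG of the ideal (best possible) ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0

# Graded relevance of the top 10 documents returned for one query (0 = irrelevant).
retrieved = [3, 2, 0, 1, 0, 0, 2, 0, 0, 1]
print(round(ndcg_at_k(retrieved, k=10), 3))
```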
Fine-tuning dense models The portrayal of dense models in the previous description isn't the full picture. Their performance can be improved by fine-tuning them for a specific use case with some labeled data that represents that use case. If you have a fine-tuned embedding model, the Elastic Stack is a great platform to both run the inference for you and retrieve similar documents using ANN search. There are various methods for fine-tuning a dense model, some of which are highly sophisticated. However, this blog post won't delve into those methods as it's not the focus. Instead, two methods were tested to gauge the potential improvement that can be achieved with not a lot of domain specific training data. The first method (FineTuned A) involved using labeled positive documents and randomly selecting documents from the corpus as negatives. The second method (FineTuned B) involved using labeled positive documents and using BM25 to identify documents that are similar to the query from BM25's perspective, but aren't labeled as positive. These are referred to as \"hard negatives.\" Labeling data is probably the most challenging aspect in fine-tuning. Depending on the subject and field, manually tagging positive documents can be expensive and complex. Incomplete labeling can also create problems for hard negatives mining, causing adverse effects on fine-tuning. Finally, changes to the topic or semantic structure in a database over time will reduce retrieval accuracy for fine-tuned models. Conclusion We have established a foundation for information retrieval using 13 data sets. The BM25 model performs well in a zero-shot setting and even the most advanced dense models struggle to compete on every data set. These initial benchmarks indicate that current SOTA dense retrieval cannot be used effectively without proper in-domain training. The process of adapting the model requires labeling work, which may not be feasible for users with limited resources. In our next blog, we will discuss alternative approaches for efficient retrieval systems that do not require the creation of a labeled data set. These solutions will be based on hybrid retrieval methods. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid retrieval Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. 
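The "hard negatives" idea described above (the FineTuned B strategy) lends itself to a short sketch: use BM25 to pull documents that score highly for a query but are not labeled positive. The index name, field name, and IDs below are placeholders, and this only illustrates the mining step, not the fine-tuning recipe evaluated in the post.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # hypothetical connection

def mine_hard_negatives(query_text, positive_ids, index="passages", k=10):
    """Return BM25 hits that look relevant to the query but are not labeled positive."""
    response = es.search(
        index=index,
        query={"match": {"text": query_text}},  # plain BM25 over a text field
        size=k + len(positive_ids),
    )
    hits = response["hits"]["hits"]
    return [h["_id"] for h in hits if h["_id"] not in positive_ids][:k]

# Example: one labeled query with two known positive passages.
hard_negatives = mine_hard_negatives(
    "what causes ocean tides", positive_ids={"doc-17", "doc-52"}
)
# The resulting (query, positive, hard-negative) triples would then feed fine-tuning.
```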
JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to BEIR data sets Search relevance metrics BM25 and dense models out-of-domain Fine-tuning dense models Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/improving-information-retrieval-elastic-stack-benchmarking-passage-retrieval","meta_description":"In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog LangChain4j with Elasticsearch as the embedding store LangChain4j (LangChain for Java) has Elasticsearch as an embedding store. Discover how to use it to build your RAG application in plain Java. Integrations Java How To DP By: David Pilato On October 8, 2024 Part of Series Introducing LangChain4j: Building RAG apps in plain Java Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In the previous post , we discovered what LangChain4j is and how to: Have a discussion with LLMs by implementing a ChatLanguageModel and a ChatMemory Retain chat history in memory to recall the context of a previous discussion with an LLM This blog post is covering how to: Create vector embeddings from text examples Store vector embeddings in the Elasticsearch embedding store Search for similar vectors Create embeddings To create embeddings, we need to define an EmbeddingModel to use. For example, we can use the same mistral model we used in the previous post . It was running with ollama: A model is able to generate vectors from text. Here we can check the number of dimensions generated by the model: To generate vectors from a text, we can use: Or if we also want to provide Metadata to allow us filtering on things like text, price, release date or whatever, we can use Metadata.from() . 
For example, we are adding here the game name as a metadata field: If you'd like to run this code, please checkout the Step5EmbedddingsTest.java class. Add Elasticsearch to store our vectors LangChain4j provides an in-memory embedding store. This is useful to run simple tests: But obviously, this could not work with much bigger dataset because this datastore stores everything in memory and we don't have infinite memory on our servers. So, we could instead store our embeddings into Elasticsearch which is by definition \"elastic\" and can scale up and out with your data. For that, let's add Elasticsearch to our project: As you noticed, we also added the Elasticsearch TestContainers module to the project, so we can start an Elasticsearch instance from our tests: To use Elasticsearch as an embedding store, you \"just\" have to switch from the LangChain4j in-memory datastore to the Elasticsearch datastore: This will store your vectors in Elasticsearch in a default index. You can also change the index name to something more meaningful: If you'd like to run this code, please checkout the Step6ElasticsearchEmbedddingsTest.java class. Search for similar vectors To search for similar vectors, we first need to transform our question into a vector representation using the same model we used previously. We already did that, so it's not hard to do this again. Note that we don't need the metadata in this case: We can build a search request with this representation of our question and ask the embedding store to find the first top vectors: We can iterate over the results now and print some information, like the game name which is coming from the metadata and the score: As we could expect, this gives us \"Out Run\" as the first hit: If you'd like to run this code, please checkout the Step7SearchForVectorsTest.java class. Behind the scene The default configuration for the Elasticsearch Embedding store is using the approximate kNN query behind the scene. But this could be changed by providing another configuration ( ElasticsearchConfigurationScript ) than the default one ( ElasticsearchConfigurationKnn ) to the Embedding store: The ElasticsearchConfigurationScript implementation runs behind the scene a script_score query using a cosineSimilarity function . Basically, when calling: This now calls: In which case the result does not change in term of \"order\" but just the score is adjusted because the cosineSimilarity call does not use any approximation but compute the cosine for each of the matching vectors: If you'd like to run this code, please checkout the Step7SearchForVectorsTest.java class. Conclusion We have covered how easily you can generate embeddings from your text and how you can store and search for the closest neighbours in Elasticsearch using 2 different approaches: Using the approximate and fast knn query with the default ElasticsearchConfigurationKnn option Using the exact but slower script_score query with the ElasticsearchConfigurationScript option The next step will be about building a full RAG application, based on what we learned here. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. 
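To make the "behind the scene" comparison above concrete, here is a rough sketch of the two query shapes involved: an approximate kNN search versus an exact script_score query using cosineSimilarity. The index name, field name, and vector values are placeholders, and this Python/query-DSL sketch only mirrors the idea; the actual requests are issued for you by the LangChain4j Elasticsearch embedding store.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")   # hypothetical connection
question_vector = [0.12, -0.07, 0.33]         # embedding of the question (placeholder values)

# Approximate, fast: HNSW-backed kNN (roughly what ElasticsearchConfigurationKnn relies on).
approximate = es.search(
    index="games",
    knn={
        "field": "vector",
        "query_vector": question_vector,
        "k": 3,
        "num_candidates": 10,
    },
)

# Exact, slower: script_score computes the cosine against every matching vector
# (roughly what ElasticsearchConfigurationScript relies on).
exact = es.search(
    index="games",
    query={
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                "source": "cosineSimilarity(params.query_vector, 'vector') + 1.0",
                "params": {"query_vector": question_vector},
            },
        }
    },
)
```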
EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Create embeddings Add Elasticsearch to store our vectors Search for similar vectors Behind the scene Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"LangChain4j with Elasticsearch as the embedding store - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/langchain4j-elasticsearch-embedding-store","meta_description":"LangChain4j (LangChain for Java) has Elasticsearch as an embedding store. Learn how to use LangChain4j & Elasticsearch to build a RAG app in Java."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Understanding fused multiply-add (FMA) within vector similarity computations in Lucene Learn how to use fused multiply-add (FMA) within vector similarity computations in Lucene and discover how FMA can improve performance. Lucene Vector Database CH By: Chris Hegarty On November 20, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In Lucene 9.7.0 we added support that leverages SIMD instructions to perform data-parallelization of vector similarity computations. Now we’re pushing this even further with the use of Fused Multiply-Add (FMA). What is fused multiply-add (FMA) Multiply and add is a common operation that computes the product of two numbers and adds that product with a third number. These types of operations are performed over and over during vector similarity computations. a ∗ b + c a * b + c a ∗ b + c Fused multiply-add (FMA) is a single operation that performs both the multiply and add operations in one - the multiplication and addition are said to be “fused” together. 
FMA is typically faster than a separate multiplication and addition because most CPUs model it as a single instruction. FMA also produces more accurate results. Separate multiply and add operations on floating-point numbers involve two roundings: one for the multiplication and one for the addition, since they are separate instructions that need to produce separate results. That is effectively round(round(a * b) + c). Whereas FMA has a single rounding, which applies only to the combined result of the multiplication and addition. That is effectively round(a * b + c). Within the FMA instruction, a * b produces an infinite-precision intermediate result that is added with c before the final result is rounded. This eliminates one rounding step, compared to separate multiply and add operations, which results in more accuracy. FMA in Lucene: Under the hood So what has actually changed? In Lucene we have replaced the separate multiply and add operations with a single FMA operation. The scalar variants now use Math::fma , while the Panama vectorized variants use FloatVector::fma . If we look at the disassembly we can see the effect that this change has had. Previously we saw this kind of code pattern for the Panama vectorized implementation of dot product. The vmovdqu32 instruction loads 512 bits of packed doubleword values from a memory location into the zmm0 register. The vmulps instruction then multiplies the values in zmm0 with the corresponding packed values from a memory location, and stores the result in zmm0 . Finally, the vaddps instruction adds the 16 packed single precision floating-point values in zmm0 with the corresponding values in zmm4 , and stores the result in zmm4 . With the change to use FloatVector::fma , we see the following pattern: Again, the first instruction is similar to the previous example, where it loads 512 bits of packed doubleword values from a memory location into the zmm0 register. The vfmadd231ps instruction (this is the FMA instruction) multiplies the values in zmm0 with the corresponding packed values from a memory location, adds that intermediate result to the values in zmm4 , performs rounding, and stores the resulting 16 packed single precision floating-point values in zmm4 . The vfmadd231ps instruction is doing quite a lot! It's a clear signal of intent to the CPU about the nature of the computations that the code is running. Given this, the CPU can make smarter decisions about how this is done, which typically results in improved performance (and accuracy as previously described). Performance improvements with FMA In general, the use of FMA typically results in improved performance. But as always, you need to benchmark! Thankfully, Lucene deals with quite a bit of complexity when determining whether to use FMA or not, so you don't have to: things like whether the CPU even has support for FMA, whether FMA is enabled in the Java Virtual Machine, and only enabling FMA on architectures that have proven to be faster than separate multiply and add operations. As you can probably tell, this heuristic is not perfect, but it goes a long way toward making the out-of-the-box experience good. While accuracy is improved with FMA, we see no negative effect on pre-existing similarity computations when FMA is not enabled. Along with the use of FMA, the suite of vector similarity functions got some (more) love. 
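The single-rounding behaviour described earlier can be reproduced in a few lines. This sketch emulates a fused multiply-add by computing a * b + c exactly (with Fraction) and rounding once, then compares it with the twice-rounded separate operations; the specific inputs are chosen only so that the one-unit difference is visible, and this is an illustration of the arithmetic rather than Lucene's code.

```python
from fractions import Fraction

a = b = float(2**27 + 1)   # exactly representable as a double
c = -float(2**54)

# Separate multiply and add: a*b is rounded first, then the sum is rounded again.
separate = a * b + c                                    # 268435456.0 -- the trailing +1 is lost

# Emulated FMA: form the exact product-and-sum, then round once at the end.
fused = float(Fraction(a) * Fraction(b) + Fraction(c))  # 268435457.0

print(separate, fused)  # the single rounding preserves one extra unit of precision
```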
All of dot product, square, and cosine distance, both the scalar and Panama vectorized variants have been updated. Optimizations have been applied based on the inspection of disassembly and empirical experiments, which have brought improvements that help fill the pipeline keeping the CPU busy; mostly through more consistent and targeted loop unrolling, as well as removal of data dependencies within loops. It’s not straightforward to put concrete performance improvement numbers on this change, since the effect spans multiple similarity functions and variants, but we see positive throughput improvements, from single digit percentages in floating-point dot product, to higher double digit percentage improvements in cosine. The byte based similarity functions also show similar throughput improvements. Wrapping up In Lucene 9.7.0, we added the ability to enable an alternative faster implementation of the low-level primitive operations used by Vector Search through SIMD instructions. In the upcoming Lucene 9.9.0 we built upon this to leverage faster FMA instructions, as well as to apply optimization techniques more consistently across all the similarity functions. Previous versions of Elasticsearch are already benefiting from SIMD, and the upcoming Elasticsearch 8.12.0 will have the FMA improvements. Finally, I'd like to call out Lucene PMC member Robert Muir for continuing to make improvements in this area, and for the enjoyable and productive collaboration. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What is fused multiply-add (FMA) FMA in Lucene: Under the hood Performance improvements with FMA Wrapping up Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Understanding fused multiply-add (FMA) within vector similarity computations in Lucene - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vector-similarity-computations-fma-style","meta_description":"Learn how to use fused multiply-add (FMA) within vector similarity computations in Lucene and discover how FMA can improve performance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elasticsearch Relevance Engine (ESRE) — Advanced search for the AI revolution Explore the Elasticsearch Relevance Engine (ESRE) by Elastic. ESRE powers gen AI solutions for private data sets with a vector database and ML models for semantic search. ML Research MR By: Matt Riley On June 21, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We’ve introduced the Elasticsearch Relevance Engine (ESRE) , new capabilities for creating highly relevant AI search applications. ESRE builds on Elastic’s leadership in search and over two years of machine learning research and development. The Elasticsearch Relevance Engine combines the best of AI with Elastic’s text search. ESRE gives developers a full suite of sophisticated retrieval algorithms and the ability to integrate with large language models (LLMs). Even better, it’s accessible via a simple, unified API that Elastic’s community already trusts, so developers around the world can start using it immediately to elevate search relevance. The Elasticsearch Relevance Engine’s configurable capabilities can be used to help improve relevance by: Applying advanced relevance ranking features including BM25f, a critical component of hybrid search Creating, storing, and searching dense embeddings using Elastic’s vector database Processing text using a wide range of natural language processing (NLP) tasks and models Letting developers manage and use their own transformer models in Elastic for business specific context Integrating with third-party transformer models such as OpenAI’s GPT-3 and 4 via API to retrieve intuitive summarization of content based on the customer’s data stores consolidated within Elasticsearch deployments Enabling ML-powered search without training or maintaining a model using Elastic’s out-of-the-box Learned Sparse Encoder model to deliver highly relevant, semantic search across a variety of domains Easily combining sparse and dense retrieval using Reciprocal Rank Fusion (RRF), a hybrid ranking method that gives developers control to optimize their AI search engine to their unique mix of natural language and keyword query types Integrating with third-party tooling such as LangChain to help build sophisticated data pipelines and generative AI applications The evolution of search has been driven by a constant need to improve relevance and the ways in which we interact with search applications. Highly relevant search results can lead to increased user engagement on search apps with significant downstream impacts on both revenue and productivity. In the new world of LLMs and generative AI, search can go even further — understanding user intent to provide a level of specificity in responses that’s never been seen before. 
Notably, every search advancement delivers better relevance while addressing new challenges posed by emerging technologies and changing user behaviors. Whether expanding on keyword search to offer semantic search or enabling new search modalities for video and images, new technology requires unique tools to deliver better experiences for search users. By the same token, today’s world of artificial intelligence calls for a new, highly scalable developer toolkit that’s been built on a tech stack with proven, customer-tested capabilities. With generative AI’s momentum and increased adoption of technologies like ChatGPT, as well as growing awareness of large language model capabilities, developers are hungry to experiment with technology to improve their applications. The Elasticsearch Relevance Engine ushers in a new age of capabilities in the world of generative AI and meets the day with powerful tools that any developer team can use right away. The Elasticsearch Relevance Engine is available on Elastic Cloud — the only hosted Elasticsearch offering to include all of the new features in this latest release. You can also download the Elastic Stack and our cloud orchestration products, Elastic Cloud Enterprise and Elastic Cloud for Kubernetes, for a self-managed experience. ChatGPT and Elasticsearch Elastic Learned Sparse Encoder blog Accessing machine learning models in Elastic Privacy-first AI search using LangChain and Elasticsearch Overcoming the limitations of generative AI models The Elasticsearch Relevance Engine is well positioned to help developers evolve quickly and address these challenges of natural language search, including generative AI. Enterprise data/context aware: The model might not have sufficient internal knowledge relevant to a particular domain. This stems from the data set that the model is trained on. In order to tailor the data and content that LLMs generate, enterprises need a way to feed models proprietary data so they can learn to furnish more relevant, business-specific information. Superior relevance: The Elasticsearch Relevance Engine makes integrating data from private sources as simple as generating and storing vector embeddings to retrieve context using semantic search. Vector embeddings are numerical representations of words, phrases, or documents that help LLMs understand the meanings of words and their relationships. These embeddings enhance transformer model output at speed and scale. ESRE also lets developers bring their own transformer models into Elastic or integrate with third-party models. We also realized that the emergence of late interaction models allows us to provide this out of the box — without the need for extensive training or fine tuning on third-party data sets. Since not every development team has the resources nor expertise to train and maintain machine learning models nor understand the trade-offs to scale, performance, and speed, the Elasticsearch Relevance Engine also includes Elastic Learned Sparse Encoder, a retrieval model built for semantic search across diverse domains. The model pairs sparse embeddings with traditional, keyword-based BM25 search for an easy to use Reciprocal Rank Fusion (RRF) scorer for hybrid search. ESRE gives developers machine learning-powered relevance and hybrid search techniques on day one. Privacy and security: Data privacy is central to how enterprises use and securely pass proprietary data over a network and between components, even when building innovative search experiences. 
Elastic includes native support for role-based and attribute-based access control to ensure that only those roles with access to data can see it, even for chat and question answering applications. Elasticsearch can support your organization’s need to keep certain documents accessible to privileged individuals, helping your organization to maintain universal privacy and access controls across all of your search applications. When privacy is of the utmost concern, keeping all data within your organization’s network can be not only paramount, but obligatory. From allowing your organization to implement deployments that are in an air-gapped environment to supporting access to secure networks, ESRE provides the tools you need to help your organization keep your data safe. Size and cost: Using large language models can be prohibitive for many enterprises due to data volumes and required computing power and memory. Yet businesses that want to build their own generative AI apps like chatbots need to marry LLMs with their private data. The Elasticsearch Relevance Engine gives enterprises the engine to deliver relevance efficiently with precision context windows that help reduce the data footprint without hassle and expense. Out of date: The model is frozen in time at the point when training data is collected. So the content and data that generative AI models create is only as fresh as data they’re trained on. Integrating corporate data is an inherent need to power timely results from LLMs. Hallucinations: When answering questions or conversing with the model, it may invent facts that sound trustworthy and convincing, but are in-fact projections that aren’t factual. This is another reason that grounding LLMs with contextual, customized knowledge is so critical to making models useful in a business context. The Elasticsearch Relevance Engine lets developers link to their own data stores via a context window in generative AI models. The search results added can provide up-to-date information that’s from a private source or specialized domain, and therefore can return more factual information when prompted instead of relying solely on a model’s so-called \"parametric\" knowledge. Supercharged by a vector database The Elasticsearch Relevance Engine includes a resilient, production grade vector database by design. It gives developers a foundation on which to build rich, semantic search applications. Using Elastic’s platform, development teams can use dense vector retrieval to create more intuitive question-answering that’s not constrained to keywords nor synonyms. They can build multimodal search using unstructured data like images, and even model user profiles and create matches to personalize search results in product and discovery, job search, or matchmaking applications. These NLP transformer models also enable machine learning tasks like sentiment analysis, named entity recognition, and text classification. Elastic’s vector database lets developers create, store, and query embeddings that are highly scalable and performant for real production applications. Elasticsearch excels at high-relevance search retrieval. With ESRE, Elasticsearch provides context windows for generative AI linked to an enterprise’s proprietary data, allowing developers to build engaging, more accurate search experiences. Search results are returned according to a user’s original query, and developers can pass that data on to the language model of their choice to provide an answer with added context. 
Elastic supercharges question-answer and personalization capabilities with relevant contextual data from your enterprise content store that’s private and tailored to your business. Delivering superior relevance out-of-the-box for all developers With the release of the Elasticsearch Relevance Engine, we’re making Elastic’s proprietary retrieval model readily available. The model is easy to download and works with our entire catalog of ingestion mechanisms like the Elastic web crawler , connectors or API. Developers can use it out of the box with their searchable corpus, and it’s small enough to fit within a laptop’s memory. Elastic’s Learned Sparse Encoder provides semantic search across domains for search use cases such as knowledge bases, academic journals, legal discovery, and patent databases to deliver highly relevant search results without the need to adapt or train it. Most real-world testing shows hybrid ranking techniques are producing the most relevant search result sets. Until now, we've been missing a key component — RRF. We're now including RRF for your application searching needs so you can pair vector and textual search capabilities. Machine learning is on the leading edge of enhancing search result relevance with semantic context, but too often its cost, complexity, and resource demands make it insurmountable for developers to implement it effectively. Developers commonly need the support of specialized machine learning or data science teams to build highly relevant AI-powered search. These teams spend considerable time selecting the right models, training them on domain-specific data sets, and maintaining models as they evolve due to changes in data and its relationships. Learn how Go1 uses Elastic’s vector database for scalable, semantic search . Developers who don’t have the support of specialized teams can implement semantic search and benefit from AI-powered search relevance from the start without the effort and expertise required for alternatives. Starting today, all customers have the building blocks to help achieve better relevance and modern, smarter search. Try it out Read about these capabilities and more . Existing Elastic Cloud customers can access many of these features directly from the Elastic Cloud console . Not taking advantage of Elastic on cloud? See how to use Elasticsearch with LLMs and generative AI . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch, Elasticsearch Relevance Engine, ESRE, Elastic Learned Sparse Encoder and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. 
GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Overcoming the limitations of generative AI models Supercharged by a vector database Delivering superior relevance out-of-the-box for all developers Try it out Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing Elasticsearch Relevance Engine (ESRE) — Advanced search for the AI revolution - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/introducing-elasticsearch-relevance-engine-esre","meta_description":"Explore the Elasticsearch Relevance Engine (ESRE) by Elastic. ESRE powers gen AI solutions for private data sets with a vector database and ML models for semantic search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog MAXSCORE & block-max MAXSCORE: More skipping with block-max MAXSCORE Learn about MAXSCORE, block-max MAXSCORE & WAND. Improve the MAXSCORE algorithm to evaluate disjunctive queries more like a conjunctive query. Lucene AG By: Adrien Grand On December 6, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to WAND and MAXSCORE How do you quickly identify the top-k matches of a disjunctive query using an inverted index? This is the problem that the WAND 1 and MAXSCORE 2 algorithms try to solve. These two algorithms are based on the same idea: taking advantage of the maximum impact score (the maximum contribution of a particular term to the overall score) for each term, so as to skip hits whose score cannot possibly compare greater than the score of the current k-th top hit (referred to as \"minimum competitive score\" below). 
For instance, if you are searching for hits that contain the or fox , and the has a maximum contribution of 0.2 to the score while the minimum competitive score is 0.5, then there is no point in evaluating hits that only contain the anymore, as they have no chance of making it to the top hits. However, WAND and MAXSCORE come with different performance characteristics: WAND typically evaluates fewer hits than MAXSCORE but with a higher per-hit overhead. This makes MAXSCORE generally perform better on high k values or with many terms - when skipping hits is hard, and WAND perform better otherwise. Block-max MAXSCORE & MAXSCORE in Lucene While Lucene first implemented a variant of WAND called block-max WAND , it later got attracted to the lower overhead of MAXSCORE and started using block-max MAXSCORE for top-level disjunctions in July 2022 (annotation EN in Lucene's nightly benchmarks). The MAXSCORE algorithm is rather simple: it sorts terms by increasing maximum impact score and partitions them into two groups: essential terms and non-essential terms. Non-essential terms are terms with low maximum impact scores whose sum of maximum scores is less than the minimum competitive score. Essential terms are all other terms. Essential terms are used to find candidate matches while non-essential terms are only used to compute the score of a candidate. MAXSCORE usage example Let's take an example: you are searching for the quick fox , and the maximum impact scores of each term are respectively 0.2 for the , 0.5 for quick and 1 for fox . As you start collecting hits, the minimum competitive score is 0, so all terms are essential and no terms are non-essential. Then at some point, the minimum competitive score reaches e.g. 0.3, meaning that a hit that only contains the has no chance of making it to the top-k hits. the moves from the set of essential terms to the set of non-essential terms, and the query effectively runs as (the) +(quick fox) . The + sign here is used to express that a query clause is required, such as in Lucene's classic query parser . Said another way, from that point on, the query will only match hits that contain quick or fox and will only use the to compute the final score. The below table summarizes cases that MAXSCORE considers: Minimum competitive score interval Query runs as [0, 0.2] +(the quick fox) (0.2, 0.7] (the) +(quick fox) (0.7, 1.7] (the quick) +(fox) (1.7, +Infty) No more matches The last case happens when the minimum competitive score is greater than the sum of all maximum impact scores across all terms. It typically never happens with regular MAXSCORE, but may happen on some blocks with block-max MAXSCORE. Improving MAXSCORE to intersect terms Something that WAND does better than MAXSCORE is to progressively evaluate queries less and less as a disjunction and more and more as a conjunction as the minimum competitive score increases, which yields more skipping. This raised the question of whether MAXSCORE can be improved to also intersect terms? The answer is yes: for instance if the minimum competitive score is 1.3, then a hit cannot be competitive if it doesn't match both quick and fox . 
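To make that check concrete, here is a small illustrative Python sketch (not Lucene's actual implementation) that, for the example's maximum impact scores, works out which terms are non-essential and which terms become required as the minimum competitive score grows:

```python
# Illustrative sketch of the MAXSCORE partitioning logic described above.
# This is not Lucene code; it only mirrors the reasoning on the
# "the quick fox" example with maximum impact scores 0.2, 0.5 and 1.0.

def partition_terms(max_scores, min_competitive_score):
    """Return (non_essential, essential, required) term sets, or None if no
    hit can be competitive anymore."""
    total = sum(max_scores.values())
    if min_competitive_score > total:
        return None

    # Non-essential terms: the lowest-scoring terms whose summed maximum
    # impact scores stay below the minimum competitive score.
    non_essential, acc = set(), 0.0
    for term, score in sorted(max_scores.items(), key=lambda kv: kv[1]):
        if acc + score < min_competitive_score:
            non_essential.add(term)
            acc += score
    essential = set(max_scores) - non_essential

    # Required terms (the intersection improvement): a term is required if,
    # even with every other term matching, a hit that misses it cannot reach
    # the minimum competitive score.
    required = {t for t, s in max_scores.items()
                if total - s < min_competitive_score}
    return non_essential, essential, required


max_scores = {"the": 0.2, "quick": 0.5, "fox": 1.0}
for mcs in (0.0, 0.3, 1.3, 1.6, 1.8):
    print(mcs, partition_terms(max_scores, mcs))
# 0.3 -> "the" is non-essential; candidates come from "quick" or "fox"
# 1.3 -> both "quick" and "fox" are required: the query runs mostly as a conjunction
# 1.8 -> None: no more competitive hits
```

Running it for a few minimum competitive scores reproduces the transitions captured in the table that follows.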
So we modified our block-max MAXSCORE implementation to consider the following cases instead: Minimum competitive score interval Query runs as [0, 0.2] +(the quick fox) (0.2, 0.7] (the) +(quick fox) (0.7, 1.2] (the quick) +(fox) (1.2, 1.5] (the) +quick +fox (1.5, 1.7] +the +quick +fox (1.7, +Infty) No more matches Now the interesting question is whether these new cases we added are likely to occur or not? The answer depends on how good your score upper bounds are, your actual k value, whether terms actually have matches in common, etc., but it seems to kick in especially often in practice on queries that either have two terms, or that combine two high-scoring terms and zero or more low-scoring terms (e.g. stop words), such as the query we looked at in the above example. This is expected to cover a sizable number of queries in many query logs. Implementing this optimization yielded a noticeable improvement on Lucene's nightly benchmarks (annotation FU), see OrHighHigh (11% speedup) and OrHighMed (6% speedup). It was released in Lucene 9.9 and should be included in Elasticsearch 8.12. We hope you'll enjoy the speedups! Footnotes Broder, A. Z., Carmel, D., Herscovici, M., Soffer, A., & Zien, J. (2003, November). Efficient query evaluation using a two-level retrieval process. In Proceedings of the twelfth international conference on Information and knowledge management (pp. 426-434). ↩ Turtle, H., & Flood, J. (1995). Query evaluation: strategies and optimizations. Information Processing & Management, 31(6), 831-850. ↩ Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Introduction to WAND and MAXSCORE Block-max MAXSCORE & MAXSCORE in Lucene MAXSCORE usage example Improving MAXSCORE to intersect terms Footnotes Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"MAXSCORE & block-max MAXSCORE: More skipping with block-max MAXSCORE - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/more-skipping-with-bm-maxscore","meta_description":"Learn about MAXSCORE, block-max MAXSCORE & WAND. Improve the MAXSCORE algorithm to evaluate disjunctive queries more like a conjunctive query."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data from AWS S3 into Elastic Cloud - Part 2 : Elastic Agent Learn about different options to ingest data from AWS S3 into Elastic Cloud. This blog covers how to ingest data from AWS S3 using Elastic Agent. Ingestion How To HL By: Hemendra Singh Lodhi On October 10, 2024 Part of Series How to ingest data from AWS S3 into Elastic Cloud Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. This is the second installment in a multi-part blog series exploring different options for ingesting data from AWS S3 into Elastic Cloud. Check out the other parts of the series: Part 1: Elastic Serverless Forwarder Part 3: Elastic S3 Connector In this blog we will learn about how to ingest data from AWS S3 using Elastic Agent. Note 1: See different options comparison in Part 1 : Elastic Serverless Forwarde r Note 2: Elastic Cloud deployment is a prerequisite to follow along the steps described below. Elastic Cloud Check the Part 1 : Elastic Serverless Forwarder of the blog series on how to get started with Elastic Cloud. Skip this if you already have a active deployment. Elastic Agent for AWS S3 data ingestion Another option to ingest data from AWS S3 is using Elastic Agent . Elastic Agent is a single, unified way to ingest data such as logs, metrics. Elastic agent is installed on an instance such as EC2 and using integrations can connect to the AWS services such as S3 and can forward the data to Elasticsearch. High level Elastic Agent working: A policy is created which is like a manifest file and consist of instructions for agent. In the policy integrations are added which are essentialy modules consists of assets such as configs, mappings, dashboards etc. Agents are installed with the required policy. Agent will perform ingestion action based on the integrations. Features of Elastic Agent Ships both Logs & Metrics Support data transfer over AWS PrivateLink Support all integrations and agent can be managed using Fleet (comes default with Elastic Cloud) Agents needs to be installed and maintaned and there is no autoscaling. Using Fleet can simplify the agent maintenance. Good performance out of the box and performance parameters can be configured to use performance presets . Preset can be used depending on the data type and ingestion requirement. 
More about Fleet server scalability here Cost is of EC2 instance for agent installation and for SQS notification Data flow High level data flow for Elastic Agent based data ingestion: VPC flow log is configured to write to S3 bucket Once log is written to S3 bucket, S3 event notifications is sent to SQS Elastic agent polls SQS queue for new message. Based on the metadata in the message it reads the log data from S3 bucket and send it to Elasticsearch SQS is recommeded for performance so that agent can read only the new updated objects in S3 bucket instead of polling entire bucket each time Set up For Steps (1)-(2), follow the details from Part 1 : Elastic Serverless Forwarder : 1. Create S3 Bucket to store VPC flow logs 2. Enable VPC Flow logs and send to S3 bucket created above 3. Create SQS queue with default settings Note: Create SQS queue in same region as S3 bucket Provide queue name sqs-vpc-flow-logs-elastic-agent and keep the other setting as default: Update the SQS Access Policy (Advance) to allow s3 bucket to send notification to SQS queue. Replace account-id with your AWS account id. Keep other options as default. Here, we are specifying S3 to send message to SQS queue (ARN) from the S3 bucket: Note the SQS URL, in queue setting under Details: 4. Enable VPC flow log event notification in S3 bucket Go to S3 bucket s3-vpc-flow-logs-elastic -> Properties and Create event notification Provide name and on what event type you want to trigger SQS. We have selected object create when any object is added to the bucket: Select destination as SQS queue and choose sqs-vpc-flow-logs-elastic-agent : Once saved, configuration will look like below: Confirm VPC flow logs are published in S3 bucket: Confirm S3 event notification is sent to SQS queue: 5. Install Elastic Agent on EC2 instance Launch an EC2 instance To get the installation commands, Go to: Kibana -> Fleet -> Add Agent Create new agent policy aws-vpc-flow-logs-s3-policy and click Create Policy. Once policy is created, copy the instruction to install Elastic Agent . Leave other settings as default: Login to EC2 instance and run the commands: Upon successful completion, status will be updated on fleet page: Update policy aws-vpc-flow-logs-s3-policy with aws integration. This will push aws integration configuration to the agent which is subscribed to this policy. More on how fleet and agent work together is here . Kibana -> Fleet -> Agent policies. Select the policy aws-vpc-flow-logs-s3-policy and click Add integration. This will take you to the integration page search for AWS integration. Choosing AWS integration is better if you want monitor more than 1 AWS service: Provide AWS Access Key ID and Secret Access Key for authentication and allow Elastic Agent to read from AWS services. There are other authentication options available. Details here . Namespace option is used to segregate the data based on environment or any other identifier: Toggle off other services and use Collect VPC flow logs from S3 . Update S3 bucket and SQS queue URL copied earlier. Leave advance settings as default: Scroll down and click Existing hosts option as we have already intalled the agent and select the policy aws-vpc-flow-logs-s3-policy . Save and continue. This will push the configured integration to Elastic Agent: Go to Kibana -> Fleet -> Agent policies and policy aws-vpc-flow-logs-s3-policy is updated with AWS integration. After a couple of minutes, you can validate flow logs are ingested from S3 into Elastic. Go to Kibana -> Discover: 6. 
Monitor VPC flow logs in Kibana dashboards Integrations comes with assets such as dashboard which are pre-built for common use cases. Go to Kibana -> Dashboard and search for VPC Flow logs: More dashboards! As promised, here are few dashboards that can help monitor AWS services used in our setup using the Elastic agent ingestion method. This will help in tracking usage and help in optimisation. We will use the same setup used in the Elastic Agent data ingestion option to configure settings and populate dashboards. Go to Kibana -> Fleet -> aws-vpc-flow-logs-s3-policy . Select AWS integration and toggle on the required service and fill in the details. Some of the interesting Dashboards: Note: All dashboards are available under Kibana->Analytics->Dashboards [Metrics AWS] Lambda overview If you have implemented ingestion using Elastic Serverless Forwarder, then you can use this dashboard to track AWS Lambda metrics. It mainly shows Lambda function duration, errors, and any function throttling: [Metrics AWS] S3 overview This dashboard outlines S3 usage and helps in monitoring bucket size, number of objects, etc. This can help in optimisation of S3 usage by tracking stale buckets and objects: [Logs AWS] S3 server access log overview This dashboard shows S3 server access logging and provides detailed records for the requests that are made to a bucket. This can be useful in security and access audits and can also help in learning how users access your S3 buckets and objects: [Metrics AWS] Usage overview This dashboard shows the general usage of AWS services and highlights API usage against AWS services. This can help in understanding the service usage and potential optimisation: [Metrics AWS] Billing overview This dashboard shows the billing usage by service and helps monitor how many $$ are spent for the services: [Metrics AWS] SQS overview This dashboard shows SQS queues utilisation showing messages sent, received and any delay in sending messages. This is important in monitoring the SQS queues for any issues as it is an important component in the architecture. Any issues with SQS can potentially cause delay in data ingestion: [Metrics AWS] EC2 overview If you are using the Elastic agent ingestion method, then you can monitor the utilisation of the EC2 instance for CPU, memory, disk, etc. hosting the Elastic agent, which can be helpful in sizing the instance if there is a high traffic load. This can also be used for your other EC2 instances: [Elastic Agent] S3 input metrics This dashboard shows the detailed utilisation of Elastic agent showing how Elastic agent is processing S3 inputs and monitoring interaction with SQS and S3. The dashboard shows aggregated metrics of the Elastic agent on reading SQS messages and S3 objects and forwarding them to Elasticsearch. Together with the [Metrics AWS] EC2 Overview dashboard, this can help in understanding the utilisation of EC2 and Elastic agent and can potentially helps in scaling these components: Conclusion Elasticsearch provides multiple options to sync data from AWS S3 into Elasticsearch deployments. In this walkthrough, we have demonstrated that it is relatively easy to implement Elastic Agent ingestion options and leverage Elastic's industry-leading search capabilities. In Part 3 of this series , we'll dive into using Elastic S3 Native Connector as another option for ingesting AWS S3 data. Don't forget to check out Part 1 : Elastic Serverless Forwarder of the series and Part 3: Elastic S3 Connector . 
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Elastic Cloud Elastic Agent for AWS S3 data ingestion Features of Elastic Agent Data flow Set up Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to ingest data from AWS S3 into Elastic Cloud - Part 2 : Elastic Agent - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ingest-aws-s3-data-elastic-cloud-elastic-agent","meta_description":"Learn how to ingest data from AWS S3 into Elastic Cloud using Elastic Agent. A tutorial with setup steps and dashboards for monitoring."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Hybrid search with multiple embeddings: A fun and furry search for cats! A walkthrough of how to implement different types of search - lexical, vector and hybrid - on multiple embeddings (text and image). It uses a simple and playful search application on cats. Vector Database How To JL By: Jo Ann de Leon On October 31, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Did you know that Elastic can be used as a powerful vector database? In this blog, we’ll explore how to generate, store, and query vector embeddings alongside traditional lexical search. Elastic’s strength lies in its flexibility and scalability, making it an excellent choice for modern search use cases. By integrating vector embeddings with Elastic, you can improve search relevance, and enhance search capabilities across various data types—including non-textual documents like images. But it gets even better! 
Learning Elastic’s search features can be fun too. In this article, we’ll show you how to search for your favorite cats using Elastic to search both text descriptions and images of cats. Through a simple Python app that accompanies this article, you’ll learn how to implement both vector and keyword-based searches. We’ll guide you through generating your own vector embeddings, storing them in Elastic and running hybrid queries - all while searching for adorable feline friends. Whether you're an experienced developer or new to Elasticsearch, this fun project is a great way to understand how modern search technologies work. Plus, if you love cats, you'll find it even more engaging. So let’s dive in and set up the Elasticats app while exploring Elasticsearch’s powerful capabilities. Before we begin, let’s make sure that you have your Elastic cloud ID and API key ready. Make a copy of the .env-template file , save it as .env and plug in your Elastic cloud credentials. Application architecture Here’s a high-level diagram that depicts our application architecture: Generating and storing vector embeddings Before we can perform any type of search, we first need to have data. Our data.json contains the list of cat documents that we will index in Elasticsearch. Each document describes a cat and has the following mappings: Each cat’s photo property points to the location of the cat’s image. When we call the reindex function in our application, it will generate two embeddings: 1. First is a vector embedding for each cat’s image. We used the clip-ViT-B-32 model. Image models allow you to embed images and text into the same vector space. This allows you to implement image search either as text-to-image or image-to-image search. 2. The second embedding is for the summary text about each cat that is up for adoption. We used a different model which is all-MiniLM-L6-v2. We then store the embeddings as part of our documents. We’re now ready to call the reindex function. From the terminal, run the following command: We can now run our web application: Our initial form looks like this: As you can see, we have exposed some of the keywords as filters (e.g. age, gender, size, etc.) that we will use as part of our queries. Executing different types of searches The following workflow diagram shows the different search paths available in our web application. We’ll walk through each scenario. Lexical search The simplest scenario is a “match all” query which basically returns all cats in our index. We don’t use any of the filters nor enter a description or upload an image. If any of the filters were supplied in the form, then we perform a boolean query. In this scenario, no description is entered so we’re applying the filters in our “match all” query. Vector search In our web form, we are able to upload a similar image of a cat(s). By uploading an image, we can do a vector search by transforming the uploaded image into an embedding and then performing a knn search on the image embeddings that were previously stored. First, we save the uploaded image in an uploads folder. We then create a knn query for the image embedding. Notice that the vector search can be performed with or without the filters (from the boolean query). Also, note that k=5 which means that we’re only returning the top 5 similar documents (cats). 
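As a rough sketch of what that kNN query can look like with the Python client (this is not the app's exact code; the index name "cats", the connection details and the returned fields are assumptions, while clip-ViT-B-32 and the img_embedding field come from the description above):

```python
# Minimal sketch of the image kNN search described above.
# Assumed: an index called "cats" with an "img_embedding" dense_vector field.
from PIL import Image
from sentence_transformers import SentenceTransformer
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")
img_model = SentenceTransformer("clip-ViT-B-32")  # same model used at index time

# Embed the uploaded image with the same CLIP model used for indexing.
query_vector = img_model.encode(Image.open("uploads/query_cat.jpg"))

knn = {
    "field": "img_embedding",
    "query_vector": query_vector.tolist(),
    "k": 5,                 # return only the 5 most similar cats
    "num_candidates": 50,   # candidates considered per shard
    # Optionally restrict the kNN search with the form filters, e.g.
    # "filter": [{"term": {"gender": "female"}}],
}

response = es.search(index="cats", knn=knn, source=["summary", "photo"])
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["photo"])
```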
Try any of these images stored in the images/ folder: Abyssinian Dahlia - 72245105_3.jpg American shorthair Uni - 64635658_2.jpg Sugarplum - 72157682_4.jpeg Persian Sugar - 72528240_2.jpeg Hybrid search The most complex scenario in our application is when some text is entered into the description field. Here, we perform 3 different types of search and combine them into a hybrid search. First, we perform a lexical “match” query on the actual text input. We also create 2 knn queries: Using the model for the text embedding, we generate an embedding for the text input and perform a knn search on the summary embedding. Using the model for the image embedding, we generate another embedding for the text input and perform a knn search on the image embedding. I mentioned earlier that image models allow you to do not just an image-to-image search as we’ve seen in the vector search scenario above, but it also allows you to do a text-to-image search. This means that if I type “black cats” in the description, it will search for images that may contain or resemble black cats! We then utilize the Reciprocal Rank Fusion (RRF) retriever to effectively combine and rank the results from all three queries into a single cohesive result set. RRF is a method designed to merge multiple result sets, each with potentially different relevance indicators, into one unified set. Unlike simply joining the result arrays, RRF applies a specific formula to rank documents based on their positions in the individual result sets. This approach ensures that documents appearing in multiple queries are given higher importance, leading to improved relevance and quality of the final results. By using RRF, we avoid the complexities of manually tuning weights for each query and achieve a balanced integration of diverse search strategies. To further illustrate, the following is a table showing the ranking of the individual result sets when we search for “sisters”. Using the RRF formula (with the default ranking constant k=60), we can then derive the final score for each document. Sorting the final scores in descending order then gives us the final ranking of the documents. “Willow & Nova” is our top hit (cat)! Cat (document) Lexical ranking knn (on img_embedding) ranking knn (on summary_embedding) ranking Final Score Final Ranking Sugarplum 1 3 0.0322664585 2 Willow & Nova 2 1 1 0.0489159175 1 Zoe & Zara 2 0.01612903226 4 Sage 3 2 0.03200204813 3 Primrose 4 0.015625 5 Dahlia 5 0.01538461538 7 Luke & Leia 4 0.015625 6 Sugar & Garth 5 0.01538461538 8 Here are some other tests you can use for the description: “sisters” vs “siblings” “tuxedo” “black cats” with “American shorthair” breed filter “white” Conclusion Besides the obvious — **cats!** — Elasticats is a fantastic way to get to know Elasticsearch. It’s a fun and practical project that lets you explore search technologies while reminding us of the joy that technology can bring. As you dive deeper, you’ll also discover how Elasticsearch’s ability to handle vector embeddings can unlock new levels of search functionality. Whether it’s for cats, images, or other data types, Elastic makes search both powerful and enjoyable! Feel free to contribute to the project or fork the repository to customize it further. Happy searching, and may you find the cat of your dreams! 
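As a quick check on the RRF discussion above, the following plain-Python sketch (using the default rank constant k=60 mentioned earlier, independent of how Elasticsearch's RRF retriever is actually invoked) reproduces the final scores in the "sisters" ranking table:

```python
# Reproduces the RRF scores from the "sisters" ranking table using
# score = sum over result sets of 1 / (k + rank), with k = 60.
K = 60

# Per-document ranks in each individual result set (None = not returned).
ranks = {
    "Sugarplum":     {"lexical": 1, "img_knn": 3, "summary_knn": None},
    "Willow & Nova": {"lexical": 2, "img_knn": 1, "summary_knn": 1},
}

for doc, per_query_ranks in ranks.items():
    score = sum(1 / (K + r) for r in per_query_ranks.values() if r is not None)
    print(f"{doc}: {score:.10f}")
# Sugarplum:     0.0322664585
# Willow & Nova: 0.0489159175  -> the top hit, matching the table
```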
😸 Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Application architecture Generating and storing vector embeddings Executing different types of searches Lexical search Vector search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Hybrid search with multiple embeddings: A fun and furry search for cats! - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/hybrid-search-multiple-embeddings","meta_description":"A walkthrough of how to implement lexical, vector and hybrid search on multiple embeddings (text & image), including how to generate vector embeddings."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data from AWS S3 into Elastic Cloud - Part 3: Elastic S3 Connector Learn about different options to ingest data from AWS S3 into Elastic Cloud. This time we will focus on Elastic S3 Connector. Ingestion Integrations How To HL By: Hemendra Singh Lodhi On November 5, 2024 Part of Series How to ingest data from AWS S3 into Elastic Cloud Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. This is the third installment in a multi-part blog series exploring different options for ingesting data from AWS S3 into Elastic Cloud. 
Check out the other parts of the series: Part 1: Elastic Serverless Forwarder Part 2: Elastic Agent In this blog, we will learn about how to ingest data from AWS S3 using the Elastic S3 Native connector . Elastic Native connectors are available directly within your Elastic Cloud environment. Customers have the option to use self-managed connector clients that provide the highest degree of customization options and flexibility. Note 1: See the comparison between the options in Part 1 : Elastic Serverless Forwarder. Note 2: An Elastic Cloud deployment is a prerequisite to follow along the steps described below. Elastic Cloud Check the Part 1 of the blog series on how to get started with Elastic Cloud. Elastic S3 Native Connector This option for ingesting S3 data is quite different from the earlier ones in terms of use case. This time we will use Elastic S3 Native Connector which is available in Elastic Cloud. Connectors sync data from data sources and create searchable, read only replicas of the data source. They ingest the data and transform them into Elasticsearch documents. The Elastic S3 Native Connector is a good option to ingest data suitable for content search. For example, you can sync your company's private data (such as internal knowledge data and other files) in S3 buckets and perform a text-based search or can perform vector/semantic search through the use of large language models (LLMs). S3 connectors may not be a suitable option for ingesting Observability data such as logs & metrics as its main use case is ingesting content. Features Native connectors are available by default in Elastic Cloud and customers can use self-managed connectors too if they need further customization. Currently an Enterprise Search node (at least 1) must be configured in your cluster to use connectors. Basic & advanced sync rules are available for data filtering at source, such as specific bucket prefix. Synced data is always stored in content tier which is used for search related use cases. The connector offers default and custom options for data filtering, extracting and transforming content . Connector connection is a public egress (outbound) from Elastic Cloud and using Elastic Traffic Filter has no impact as Elastic Traffic Filter (Private Link) connection is is a one-way private egress from AWS. This means that data transfer will be over the public network (HTTPS) and Connector connection is independent of the traffic filter use. Connector scaling depends on the volume of data ingested from the source. The Enterprise Search node sizing depends on Elasticsearch sizing as well, and it is recommended to reach out to Elastic for large-scale data ingestion. Generally, a 2GB–4GB RAM size for Enterprise Search is sufficient for light to medium use cases. Cost will be for object storage in S3 only. There is no data transfer cost from S3 buckets to Elasticsearch when they are within the same AWS Region. There will be some data transfer cost for cross-region data sync i.e S3 bucket and Elasticsearch deployment are in different region. More on AWS Data transfer pricing here . Data flow of Elastic S3 Connector The Elastic S3 Connector syncs data between the S3 bucket and Elasticsearch, as per the below high level flow: Elastic S3 Connector is configured with S3 bucket information and credentials with the necessary permissions to connect to the bucket and sync data. Based on the sync rules , the connector will pull the data from the specified bucket(s). 
Ingest pipelines perform data parsing and filtering before indexing. When you create a connector, ent-search-generic-ingestion pipeline is available by default which performs most of the common data processing tasks. Custom ingest pipelines can be defined too for transforming data as needed. Connection is over public (HTTPS) network to AWS S3. Note 1: Content larger than 10MB will not be synced. If you are using self-managed connectors, you can use the self-managed local extraction service to handle larger binary files. Note 2: Original permissions at the S3-bucket level are not synced and all indexed documents are visible to the user with Elastic deployment access. Customers can manage documents permissions at Elastic using Role based access control , Document level security & Field level security More information on Connector architecture is available here . Set up Elastic S3 Native Connector 1. Create the S3 bucket, here named elastic-s3-native-connector : AWS Console -> S3 -> Create bucket. You may leave other settings as default or change as per the requirements. This bucket will store data to be synced to Elastic. The connector supports a variety of file types . For the purpose of ingestion testing we will upload a few pdf and text files. 2. Login to Kibana and navigate to Search-> Content -> Connectors. Search for S3 connector. Provide a name for connector, we have given aws-s3-data-connector : The Connector will show a warning message if there is no Enterprise Search node detected, similar to the below: Login to Elastic Cloud Console and edit your deployment. Under Enterprise Search, select the node size and zones and save: You can provide a different index name or the same as connector name. We are using the same index name: Provide AWS credentials and bucket details for elastic-s3-native-connector : When you update the configuration, the connector can display a validation error if there is some delay in updating AWS credentials and bucket names. You can provide the required information and ignore the error banner. This is a known thing as connectors communicate asynchronously with Kibana and for any configuration update there is some delay in communication between the connector and Kibana. The error will go away once the sync starts or a refresh starts after some time: 3. Once configuration is done successfully, click on the \"Sync\" button to perform the initial full content sync. For a recurring sync, configure the sync frequency under the Scheduling tab. It is disabled by default, so you'll need to toggle the \"Enable\" button to enable it. Once scheduling is done, the connector will run at the configured time and pull all the content from the S3 bucket. Elastic Native Connector can only sync files of size 10MB and less. Any files more than 10MB of size will be ignored and will not be synced. You either have to chunk the files accordingly or use a self-managed connector to customize the behavior: Search after AWS S3 data ingestion Once the data is ingested, you can validate directly from the connector under Documents tab: Also, Elasticsearch provides Search applications feature which enable users to build search-powered applications. You can create search applications based on your Elasticsearch indices, build queries using search templates, and easily preview your results directly in the Kibana Search UI. 
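Beyond the Kibana UI, a quick way to sanity-check the synced content is to query the connector's index directly. The sketch below is only illustrative: it assumes the index name aws-s3-data-connector chosen earlier and placeholder connection details, and it uses a query_string query so no assumptions about the connector's field names are needed.

```python
# Quick sanity check of the data synced by the S3 connector (illustrative only).
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")
index = "aws-s3-data-connector"

# How many documents were synced from the bucket?
print("documents synced:", es.count(index=index)["count"])

# Full-text search across all indexed text fields for any test phrase.
response = es.search(
    index=index,
    query={"query_string": {"query": "internal knowledge"}},
    size=3,
)
for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_id"])
```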
Enhance the search experience of AWS S3 ingested data with Playground Elastic provides Playground functionality to implement Retrieval Augmented Generation(RAG) based question answering with LLM to enhance the search experience on the ingested data. In our case, once the data is ingested from S3, you can configure Playground and use its chat interface, which takes your questions, retrieves the most relevant results from your Elasticsearch documents, and passes those documents to the LLM to generate tailored responses. Check out this great blog post from Han Xiang Choong showcasing the Playground feature using S3 data ingestion. Conclusion In this blog series, we have seen 3 different options Elasticsearch provides to sync and ingest data from AWS S3 into Elasticsearch deployments. Depending on the use case and requirements, customers can choose the best option for them and ingest data via Elastic Serverless Forwarder , Elastic Agent or the S3 Connector. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Elastic Cloud Elastic S3 Native Connector Features Data flow of Elastic S3 Connector Set up Elastic S3 Native Connector Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to ingest data from AWS S3 into Elastic Cloud - Part 3: Elastic S3 Connector - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ingest-aws-elastic-s3-connector","meta_description":"Learn how to sync and ingest data from AWS S3 into Elasticsearch deployments using the Elastic S3 Connector. 
"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Vector similarity techniques and scoring Explore vector similarity techniques and scoring in Elasticsearch, including L1 & L2 distance, cosine similarity, dot product similarity and max inner product similarity. Vector Database VC By: Valentin Crettaz On May 13, 2024 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. When the need for searching free text arises and Ctrl+F / Cmd+F don't cut it anymore, using a lexical search engine is usually the next logical choice that comes to mind. Lexical search engines excel at analyzing and tokenizing the text to be searched into terms that can be matched at search time, but they usually fall short when it comes to understanding and making sense of the true meaning of the text being indexed and searched. That's exactly where vector search engines shine. They can index the same text in such a way that it can be searched based on both the meaning it represents and its relationships with other concepts having similar or related meaning. In this blog, we will briefly touch upon how vectors are a great mathematical concept for conveying the meaning of text. We'll then dive deeper into the different similarity techniques supported by Elasticsearch when it comes to searching for neighboring vectors, i.e., searching for vectors carrying a similar meaning, and how to score them. What are vector embeddings? This article doesn't delve deeply into the intricacies of vector embeddings. If you're looking to explore this topic further or need a primer before continuing, we recommend checking out the following guide . In a nutshell, vector embeddings are obtained through a machine learning process (e.g. deep learning neural networks) that transforms any kind of unstructured input data (e.g., raw text, image, video, sound, etc.) into numerical data that carries their meaning and relationships. Different flavors of unstructured data require different kinds of machine learning models that have been trained to \"understand\" each type of data. Each vector locates a specific piece of data as a point in a multidimensional space and that location represents a set of features the model uses to characterize the data. The number of dimensions depends on the machine learning model, but they usually range from a couple hundred to a few thousand. For instance, OpenAI Embeddings models boasts 1536 dimensions, while Cohere Embeddings models can range from 382 to 4096 dimensions. The Elasticsearch dense_vector field type supports up to 4096 dimensions as of the latest release. The true feat of vector embeddings is that data points that share similar meaning are close together in the space. Another interesting aspect is that vector embeddings also help capture relationships between data points. How do we compare vectors? Knowing that unstructured data is sliced and diced by machine learning models into vector embeddings that capture the similarity of the data along a high number of dimensions, we now need to understand how the matching of those vectors works. It turns out that the answer is pretty simple. Vector embeddings that are close to one another represent semantically similar pieces of data. So, when we query a vector database, the search input (image, text, etc.) 
is first turned into a vector embeddings using the same machine learning model that has been used for indexing all the unstructured data, and the ultimate goal is to find the nearest neighboring vectors to that query vector. Hence, all we need to do is figure out how to measure the \"distance\" or \"similarity\" between the query vector and all the existing vectors indexed in the database - it's that simple. Distance, similarity and scoring Luckily for us, measuring the distance or similarity between two vectors is an easy problem to solve thanks to vector arithmetics. So, let’s look at the most popular distance and similarity functions that are supported by Elasticsearch. Warning, math ahead! Just before we dive in, let's have a quick look at scoring. Factually, Lucene only allows scores to be positive. All the distance and similarity functions that we will introduce shortly yield a measure of how close or similar two vectors are, but those raw figures are rarely fit to be used as score since they can be negative. For this reason, the final score needs to be derived from the distance or similarity value in a way that ensures the score will be positive and a bigger score corresponds to a higher ranking (i.e. to closer vectors). L1 distance The L1 distance, also called the Manhattan distance, of two vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B is measured by summing up the pairwise absolute difference of all their elements. Obviously, the smaller the distance δ L 1 \\delta_{L1} δ L 1 ​ , the closer the two vectors are. The L1 distance formula (1) is pretty simple, as can be seen below: δ L 1 ( A ⃗ , B ⃗ ) = ∑ 1 ≤ i ≤ n ∣ A i − B i ∣ (1) \\tag{1} \\delta_{L1}(\\vec{A}, \\vec{B}) = \\sum_{\\mathclap{1\\le i\\le n}} \\vert A_i-B_i \\vert δ L 1 ​ ( A , B ) = 1 ≤ i ≤ n ∑ ​ ∣ A i ​ − B i ​ ∣ ( 1 ) Visually, the L1 distance can be illustrated as shown in the image below (in red): Computing the L1 distance of the following two vectors A ⃗ = ( 1 2 ) \\vec{A} = \\binom{1}{2} A = ( 2 1 ​ ) and B ⃗ = ( 2 0.5 ) \\vec{B} = \\binom{2}{0.5} B = ( 0.5 2 ​ ) would yield ∣ 1 – 2 ∣ + ∣ 2 – 0.5 ∣ = 2.5 \\vert 1–2 \\vert + \\vert 2–0.5 \\vert = 2.5 ∣1–2∣ + ∣2–0.5∣ = 2.5 Important: It is worth noting that the L1 distance function is only supported for exact vector search (aka brute force search) using the script_score DSL query, but not for approximate kNN search using the knn search option or knn DSL query . L2 distance The L2 distance, also called the Euclidean distance, of two vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B is measured by first summing up the square of the pairwise difference of all their elements and then taking the square root of the result. It’s basically the shortest path between two points. Similarly to L1, the smaller the distance δ L 2 \\delta_{L2} δ L 2 ​ , the closer the two vectors are: δ L 2 ( A ⃗ , B ⃗ ) = ∑ 1 ≤ i ≤ n ( A i − B i ) 2 (2) \\tag{2} \\delta_{L2}(\\vec{A},\\vec{B}) = \\sqrt{\\sum_{\\mathclap{1\\le i\\le n}} ( A_i-B_i )^2 } δ L 2 ​ ( A , B ) = 1 ≤ i ≤ n ∑ ​ ( A i ​ − B i ​ ) 2 ​ ( 2 ) The L2 distance is shown in red in the image below: Let’s reuse the same two sample vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B as we used for the δ L 1 \\delta_{L1} δ L 1 ​ distance, and we can now compute the δ L 2 \\delta_{L2} δ L 2 ​ distance as ( 1 − 2 ) 2 + ( 2 − 0.5 ) 2 = 3.25 ≊ 1.803 \\sqrt{(1-2)^2 + (2-0.5)^2} = \\sqrt{3.25} \\approxeq 1.803 ( 1 − 2 ) 2 + ( 2 − 0.5 ) 2 ​ = 3.25 ​ ≊ 1.803 . 
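If you want to check these numbers yourself, here is a small NumPy sketch (not Elasticsearch code) that recomputes the L1 and L2 distances for the two sample vectors:

```python
# Recomputes the L1 (Manhattan) and L2 (Euclidean) distances for the sample
# vectors A = (1, 2) and B = (2, 0.5) used throughout this section.
import numpy as np

A = np.array([1.0, 2.0])
B = np.array([2.0, 0.5])

l1 = np.sum(np.abs(A - B))          # |1-2| + |2-0.5| = 2.5
l2 = np.sqrt(np.sum((A - B) ** 2))  # sqrt(1 + 2.25)  ~ 1.803

print(f"L1 distance: {l1}")         # 2.5
print(f"L2 distance: {l2:.3f}")     # 1.803
```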
As far as scoring goes, the smaller the distance between two vectors, the closer (i.e., the more similar) they are. So in order to derive a score we need to invert the distance measure, so that the smallest distance yields the highest score. The way the score is computed when using the L2 distance looks as shown in formula (3) below: _ s c o r e L 2 ( A ⃗ , B ⃗ ) = 1 1 + δ L 2 ( A ⃗ , B ⃗ ) 2 (3) \\tag{3} \\_score_{L2}(\\vec{A},\\vec{B}) = \\frac{1}{1 + \\delta_{L2}(\\vec{A}, \\vec{B})^2} _ scor e L 2 ​ ( A , B ) = 1 + δ L 2 ​ ( A , B ) 2 1 ​ ( 3 ) Reusing the sample vectors from the earlier example, their score would be 1 4.25 ≊ 0.2352 \\frac{1}{4.25} \\approxeq 0.2352 4.25 1 ​ ≊ 0.2352 . Two vectors that are very close to one another will near a score of 1, while the score of two vectors that are very far from one another will tend towards 0. Wrapping up on L1 and L2 distance functions, a good analogy to compare them is to think about A and B as being two buildings in Manhattan, NYC. A taxi going from A to B would have to drive along the L1 path (streets and avenues), while a bird would probably use the L2 path (straight line). Cosine similarity In contrast to L1 and L2, cosine similarity does not measure the distance between two vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B , but rather their relative angle, i.e., whether they are both pointing in roughly the same direction. The higher the similarity s c o s s_{cos} s cos ​ , the smaller the angle α \\alpha α between the two vectors, and hence, the \"closer\" they are and the \"similar\" their conveyed meaning are. To illustrate this, let's think of two people out in the wild looking in different directions. In the figure below, the person in blue looks in the direction symbolized by vector A ⃗ \\vec{A} A and the person in red in the direction of vector B ⃗ \\vec{B} B . The more they will direct their eyesight towards the same direction (i.e., the closer their vectors get), the more their field of view symbolized by the blue and red areas will overlap. How much their field of view overlap is their cosine similarity. However, note that person B looks farther away than person A (i.e., vector B ⃗ \\vec{B} B is longer). Person B might be looking at a mountain far away on the horizon, while person A could be looking at a nearby tree. For cosine similarity, that doesn't play any role as it is only about the angle. Now let's compute that cosine similarity. The formula (4) is pretty simple, where the numerator consists of the dot product of both vectors and the denominator contains the product of their magnitude (i.e., their length): s c o s ( A ⃗ , B ⃗ ) = A ⃗ ⋅ B ⃗ ∥ A ⃗ ∥ × ∥ B ⃗ ∥ (4) \\tag{4} s_{cos}(\\vec{A}, \\vec{B}) = \\frac{\\vec{A} \\cdot \\vec{B}}{\\Vert \\vec{A} \\Vert \\times \\Vert \\vec{B} \\Vert} s cos ​ ( A , B ) = ∥ A ∥ × ∥ B ∥ A ⋅ B ​ ( 4 ) The cosine similarity between A ⃗ \\vec{A} A and B ⃗ \\vec{B} B is shown in the image below as a measure of the angle between them (in red): Let's take a quick detour in order to explain what these cosine similarity values mean concretely. As can be seen in the image below depicting the cosine function, values always oscillate in the [ − 1 , 1 ] [-1, 1] [ − 1 , 1 ] interval. Remember that in order for two vectors to be considered similar, their angle must be as acute as possible, ideally nearing a 0 ° 0° 0° angle, which would boil down to a perfect similarity of 1 1 1 . In other words, when vectors are... ... 
close to one another, the cosine of their angle nears 1 1 1 (i.e., close to 0 ° 0° 0° ) ... unrelated , the cosine of their angle nears 0 0 0 (i.e., close to 90 ° 90° 90° ) ... opposite , the cosine of their angle nears − 1 -1 − 1 (i.e., close to 180 ° 180° 180° ) Now that we know how to compute the cosine similarity between two vectors and we have a good idea of how to interpret the resulting value, we can reuse the same sample vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B and compute their cosine similarity using the formula (4) we saw earlier. s c o s ( A ⃗ , B ⃗ ) = ( 1 ⋅ 2 ) + ( 2 ⋅ 0.5 ) ( 1 2 + 2 2 ) × ( 2 2 + 0. 5 2 ) ≊ 3 4.609 ≊ 0.650791 s_{cos}(\\vec{A}, \\vec{B}) = \\frac{(1 \\cdot 2) + (2 \\cdot 0.5)}{\\sqrt{(1^2 + 2^2)} \\times \\sqrt{(2^2 + 0.5^2)}} \\approxeq \\frac{3}{4.609} \\approxeq 0.650791 s cos ​ ( A , B ) = ( 1 2 + 2 2 ) ​ × ( 2 2 + 0. 5 2 ) ​ ( 1 ⋅ 2 ) + ( 2 ⋅ 0.5 ) ​ ≊ 4.609 3 ​ ≊ 0.650791 We get a cosine similarity of 0.650791 0.650791 0.650791 , which is closer to 1 1 1 than to 0 0 0 , meaning that the two vectors are somewhat similar , i.e., not perfectly similar, but not completely unrelated either, and certainly do not carry opposite meaning. In order to derive a positive score from any cosine similarity value, we need to use the following formula (5), which transforms cosine similarity values oscillating within the [ − 1 , 1 ] [-1, 1] [ − 1 , 1 ] interval into scores in the [ 0 , 1 ] [0, 1] [ 0 , 1 ] interval: _ s c o r e c o s ( A ⃗ , B ⃗ ) = 1 + s c o s ( A ⃗ , B ⃗ ) 2 (5) \\tag{5} \\_score_{cos}(\\vec{A},\\vec{B}) = \\frac{1 + s_{cos}(\\vec{A}, \\vec{B})}{2} _ scor e cos ​ ( A , B ) = 2 1 + s cos ​ ( A , B ) ​ ( 5 ) The score for the sample vectors A ⃗ \\vec{A} A and B ⃗ \\vec{B} B would thus be: 1 + 0.650791 2 ≊ 0.8253 \\frac{1 + 0.650791}{2} \\approxeq 0.8253 2 1 + 0.650791 ​ ≊ 0.8253 . Dot product similarity One drawback of cosine similarity is that it only takes into account the angle between two vectors but not their magnitude, which means that if two vectors point roughly in the same direction but one is much longer than the other, both will still be considered similar. Dot product similarity, also called scalar or inner product similarity, improves that by taking into account both the angle and the magnitude of the vectors, which provides for a more accurate similarity metric. In order to make the magnitude of the vectors irrelevant, dot product similarity requires that the vectors be normalized first, so we are ultimately only comparing vectors of unit length 1. Let's try to illustrate this again with the same two people as before, but this time, we put them in the middle of a circular room, so that their sight reach is exactly the same (i.e., the radius of the room). Similarly to cosine similarity, the more they turn towards the same direction (i.e., the closer their vectors get), the more their field of view will overlap. However, in contrary to cosine similarity, both vectors have the same length and both areas have the same surface, which means that the two people look at exactly the same picture located at the same distance. How well those two areas overlap denotes their dot product similarity. Before introducing the dot product similarity formula, let's quickly see how a vector can be normalized. It's pretty simple and can be done in two trivial steps: compute the magnitude of the vector divide each component by the magnitude obtained in 1. As an example, let's take vector A ⃗ = ( 1 2 ) \\vec{A} = \\binom{1}{2} A = ( 2 1 ​ ) . 
We can compute its magnitude ∥ A ⃗ ∥ \\Vert \\vec{A} \\Vert ∥ A ∥ as we have seen earlier when reviewing the cosine similarity, i.e. 1 2 + 2 2 = 5 \\sqrt{1^2 + 2^2} = \\sqrt{5} 1 2 + 2 2 ​ = 5 ​ . Then, dividing each component of the vector by its magnitude, we obtain the following normalized vector C ⃗ \\vec{C} C : A n o r m ⃗ = C ⃗ = ( 1 5 2 5 ) ≊ ( 0.44 0.89 ) \\vec{A_{norm}} = \\vec{C} = \\dbinom{\\frac{1}{\\sqrt{5}}}{\\frac{2}{\\sqrt{5}}} \\approxeq \\dbinom{0.44}{0.89} A n or m ​ ​ = C = ( 5 ​ 2 ​ 5 ​ 1 ​ ​ ) ≊ ( 0.89 0.44 ​ ) Going through the same process for the second vector B ⃗ = ( 2 0.5 ) \\vec{B} = \\binom{2}{0.5} B = ( 0.5 2 ​ ) would yield the following normalized vector D ⃗ \\vec{D} D : B n o r m ⃗ = D ⃗ = ( 2 4.25 0.5 4.25 ) ≊ ( 0.97 0.24 ) \\vec{B_{norm}} = \\vec{D} = \\dbinom{\\frac{2}{\\sqrt{4.25}}}{\\frac{0.5}{\\sqrt{4.25}}} \\approxeq \\dbinom{0.97}{0.24} B n or m ​ ​ = D = ( 4.25 ​ 0.5 ​ 4.25 ​ 2 ​ ​ ) ≊ ( 0.24 0.97 ​ ) In order to derive the dot product similarity formula, we can compute the cosine similarity between our normalized vectors C ⃗ \\vec{C} C and D ⃗ \\vec{D} D using formula (4), as shown below: s c o s ( C ⃗ , D ⃗ ) = C ⃗ ⋅ D ⃗ 1 × 1 s_{cos}(\\vec{C}, \\vec{D}) = \\frac{\\vec{C} \\cdot \\vec{D}}{1 \\times 1} s cos ​ ( C , D ) = 1 × 1 C ⋅ D ​ And since the magnitude of both normalized vectors is now 1 1 1 , the dot product similarity formula (6) simply becomes... you guessed it, a dot product of both normalized vectors: s d o t ( C ⃗ , D ⃗ ) = C ⃗ ⋅ D ⃗ (6) \\tag{6} s_{dot}(\\vec{C}, \\vec{D}) = \\vec{C} \\cdot \\vec{D} s d o t ​ ( C , D ) = C ⋅ D ( 6 ) In the image below, we show the normalized vectors C ⃗ \\vec{C} C and D ⃗ \\vec{D} D and we can illustrate their dot product similarity as the projection of one vector onto the other (in red). Using our new formula (6), we can compute the dot product similarity of our two normalized vectors, which unsurprisingly yields the exact same similarity value as the cosine one: s d o t ( C ⃗ , D ⃗ ) = ( 1 5 ⋅ 2 4.25 ) + ( 2 5 ⋅ 0.5 4.25 ) ≊ 0.650791 s_{dot}(\\vec{C}, \\vec{D}) = \\Big(\\frac{1}{\\sqrt{5}} \\cdot \\frac{2}{\\sqrt{4.25}}\\Big) + \\Big(\\frac{2}{\\sqrt{5}} \\cdot \\frac{0.5}{\\sqrt{4.25}}\\Big) \\approxeq 0.650791 s d o t ​ ( C , D ) = ( 5 ​ 1 ​ ⋅ 4.25 ​ 2 ​ ) + ( 5 ​ 2 ​ ⋅ 4.25 ​ 0.5 ​ ) ≊ 0.650791 When leveraging dot product similarity, the score is computed differently depending on whether the vectors contain float or byte values. In the former case, the score is computed the same way as for cosine similarity using formula (7) below: _ s c o r e d o t − f l o a t ( C ⃗ , D ⃗ ) = 1 + s d o t ( C ⃗ , D ⃗ ) 2 (7) \\tag{7} \\_score_{dot-float}(\\vec{C},\\vec{D}) = \\frac{1 + s_{dot}(\\vec{C}, \\vec{D})}{2} _ scor e d o t − f l o a t ​ ( C , D ) = 2 1 + s d o t ​ ( C , D ) ​ ( 7 ) However, when the vector is composed of byte values, the scoring is computed a bit differently as shown in formula (8) below, where d i m s dims d im s is the number of dimensions of the vector: _ s c o r e d o t − b y t e ( C ⃗ , D ⃗ ) = 0.5 + s d o t ( C ⃗ , D ⃗ ) 32768 × d i m s (8) \\tag{8} \\_score_{dot-byte}(\\vec{C},\\vec{D}) = \\frac{0.5 + s_{dot}(\\vec{C}, \\vec{D})}{32768 \\times dims} _ scor e d o t − b y t e ​ ( C , D ) = 32768 × d im s 0.5 + s d o t ​ ( C , D ) ​ ( 8 ) Also, one constraint in order to yield accurate scores is that all vectors, including the query vector, must have the same length, but not necessarily 1. 
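Before moving on, here is a short NumPy sketch (again purely illustrative) that reproduces the cosine similarity, the dot product similarity on the normalized vectors, and the resulting score for float vectors from formulas (4), (6) and (7):

```python
# Verifies the cosine and dot product similarity values computed above for
# A = (1, 2) and B = (2, 0.5), plus the float-vector score derived from them.
import numpy as np

A = np.array([1.0, 2.0])
B = np.array([2.0, 0.5])

# Cosine similarity, formula (4): dot product over the product of magnitudes.
cos_sim = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))

# Dot product similarity, formula (6): plain dot product of normalized vectors.
C = A / np.linalg.norm(A)
D = B / np.linalg.norm(B)
dot_sim = np.dot(C, D)

# Score for float vectors, formulas (5) and (7): map [-1, 1] into [0, 1].
score = (1 + dot_sim) / 2

print(f"cosine similarity:      {cos_sim:.6f}")  # 0.650791
print(f"dot product similarity: {dot_sim:.6f}")  # 0.650791 (same, as expected)
print(f"_score (float vectors): {score:.4f}")    # ~0.825, matching the score above
```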
Max inner product similarity Since release 8.11, there is a new similarity function that is less constrained than the dot product similarity, in that the vectors don't need to be normalized. The main reason for this is explained at length in the following article , but to sum it up very briefly, certain datasets are not very well adapted to having their vectors normalized (e.g., Cohere embeddings ) and doing so can cause relevancy issues. The formula for computing max inner product similarity is exactly the same as the dot product one (6). What changes is the way the score is computed by scaling the max inner product similarity using a piecewise function whose formula depends on whether the similarity is positive or negative, as shown in formula (9) below: _ s c o r e m i p ( A ⃗ , B ⃗ ) = { 1 1 − s d o t ( A ⃗ , B ⃗ ) if s d o t < 0 1 + s d o t ( A ⃗ , B ⃗ ) if s d o t ⩾ 0 (9) \\tag{9} \\_score_{mip}(\\vec{A},\\vec{B}) = \\begin{cases} \\Large \\frac{1}{1 - s_{dot}(\\vec{A}, \\vec{B})} &\\text{if } s_{dot} < 0 \\\\ 1 + s_{dot}(\\vec{A}, \\vec{B}) &\\text{if } s_{dot} \\geqslant 0 \\end{cases} _ scor e mi p ​ ( A , B ) = ⎩ ⎨ ⎧ ​ 1 − s d o t ​ ( A , B ) 1 ​ 1 + s d o t ​ ( A , B ) ​ if s d o t ​ < 0 if s d o t ​ ⩾ 0 ​ ( 9 ) What this piecewise function does is that it scales all negative max inner product similarity values in the [ 0 , 1 [ [0, 1[ [ 0 , 1 [ interval and all positive values in the [ 1 , ∞ [ [1, \\infty[ [ 1 , ∞ [ interval. In summary That was quite a ride, mathematically speaking, but here are a few takeaways that you might find useful. Which similarity function you can use, ultimately depends on whether your vector embeddings are normalized or not. If your vectors are already normalized or if your data set is agnostic to vector normalization (i.e., relevancy will not suffer), you can go ahead and normalize your vectors and use dot product similarity, as it is much faster to compute than the cosine one since there is no need to compute the length of each vector. When comparing millions of vectors, those computations can add up quite a lot. If your vectors are not normalized, then you have two options: use cosine similarity if normalizing your vectors is not an option use the new max inner product similarity if you want the magnitude of your vectors to contribute to scoring because they do carry meaning (e.g., Cohere embeddings) At this point, computing the distance or similarity between vector embeddings and how to derive their scores should make sense to you. We hope you found this article useful. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
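For completeness, the piecewise scaling in formula (9) for max inner product similarity can be sketched as a small helper. This simply mirrors the formula above and is not tied to any particular index or implementation.

```python
def max_inner_product_score(dot):
    """Formula (9): negative dot products are squashed into [0, 1),
    non-negative ones are shifted into [1, inf)."""
    if dot < 0:
        return 1 / (1 - dot)
    return 1 + dot

print(max_inner_product_score(-2.0))  # ~0.333: very dissimilar pairs stay below 1
print(max_inner_product_score(0.65))  # 1.65: positive similarities score above 1
```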
TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What are vector embeddings? How do we compare vectors? Distance, similarity and scoring L1 distance L2 distance Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Vector similarity techniques and scoring - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vector-similarity-techniques-and-scoring","meta_description":"Explore vector similarity techniques and scoring in Elasticsearch, including L1 & L2 distance, cosine similarity, dot product similarity and max inner product similarity."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2 Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2. ML Research TV QH VK By: Thomas Veasey , Quentin Herreros and Valeriy Khakhutskyy On October 17, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our last post we introduced ELSER v2, discussed its zero shot relevance and the inference performance improvements we made. This blog focuses on how we reduced its retrieval costs. It has been noted that retrieval can be slow when using scores computed from learned sparse representations, such as ELSER. Slow is a relative term and in this context we mean slow when compared to BM25 scored retrieval. There are two principle reasons for this: The query expansion means we're usually matching many more terms than are present in user supplied keyword searches. The weight distribution for BM25 is particularly well suited to query optimisation. The first bottleneck can be tackled at train time, albeit with a relevance retrieval cost tradeoff. There is a regularizer term in the training loss which allows one to penalize using more terms in the query expansion. There are also gains to be had by performing better model selection. Retrieval cost aware training When training any model it is sensible to keep the best one as optimisation progresses. Typically the quality is measured using the training loss function evaluated on a hold-out, or validation, dataset. 
We had found this metric alone did not correlate as well as we liked with zero-shot relevance; so we were already measuring NDCG@10 on several small datasets from the BEIR suite to help decide which model to retain. This allows us to measure other aspects of retrieval behavior. In particular, we compute the retrieval cost using the number of weight multiplications performed on average to find the top-k matches for every query. We found that there is quite significant variation between the retrieval cost for relatively small variation in retrieval quality and used this information to identify Pareto optimal models. This was done for various choices of our regularization hyperparameters at different points along their learning trajectories. The figure below shows a scatter plot of the candidate models we considered characterized by their relevance and cost, together with the choice we made for ELSER v2. In the end we sacrificed around 1% in relevance for around a 25% reduction in the retrieval cost. Performing model selection for ELSER v2 via relevance retrieval cost multi-objective optimization Whilst this is a nice win, the figure also shows there is only so much it is possible to achieve when making the trade off at train time. At least without significantly impacting relevance. As we discussed before , with ELSER our goal is to train a model with excellent zero-shot relevance. Therefore, if we make the tradeoff during training we make it in a global setting, without knowing anything about the specific corpus where the model will be applied. To understand how to overcome the dichotomy between relevance and retrieval cost we need to study the token statistics in a specific corpus. At the same time, it is also useful to understand why BM25 scoring is so efficient for retrieval. Optimizing ELSER queries The BM25 score comprises two factors, one which relates to its frequency in each document and one which relates to the frequency of each query term in the corpus. Focusing our attention on second factor, the score contribution of a term t t t is weighted by its inverse document frequency (IDF) or log ⁡ ( 1 − f t f t + 1 ) \\log\\left(\\frac{1 - f_t}{f_t} + 1\\right) lo g ( f t ​ 1 − f t ​ ​ + 1 ) . Here f t = n t + 0.5 N f_t=\\frac{n_t+0.5}{N} f t ​ = N n t ​ + 0.5 ​ and n t n_t n t ​ and N N N denote the matching document count and total number of documents, respectively. So f t f_t f t ​ is just the proportion of the documents which contain that term, modulo a small correction which is negligible for large corpuses. It is clear that IDF is a monotonic decreasing function of the frequency. Coupled with block-max WAND , this allows retrieval to skip many non-competitive documents even if the query includes frequent terms. Specifically, in any given block one might expect some documents to contain frequent terms, but with BM25 scoring they are unlikely to be competitive with the best matches for the query. The figure below shows statistics related to the top tokens generated by ELSER v2 for the NFCorpus dataset. This is one of the datasets used to evaluate retrieval in the BEIR suite and comprises queries and documents related to nutrition. The token frequencies, expressed as a percentage of the documents which contain that token, are on the right hand axis and the corresponding IDF and the average ELSER v2 weight for the tokens are on the left hand axis. If one examines the top tokens they're what we might expect given the corpus content: things like “supplement”, “nutritional”, “diet”, etc. 
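To make the IDF expression above concrete, here is a direct transcription into Python, assuming the natural logarithm; the choice of base does not change the monotonic decrease described in the text.

```python
import math

def bm25_idf(n_t, N):
    """IDF as given above: log((1 - f_t) / f_t + 1), with f_t = (n_t + 0.5) / N."""
    f_t = (n_t + 0.5) / N
    return math.log((1 - f_t) / f_t + 1)

# IDF decreases monotonically with a term's document frequency, which is what
# lets block-max WAND skip blocks dominated by frequent, low-value terms.
print(bm25_idf(n_t=10, N=100_000))      # rare term -> large IDF (~9.2)
print(bm25_idf(n_t=50_000, N=100_000))  # term in half the docs -> small IDF (~0.7)
```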
Queries expand to a similar set of terms. This underlines that even if tokens are well distributed in the training corpus as a whole, they can end up concentrated when we examine a specific corpus. Furthermore, we see that unlike BM25 the weight is largely independent of token frequency and this makes block-max WAND ineffective. The outcome is retrieval is significantly more expensive than BM25. Average ELSER v2 weights and IDF for the top 500 tokens in the document expansions of NFCorpus together with the percentage of documents in which they appear Taking a step back, this suggests we reconsider token importance in light of the corpus subject matter. In a general setting, tokens related to nutrition may be highly informative. However, for a corpus about nutrition they are less so. This in fact is the underpinning of information theoretic approaches to retrieval. Roughly speaking we have two measures of the token information content for a specific query and corpus: its assigned weight - which is the natural analogue of the term frequency term used in BM25 - and the token frequency in the corpus as a whole - which we disregard when we score matches using the product of token weights. This suggests the following simple strategy to accelerate queries with hopefully little impact on retrieval quality: Drop frequent tokens altogether provided they are not particularly important for the query in the retrieval phase, Gather slightly more matches than required, and Rerank using the full set of tokens. We can calculate the expected fraction of documents a token will be present in, assuming they all occur with equal probability. This is just the ratio N T N ∣ T ∣ \\frac{N_T}{N|T|} N ∣ T ∣ N T ​ ​ where N T N_T N T ​ is the total number of tokens in the corpus, N N N is the number of documents in the corpus and ∣ T ∣ |T| ∣ T ∣ is the vocabulary size, which is 30522. Any token that occurs in a significantly greater fraction of documents than this is frequent for the corpus. We found that pruning tokens which are 5 times more frequent than expected was an effective relevance retrieval cost tradeoff. We fixed the count of documents reranked using the full token set to 5 times the required set, so 50 for NDCG@10. We found we achieved more consistent results setting the weight threshold for which to retain tokens as a fraction of the maximum weight of any token in the query expansion. For the results below we retain all tokens whose weight is greater than or equal to 0.4 × “max token weight for the query”. This threshold was chosen so NDCG@10 was unchanged on NFCorpus. However, the same parameterization worked for the other 13 test datasets we tested, which strongly suggests that it generalizes well. The table below shows the change in NDCG@10 relative to ELSER v2 with exact retrieval together with the retrieval cost relative to ELSER v1 with exact retrieval using this strategy. Note that the same pruning strategy can be applied to any learned sparse representation. However, we view that the key questions to answer are: Does the approach lead to any degradation in relevance compared to using exact scoring? What improvement in the retrieval latency might one expect using ELSER v2 and query optimization compared to the performance of the text_expansion query to date? In summary, we achieved a very small improvement(!) of 0.07% in average NDCG@10 when we used the optimized query compared to the exact query and an average 3.4 times speedup. Furthermore, this speedup is measured without block-max WAND. 
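As an aside before the block-max WAND results that follow, the pruning rule just described can be sketched as below. The data structures and names are hypothetical; only the thresholds (5x the expected frequency, 0.4x the maximum query weight, vocabulary size 30522) come from the text above.

```python
def split_tokens(query_tokens, doc_freq, total_tokens, num_docs,
                 vocab_size=30522, freq_ratio=5.0, weight_ratio=0.4):
    """Partition a query expansion into 'kept' tokens (used for retrieval) and
    'dropped' tokens (only used to rerank the top matches afterwards).

    query_tokens: dict of token -> ELSER weight for the query expansion.
    doc_freq:     dict of token -> fraction of documents containing the token.
    """
    expected_fraction = total_tokens / (num_docs * vocab_size)
    max_weight = max(query_tokens.values())
    kept, dropped = {}, {}
    for token, weight in query_tokens.items():
        too_frequent = doc_freq.get(token, 0.0) > freq_ratio * expected_fraction
        important = weight >= weight_ratio * max_weight
        if too_frequent and not important:
            dropped[token] = weight
        else:
            kept[token] = weight
    return kept, dropped
```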
As we expected, the optimization works particularly well together with block-max WAND. On a larger corpus (8.8M passages) we saw an 8.4 times speedup with block-max WAND enabled. Measuring the relevance and latency impact of using token pruning followed by reranking. The relevance is measured by percentage change in NDCG@10 for exact retrieval with ELSER v2 and the speedup is measured with respect to exact retrieval with ELSER v1 An intriguing aspect of these results is that on average we see a small relevance improvement. Together with the fact that we previously showed carefully tuned combinations of ELSER v1 and BM25 scores yield very significant relevance improvements, it strongly suggests there are benefits available for relevance as well as for retrieval cost by making better use of corpus token statistics. Ideally, one would re-architect the model and train the query expansion to make use of both token weights and their frequencies. This is something we are actively researching. Implementing with an Elasticsearch query As of Elasticsearch 8.13.0, we have integrated this optimization in the text_expansion query via token pruning so it is automatically applied in the retrieval phase for the text_expansion query. For versions of Elasticsearch before 8.13.0, it is possible to achieve the same results using existing Elasticsearch query DSL given an analysis of the token frequencies and their weights. Tokens are stored in the _source field so it is possible to paginate through the documents and accumulate token frequencies to find out which tokens to exclude. Given an inference response one can partition the tokens into a “kept” and “dropped” set. The kept set is used to score the match in a should query. The dropped set is used in a rescore query on a window of the top 50 docs. Using query_weight and rescore_query_weight both equal to one simply sums the two scores so recovers the score using the full set of tokens. The query together with some explanation is shown below. Conclusion In these last two posts in our series we introduced the second version of the Elastic Learned Sparse EncodeR. So what benefits does it bring? With some improvements to our training data set and regularizer we were able to obtain roughly a 2% improvement on our benchmark of zero-shot relevance. At the same time we've also made significant improvements to inference performance and retrieval latency. We traded a small degradation (of a little less than 1%) in relevance for a large improvement (of over 25%) in the retrieval latency when performing model selection in the training loop. We also identified a simple token pruning strategy and verified it had no impact on retrieval quality. Together these sped up retrieval by between 2 and 5 times when compared to ELSER v1 on our benchmark suite. Token pruning can currently be implemented using Elasticsearch DSL, but we're also working towards performing it automatically in the text_expansion query. To improve inference performance we prepared a quantized version of the model for x86 architecture and upgraded the libtorch backend we use. We found that these sped up inference by between 1.7 and 2.2 times depending on the text length. By using hybrid dynamic quantisation, based on an analysis of layer sensitivity to quantisation, we were able to achieve this with minimal loss in relevance. We believe that ELSER v2 represents a step change in performance, so encourage you to give it a try! 
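As an aside, here is one possible reconstruction of the kept/dropped query structure described in the "Implementing with an Elasticsearch query" section above, assuming the expansion tokens are indexed under a hypothetical rank_features field called ml.tokens. The article's actual query is not reproduced in this text, so treat this as a sketch of the idea rather than the exact DSL.

```python
def token_clauses(tokens):
    # One rank_feature clause per token, boosted by that token's weight.
    return [
        {"rank_feature": {"field": f"ml.tokens.{token}", "boost": weight, "linear": {}}}
        for token, weight in tokens.items()
    ]

def pruned_search_body(kept, dropped, window_size=50):
    """Kept tokens score the initial match; dropped tokens only contribute in a
    rescore pass over the top `window_size` hits. Equal query_weight and
    rescore_query_weight simply sum the two scores, recovering the full-token score."""
    return {
        "query": {"bool": {"should": token_clauses(kept)}},
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {"bool": {"should": token_clauses(dropped)}},
                "query_weight": 1,
                "rescore_query_weight": 1,
            },
        },
    }
```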
This is an exciting time for information retrieval, which is being reshaped by rapid advances in NLP. We hope you've enjoyed this blog series in which we've tried to give a flavor of some of this field. This is not the end, rather the end of the beginning for us. We're already working on various improvements to retrieval in Elasticsearch and particularly in end-to-end optimisation of retrieval and generation pipelines. So stay tuned! The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Part 1: Steps to improve search relevance Part 2: Benchmarking passage retrieval Part 3: Introducing Elastic Learned Sparse Encoder, our new retrieval model Part 4: Hybrid Retrieval Part 5: Optimizing inference for ELSER v2 Part 6: Optimizing retrieval with ELSER v2 Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili ML Research December 19, 2024 Understanding optimized scalar quantization In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization. TV By: Thomas Veasey ML Research December 10, 2024 cRank it up! - Introducing the Elastic Rerank model (in Technical Preview) Get started in minutes with the Elastic Rerank model: powerful semantic search capabilities, with no required reindexing, provides flexibility and control over costs; high relevance, top performance, and efficiency for text search. ST By: Shubha Anjur Tupil Jump to Retrieval cost aware training Optimizing ELSER queries Implementing with an Elasticsearch query Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2","meta_description":"Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open inference API adds support for Azure OpenAI chat completions Azure OpenAI chat completions is available via the Elasticsearch inference API. Learn how to use this feature to answer questions. Integrations Generative AI How To TG By: Tim Grein On May 22, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. We’ve integrated Azure OpenAI chat completions in the inference API, which allows our customers to build powerful GenAI applications based on chat completion using large language models like GPT-4 Azure and Elasticsearch developers can utilize the unique capabilities of the Elasticsearch vector database and the Azure AI ecosystem to power unique GenAI applications with the model of their choice. This blog quickly goes over the catalog of supported providers in the open inference API and explains how to use Azure’s OpenAI chat completions to answer questions through an example. The inference API is growing…fast! We’re heavily extending the catalog of supported providers in the open inference API. Check out some of our latest blog posts on Elastic Search labs to learn more about recent integrations around embeddings, completions and reranking: Elasticsearch open inference API adds support for Azure Open AI Studio Elasticsearch open inference API adds support for Azure Open AI embeddings Elasticsearch open inference API adds support for OpenAI chat completions Elasticsearch open Inference API adds support for Cohere’s Rerank 3 model Elasticsearch open inference API adds support for Cohere Embeddings ...more to come! Azure OpenAI chat completions support is available through the open inference API in our stateless offering on Elastic Cloud. It’ll also be soon available to everyone in an upcoming versioned Elasticsearch release. This also complements the capability to use the Elasticsearch vector database in the Azure OpenAI service. Using Azure’s OpenAI chat completions to answer questions In my last blog post about OpenAI chat completions we’ve learned how to summarize text using OpenAI’s chat completions. In this guide we’ll use Azure OpenAI chat completions to answer questions during ingestion to have answers ready ahead of searching. Make sure you have your Azure OpenAI api key, deployment id and resource name ready by creating a free Azure account first and setting up a model suited for chat completions. You can follow Azure's OpenAI Service GPT quickstart guide to get a model up and running. In the following example we’ve used `gpt-4` with the version `2024-02-01`. You can read more about supported models and versions here . In Kibana, you'll have access to a console for you to input these next steps in Elasticsearch without even needing to set up an IDE. 
First, we configure a model, which will perform completions: You’ll get back a response similar to the following with status code `200 OK` on successful inference creation: You can now call the configured model to perform completion on any text input. Let’s ask the model what’s inference in the context of GenAI: You should get back a response with status code `200 OK` explaining what inference is: Now we can set up a small catalog of questions, which we want to be answered during ingestion. We’ll use the Bulk API to index three questions about products of Elastic: You’ll get back a response with status `200 OK` back similar to the following upon successful indexing: We’ll create now our question and answering ingest pipeline using the script- , inference- and remove-processor : This pipeline prefixes the content with the instruction “Please answer the following question: “ in a temporary field named `prompt`. The content of this temporary `prompt` field will be sent to Azure’s OpenAI Service through the inference API to perform a completion. Using an ingest pipeline allows for immense flexibility as you can change the pre-prompt to anything you would like. This allows you to summarize documents for example, too. Check out Elasticsearch open inference API adds support for OpenAI chat completions to learn about how to build a summarisation ingest pipeline! We now send our documents containing questions through the question and answering pipeline by calling the reindex API . You'll get back a response with status 200 OK similar to the following: In a real world setup you’ll probably use another ingestion mechanism to ingest your documents in an automated way. Check out our Adding data to Elasticsearch guide to learn more about the various options offered by Elastic to ingest data into Elasticsearch. We’re also committed to showcase ingest mechanisms and provide guidance on how to bring data into Elasticsearch using 3rd party tools. Take a look at Ingest Data from Snowflake to Elasticsearch using Meltano: A developer’s journey for example on how to use Meltano for ingesting data. You're now able to search for your pre-generated answers using the Search API : In the response you'll get back your pre-generated answers: Pre-generating answers for frequently asked questions is particularly effective in reducing operational costs. By minimizing the need for on-the-fly response generation, you can significantly cut down on the amount of computational resources required like token usage. Additionally, this method ensures that every user receives the same, precise information. Consistency is crucial, especially in fields requiring high reliability and accuracy such as medical, legal, or technical support. More to come! We’re already working on adding support for more task types using Cohere, Google Vertex AI and many more. Furthermore we’re actively developing an intuitive UI in Kibana for managing Inference endpoints. Lots of exciting stuff to come! Bookmark Elastic Search Labs now to keep with Elastic’s innovations in the GenAI space! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. 
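Since the console snippets themselves are not reproduced above, here is a hedged sketch of the two key requests, the completion inference endpoint and the question-answering ingest pipeline, expressed as plain HTTP calls from Python. Endpoint names, field names, and credentials are placeholders, and the exact service settings should be checked against the Azure OpenAI service documentation.

```python
import requests

ES_URL = "http://localhost:9200"   # placeholder cluster endpoint
AUTH = ("elastic", "<password>")   # or send an API key header instead

# 1. Completion inference endpoint backed by Azure OpenAI (gpt-4 deployment, API version 2024-02-01).
requests.put(
    f"{ES_URL}/_inference/completion/azure_openai_completion",
    auth=AUTH,
    json={
        "service": "azureopenai",
        "service_settings": {
            "api_key": "<azure-api-key>",
            "resource_name": "<azure-resource-name>",
            "deployment_id": "<gpt-4-deployment-id>",
            "api_version": "2024-02-01",
        },
    },
)

# 2. Ingest pipeline: build the prompt, run the completion, then drop the temporary field.
requests.put(
    f"{ES_URL}/_ingest/pipeline/question_answering_pipeline",
    auth=AUTH,
    json={
        "processors": [
            {"script": {"source": "ctx.prompt = 'Please answer the following question: ' + ctx.content"}},
            {"inference": {
                "model_id": "azure_openai_completion",
                "input_output": {"input_field": "prompt", "output_field": "answer"},
            }},
            {"remove": {"field": "prompt"}},
        ]
    },
)
```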
JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to The inference API is growing…fast! Using Azure’s OpenAI chat completions to answer questions More to come! Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open inference API adds support for Azure OpenAI chat completions - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-azure-openai-completion-support","meta_description":"Azure OpenAI chat completions is available via the Elasticsearch inference API. Learn how to use this feature to answer questions."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to use Elasticsearch with popular Ruby tools Take a look at how to use Elasticsearch with some popular Ruby libraries. Ruby How To FB By: Fernando Briano On October 16, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this blog post we are going to take a look at how to use Elasticsearch with some popular Ruby tools. We'll implement the common use APIs found in the \"Getting Started\" guide for the Ruby client. If you follow that link, you can see how you can run these same actions with the official Elasticsearch client: elasticsearch-ruby . We run extensive tests on the client to make sure all the APIs in Elasticsearch are supported for every version, including the current version in development. This covers almost 500 APIs. However, there might be cases where you don't want to use the client and want to implement some of the functionality yourself in your Ruby code. Your code could depend heavily on a particular library, and you'd like to reuse it for Elasticsearch. 
You could be working in a setup where you only need a couple of the APIs and don't want to bring in a new dependency. Or you have limited resources and you don't want to use a full-fledged Client that can do everything in Elasticsearch. Whatever the reason, Elasticsearch makes it easy by exposing REST APIs that can be called directly, so you can access its functionality by making HTTP requests without the client. When working with the API, it's recommended to take a look at the API Conventions and Common options . Introduction The libraries used in these examples are Net::HTTP , HTTParty , exon , HTTP (a.k.a. http.rb ), Faraday and elastic-transport . On top of looking at how to interact with Elasticsearch from Ruby, this post will take a short look at each of these libraries, allowing us to get to know them and how to use them. It's not going to go in depth for any of the libraries, but it'll give an idea of what it's like to use each of them. The code was written and tested in Ruby 3.3.5. The versions of each tool will be mentioned in their respective sections. The examples use require 'bundler/inline' for the convenience of installing the necessary gems in the same file where the code is being written, but you can use a Gemfile instead too. Setup While working on these examples, I'm using start-local , a simple shell script that sets up Elasticsearch and Kibana in seconds for local development. In the directory where I'm writing this code, I run: This creates a sub directory called elastic-start-local , which includes a .env file with the information we need to connect and authenticate with Elasticsearch. We can either run source elastic-start-local/.env before running our Ruby code, or use the dotenv gem: The following code examples assume the ENV variables in this file have been loaded. We can authenticate with Elasticsearch by using Basic Auth or API Key Authentication . To use Basic Auth, we have to use the user name 'elastic' and the value stored in ES_LOCAL_PASSWORD as password. To use API Key Authentication, we need the value stored in ES_LOCAL_API_KEY in this .env file. Elasticsearch can be managed using Kibana, which will be running at http://localhost:5601 with start-local , and you can create an API Key manually in Kibana too. Elasticsearch will be running on http://localhost:9200 by default, but the examples load the host from the ES_LOCAL_URL environment variable. You could also use any other Elasticsearch cluster to run these, adjusting the host and credentials accordingly. If you're using start-local , you can stop the running instance of Elasticsearch with the command docker compose stop and restart it with docker compose up from the elastic-start-local directory. Net::HTTP Net::HTTP provides a rich library that implements the client in a client-server model that uses the HTTP request-response protocol. We can require this library in our code with require 'net-http' and start using it without installing any extra dependencies. It's not the most user-friendly one, but it's natively available in Ruby. The version used in these examples is 0.4.1 . This gives us the setup for performing requests to Elasticsearch. We can test this with an initial request to the root path of the server: And we can inspect the response for more information: We can now try to create an index : With our index, we can now start to work with Documents . Notice how we need to transform the document to JSON to use it in the request. 
With an indexed document, we can test a very simple Search request: And do some more work with the indexed data: Finally, we'll delete the index to clean up our cluster: HTTParty HTTParty is a gem \"with the goal to make HTTP fun\". It provides some helpful abstractions to make requests and work with the response. These examples use version 0.22.0 of the library. The initial request to the server: If the response Content Type is application/json , HTTParty will parse the response and return Ruby objects such as a hash or array. The default behavior for parsing JSON will return keys as strings. We can use the response as follows: The README shows how to use the class methods to make requests quickly and the option to create a custom class. It would be more convenient to implement an Elasticsearch Client class and add the different API methods we'd like to use. Something like this for example: We don't want to re-implement Elasticsearch Ruby with HTTParty in this blog post, but this could be an alternative when using just a few of the APIs. We'll take a look at how to build the rest of the requests: excon Excon was designed with the intention of being simple, fast and performant. It is particularly aimed at usage in API clients, so it is well suited for interacting with Elasticsearch. This code uses Excon version 0.111.0 . Excon requests return an Excon::Response object which has body , headers , remote_ip and status attributes. We can also access the data directly with the keys as symbols, similar to how Elasticsearch::API::Response works: We can reuse a connection across multiple requests to share options and improve performance. We can also use persistent connections to establish the socket connection with the initial request, and leave the socket open while we're running these examples: HTTP (http.rb) HTTP is an HTTP client which uses a chainable API similar to Python's Requests . It implements the HTTP protocol in Ruby and outsources the parsing to native extensions. The version used in this code is 5.2.0 . We can also use the auth method to take advantage of the chainable API: Or since we also care about the content type header, chain headers : With HTTP we can create a client with persistent connection to the host, and persist the headers too: So once we've created our persistent clients, it makes it shorter to build our requests: The documentation warns us that the response must be consumed before sending the next request in the persistent connection. That means calling to_s , parse , or flush on the response object. Faraday Faraday is the HTTP client library used by default by the Elasticsearch Client. It provides a common interface over many adapters which you can select when instantiating a client (Net::HTTP, Typhoeus, Patron, Excon and more). The version of Faraday used in this code was 2.12.0 . The signature for get is (url, params = nil, headers = nil) so we're passing nil for parameters in this initial test request: The response is a Faraday::Response object with the response status , headers , and body and we can also access lots of properties in a Faraday Env object . As we've seen with other libraries, the recommended way to use Faraday for our use case is to create a Faraday::Connection object: And now reusing that connection, we can see what the rest of the requests look like with Faraday: Elastic Transport The library elastic-transport is the Ruby gem that deals with performing HTTP requests, encoding, compression, etc. in the official Elastic Ruby clients. 
This library has been battle tested for years against every official version of Elasticsearch. It used to be known as elasticsearch-transport as it was the base for the official Elasticsearch client. However in version 8.0.0 of the client, we migrated the transport library to elastic-transport since it was also supporting the official Enterprise Search Client and more recently the Elasticsearch Serverless Client. It uses a Faraday implementation by default, which supports several different adapters as we saw earlier. You can also use Manticore and Curb (the Ruby binding for libcurl) implementations included with the library. You can even write your own, or an implementation with some of the libraries we've gone through here. But that would be the subject for a different blog post! Elastic Transport can also be used as an HTTP library to interact with Elasticsearch. It will deal with everything you need and has a lot of settings and different configurations related to the use at Elastic. The version used here is the latest 8.3.5 . A simple example: Conclusion As you can see, the Elasticsearch Ruby client does a lot of work to make it easy to interact with Elasticsearch in your Ruby code. We didn't even go too deep in this blog post working with more complex requests or handling errors. But Elasticsearch's REST API makes it possible to use it with any library that supports HTTP requests, in Ruby and any other language. The Elasticsearch REST APIs guide is a great reference to learn more about the available APIs and how to use them. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction Setup Net::HTTP HTTParty excon Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to use Elasticsearch with popular Ruby tools - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-ruby-tools","meta_description":"Learn how to use Elasticsearch with popular Ruby libraries like Net::HTTP, HTTParty, exon, HTTP (http.rb), Faraday and elastic-transport."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. Search Relevance PB By: Panagiotis Bailis On May 28, 2025 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In our previous blog post we introduced the redesigned-from-scratch retrievers framework, which enables the creation of complex ranking pipelines. We also explored how the Reciprocal Rank Fusion (RRF) retriever enables hybrid search by merging results from different queries. While RRF is easy to implement, it has a notable limitation: it focuses purely on relative ranks, ignoring actual scores. This makes fine-tuning and optimization a challenge. Meet the linear retriever! In this post, we introduce the linear retriever , our latest addition for supporting hybrid search! Unlike rrf , the linear retriever calculates a weighted sum across all queries that matched a document. This approach preserves the relative importance of each document within a result set while allowing precise control over each query’s influence on the final score. As a result, it provides a more intuitive and flexible way to fine-tune hybrid search. Defining a linear retriever where the final score will be computed as: s c o r e = 5 ∗ k n n + 1.5 ∗ b m 25 score = 5 * knn + 1.5 * bm25 score = 5 ∗ knn + 1.5 ∗ bm 25 It is as simple as: Notice how simple and intuitive it is? (and really similar to rrf !) This configuration allows you to precisely control how much each query type contributes to the final ranking, unlike rrf , which relies solely on relative ranks. One caveat remains: knn scores may be strictly bounded, depending on the similarity metric used. For example, with cosine similarity or the dot product of unit-normalized vectors, scores will always lie within the [0, 1] range. In contrast, bm25 scores are less predictable and have no clearly defined bounds. Scaling the scores: kNN vs BM25 One challenge of hybrid search is that different retrievers produce scores on different scales. Consider for example the following scenario: Query A scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 100 1.5 1 0.5 Query B scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 0.63 0.01 0.3 0.4 You can see the disparity above: kNN scores range between 0 and 1, while bm25 scores can vary wildly. This difference makes it tricky to set static optimal weights for combining the results. 
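A tiny sketch of the weighted sum above, using the "Query A" numbers, makes the scaling problem obvious: without normalization the unbounded bm25 score swamps the knn contribution regardless of the weights chosen.

```python
def linear_score(scores, weights):
    """score = sum(weight_i * score_i) across the queries that matched the document."""
    return sum(weights[name] * value for name, value in scores.items())

weights = {"knn": 5, "bm25": 1.5}

doc1 = {"knn": 0.347, "bm25": 100}   # raw "Query A" scores from the table above
doc2 = {"knn": 0.350, "bm25": 1.5}

print(linear_score(doc1, weights))   # 151.735 -> dominated almost entirely by bm25
print(linear_score(doc2, weights))   # 4.0
```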
Normalization to the rescue: the MinMax normalizer To address this, we’ve introduced an optional minmax normalizer that scales scores, independently for each query, to the [0, 1] range using the following formula: n o r m a l i z e d s c o r e = ( s c o r e − m i n ) / ( m a x − m i n ) normalized_score = (score - min) / (max - min) n or ma l i ze d s ​ core = ( score − min ) / ( ma x − min ) This preserves the relative importance of each document within a query’s result set, making it easier to combine scores from different retrievers. With normalization, the scores become: Query A scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 1.00 0.01 0.005 0.000 Query B scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 1.00 0.000 0.465 0.645 All scores now lie in the [0, 1] range and optimizing the weighted sum is much more straightforward as we now capture the (relative to the query) importance of a result instead of its absolute score and maintain consistency across queries. Example time! Let’s go through an example now to showcase what the above looks like and how the linear retriever addresses some of the shortcomings of rrf . RRF relies solely on relative ranks and doesn’t consider actual score differences. For example, given these scores: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 100 1.5 1 0.5 rrf score 0.03226 0.03252 0.03200 0.03125 rrf would rank the documents as: d o c 2 > d o c 1 > d o c 3 > d o c 4 doc2 > doc1 > doc3 > doc4 d oc 2 > d oc 1 > d oc 3 > d oc 4 However, doc1 has a significantly higher bm25 score than the others, which rrf fails to capture because it only looks at relative ranks. The linear retriever, combined with normalization, correctly accounts for both the scores and their differences, producing a more meaningful ranking: doc1 doc2 doc3 doc4 knn 0.347 0.35 0.348 0.346 bm25 1 0.01 0.005 0 As we can see in the above, doc1’s great ranking and score for bm25 is properly accounted for and reflected on the final scores. In addition to that, all scores lie now in the [0, 1] range so that we can compare and combine them in a much more intuitive way (and even build offline optimization processes). Putting it all together To take full advantage of the linear retriever with normalization, the search request would look like this: This approach combines the best of both worlds: it retains the flexibility and intuitive scoring of the linear retriever, while ensuring consistent score scaling with MinMax normalization. As with all our retrievers, the linear retriever can be integrated into any level of a hierarchical retriever tree, with support for explainability, match highlighting, field collapsing, and more. When to pick the linear retriever and why it makes a difference The linear retriever: Preserves relative importance by leveraging actual scores, not just ranks. Allows fine-tuning with weighted contributions from different queries. Enhances consistency using normalization, making hybrid search more robust and predictable. Conclusion The linear retriever is already available on Elasticsearch Serverless, and the 8.18 and 9.0 releases! More examples and configuration parameters can also be found in our documentation. Try it out and see how it can improve your hybrid search experience — we look forward to your feedback. Happy searching! Report an issue Related content Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. 
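Since the full request body is not reproduced above, here is a hedged sketch of what a linear retriever with MinMax normalization might look like, written as a Python dict that could be passed to a search call. The field names, query text, and vector are placeholders, and the exact parameter names should be verified against the linear retriever documentation for your Elasticsearch version.

```python
search_request = {
    "retriever": {
        "linear": {
            "retrievers": [
                {   # lexical side, weight 1.5
                    "retriever": {"standard": {"query": {"match": {"text": "hybrid search"}}}},
                    "weight": 1.5,
                    "normalizer": "minmax",
                },
                {   # vector side, weight 5
                    "retriever": {"knn": {"field": "embedding",
                                          "query_vector": [0.12, 0.34, 0.56],
                                          "k": 10, "num_candidates": 100}},
                    "weight": 5,
                    "normalizer": "minmax",
                },
            ],
            "rank_window_size": 10,
        }
    }
}
```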
DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Search Relevance April 16, 2025 ES|QL, you know, for Search - Introducing scoring and semantic search With Elasticsearch 8.18 and 9.0, ES|QL comes with support for scoring, semantic search and more configuration options for the match function and a new KQL function. IT By: Ioana Tagirta Jump to Meet the linear retriever! Scaling the scores: kNN vs BM25 Normalization to the rescue: the MinMax normalizer Example time! Putting it all together Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Hybrid search revisited: introducing the linear retriever! - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/linear-retriever-hybrid-search","meta_description":"Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to use Elasticsearch to prompt ChatGPT with natural language This blog post presents an experimental project for querying Elasticsearch in natural language using ChatGPT. Generative AI PHP EZ By: Enrico Zimuel On June 21, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. These days everyone is talking about ChatGPT . One of the cool features of this large language model (LLM) is the ability to generate code. We used it to generate Elasticsearch DSL queries . The goal is to search in Elasticsearch ® with sentences like “Give me the first 10 documents of 2017 from the stocks index.” This experiment showed that it is possible, with some limitations. 
In this post, we describe this experiment and the open source library that we published for this use case. Can ChatGPT generate Elasticsearch DSL? We start the experiment with some tests focusing on the ability of ChatGPT to generate Elasticsearch DSL query. For this scope, you need to provide some context to ChatGPT about the structure of the data that you want to search. In Elasticsearch, data is stored in an index, which is similar to a \"table\" in a relational database. It has a mapping that defines multiple fields and their types. This means we need to provide the mapping information of the index that we want to query. By doing so, ChatGPT has the necessary context to translate the query into Elasticsearch DSL. Elasticsearch offers a get mapping API to retrieve the mapping of an index. In our experiment, we used a stocks index data set available here . This data set contains five years of stock prices of 500 Fortune companies, spanning from February 2013 to February 2018. Here we reported the first five lines of the CSV file containing the data set: Each line contains the date of the stock, the open value of the day, the high and the low values, the close value, the volume of the stocks exchanged, and finally the stock name — for example, American Airlines Group Inc. (AAL). The mapping associated to the stocks index is as follows: We can use the GET %2Fstocks%2F_mapping API to retrieve the mapping from Elasticsearch. [Related article: ChatGPT and Elasticsearch: OpenAI meets private data ] Let's build a prompt to find out In order to translate a query expressed in human language to Elasticsearch DSL, we need to find the right prompt to give to ChatGPT. This is the most difficult part of the process: to actually program ChatGPT using the correct question format (in other words, the right prompt). After some iterations, we ended up with the following prompt that seems to work quite well: The value {mapping} and query} in the prompt are two placeholders to be replaced with the mapping json string (for example, returned by the GET %2Fstocks%2F_mapping in the previous example) and the query expressed in human language (for example: Return the first 10 documents of 2017). Of course, ChatGPT is limited and in some cases it won’t be able to answer a question. We found that, most of the time, this happens because the sentence used in the prompt is too general or ambiguous. To solve this situation, we need to enhance the prompt using more details. This process is called iteration, and it requires multiple steps to define the proper sentence to be used. If you want to try out how ChatGPT can translate a search sentence in an Elasticsearch DSL query (or even SQL), you can use dsltranslate.com . Putting it all together Using the ChatGPT API offered by OpenAI and the Elasticsearch API for mapping and search, we put it all together in an experimental library for PHP. This library exposes a search() function with the following API: Where $index is the index name to be used, $prompt is the query expressed in human language and $bool is an optional parameter for using a cache (enabled by default). The process of this function is reported in the following diagram: The inputs are index and prompt (on the left). The index is used to retrieve the mapping from Elasticsearch (using the get mapping API) . The result is a mapping in JSON that is used to build the query string to send to ChatGPT using the following API code. We used the gpt-3.5-turbo model of OpenAI that is able to translate in code. 
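The library itself is written in PHP and its code is not reproduced above, but purely to illustrate the flow in the diagram (index mapping, then prompt, then ChatGPT, then Elasticsearch query), a rough Python equivalent might look like the sketch below. The prompt wording, endpoint, and lack of error handling are simplifying assumptions, not the library's actual implementation.

```python
import json
from elasticsearch import Elasticsearch
from openai import OpenAI

es = Elasticsearch("http://localhost:9200")  # placeholder endpoint and credentials
llm = OpenAI()                               # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Given this Elasticsearch mapping:\n{mapping}\n"
    "Translate the following question into an Elasticsearch DSL query and "
    "return only valid JSON:\n{query}"
)  # simplified stand-in for the actual prompt used by the library

def nl_search(index, question):
    # 1. Retrieve the index mapping so the model has context about the fields.
    mapping = json.dumps(es.indices.get_mapping(index=index).body)
    # 2. Ask the model to translate the natural-language question into DSL.
    completion = llm.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT.format(mapping=mapping, query=question)}],
    )
    dsl = json.loads(completion.choices[0].message.content)
    # 3. Run the generated query against Elasticsearch and return the hits.
    return es.search(index=index, body=dsl)

# nl_search("stocks", "Return the first 10 documents of 2017")
```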
The result from ChatGPT contains an Elasticsearch DSL query that we use to query Elasticsearch. The result is then returned to the user. To query Elasticsearch, we utilized the official elastic%2Felasticsearch-php client. To optimize response time and reduce the cost of using the ChatGPT API, we used a simple caching system based on files. We used a cache to: Store the mapping JSON returned by Elasticsearch: We store this JSON in a file named after the index. This allows us to retrieve the mapping information without making additional calls to Elasticsearch. Store the Elasticsearch DSL generated by ChatGPT: To cache the generated Elasticsearch DSL, we named the cache file using the hash (MD5) of the prompt used. This approach enables us to reuse previously generated Elasticsearch DSL for the same query, eliminating the need to call the ChatGPT API again. We also added the possibility to retrieve the Elasticsearch DSL programmatically using the getLastQuery() function. Running the experiment with financial data We used Elastic Cloud to store the stocks value reported here . In particular, we used a simple bulk script to read the stocks file in CSV and send it to Elasticsearch using the bulk API . For more details on how to set up an Elastic Cloud and retrieve the API key, read the documentation . Once we stored the stocks index, we used a simple PHP script for testing some query expressed in English. The script we used is examples%2Ftest.php . To execute this examples%2Ftest.php script, we need to set three environment variables: OPENAI_API_KEY: the API key of OpenAI ELASTIC_CLOUD_ENDPOINT: the url of the Elasticsearch instance ELASTIC_CLOUD_API_KEY: the API key of Elastic Cloud Using the stocks mapping, we tested the following queries recording all the Elasticsearch DSL responses: As you can see, the results are pretty good. The last one about the difference between closed and open fields was quite impressive! All the requests have been translated in a valid Elasticsearch DSL query that is correct according to the question expressed in natural language. Use the language you speak! A very nice feature of ChatGPT is the ability to specify questions in different languages. That means you can use this library and specify the query in different natural languages, like Italian, Spanish, French, German, and so on. Here is an example: All the previous search have the same results producing the following Elasticsearch query (more or less): Important: ChatGPT is an LLM that has been optimized for English, which means the best results are obtained using queries entered in English. Limitations of LLMs Unfortunately, ChatGPT and LLMs in general are not capable of verifying the correctness of the answer from a semantic point of view. They give answers that look right from a statistical point of view. This means, we cannot test if the Elasticsearch DSL query generated by ChatGPT is the right translation of the query in natural language. Of course, this is a big limitation at the moment. In some other use cases, like mathematical operations, we can solve the correctness problem using an external plugin, like the Wolfram Plugin of ChatGPT . In this case, the result of ChatGPT uses the Wolfram engine that checks the correctness of the response, using a mathematical symbolic model. Apart from the correctness limitation, which implies we should always check ChatGPT’s answers, there are also limitations to the ability to translate a human sentence in an Elasticsearch DSL query. 
For instance, using the previous stocks data set if we ask something as follows: The DSL query generated by ChatGPT is not valid producing this Elasticsearch error: Failed to parse date field [2015-01-01] with format [yyyy]. If we rephrase the sentence using more specific information, removing the apparent ambiguity about the date format, we can retrieve the correct answer, as follows: Basically, the sentence must be expressed using a description of how the Elasticsearch DSL should be rather than a real human sentence. Wrapping up In this post, we presented an experimental use case of ChatGPT for translating natural language search sentences into Elasticsearch DSL queries. We developed a simple library in PHP for using the OpenAI API to translate the query under the hood, providing also a caching system. The results of the experiment are promising, even with the limitation on the correctness of the answer. That said, we will definitely investigate further the possibility to query Elasticsearch in natural language using ChatGPT, as well as other LLM models that are becoming more and more popular. Learn more about the possibilities with Elasticsearch and AI . In this blog post, we may have used third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Can ChatGPT generate Elasticsearch DSL? 
Let's build a prompt to find out Putting it all together Running the experiment with financial data Use the language you speak! Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to use Elasticsearch to prompt ChatGPT with natural language - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-prompt-chatgpt-natural-language","meta_description":"This blog post presents an experimental project for querying Elasticsearch in natural language using ChatGPT."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. Elastic Cloud Serverless YL By: Yaru Lin On December 2, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. Elasticsearch Serverless is now generally available. We’ve re- architected Elastisearch as a fully managed service that autoscales with your data, usage, and performance needs. It has the power and flexibility of Elasticsearch without operational overhead. Since its technical preview this spring, we’ve introduced new capabilities to help developers build and manage applications faster. Whether you’re implementing semantic search, keyword search, or even image search, Elasticsearch Serverless simplifies the process, allowing you to focus on innovation instead of infrastructure. Designed to eliminate the complexity of managing resources, Elasticsearch Serverless makes it easier to run search, RAG, and AI-powered applications while maintaining the speed, relevance, and versatility Elasticsearch is known for. In this post, we’ll share how Elasticsearch Serverless simplifies building search applications with its modern architecture and developer-friendly features. Elasticsearch is the backbone of search experience Elasticsearch has long been the trusted engine for developers, data scientists, and full-stack engineers seeking high-performance, scalable search, and vector database capabilities. Its powerful relevance features and flexibility have made it the backbone for countless search-driven applications. Elasticsearch’s innovations in query speed and vector quantization have positioned it as a leading vector database, supporting scalable AI-driven use cases like semantic and hybrid search. Today, Elasticsearch continues to set the gold standard for search by combining: High speed and relevance for text search. Flexible query capabilities to tailor search workflows. 
Seamless handling of hybrid queries , combining vector and lexical search. An open-source core, rooted in Lucene , with continuous optimizations that push the boundaries of search technology. As search use cases evolve—incorporating hybrid search, AI and inference, and dynamic workloads—teams have more options than ever for scaling and managing infrastructure to meet their unique needs. These evolving demands present an exciting opportunity to rethink how we design for scale. Elasticsearch with serverless speed and simplicity Elasticsearch Serverless builds on Elasticsearch’s strengths to address the demands of modern workloads, characterized by large datasets, AI search, and unpredictable traffic. Elasticsearch Serverless meets these challenges head-on with a reimagined architecture purpose-built for today’s demands. Foundationally, Elasticsearch Serverless is built on a decoupled compute and storage model. This is an architectural change that removes the inefficiencies of repeated data transfers and leverages the reliability of object storage. From here, separating critical components enables independent scaling of indexing and search workloads, and resolves the long-standing challenges of balancing performance and cost-efficiency in high-demand scenarios. Decoupled compute and storage Elasticsearch Serverless uses object storage for reliable data storage and cost-efficient scalability. By eliminating the need for multiple replicas, indexing costs and data duplication are reduced. This approach ensures that storage is used only for what’s necessary, eliminating waste while maximizing efficiency. To maintain Elasticsearch’s speed, segment-level query parallelization optimizes data retrieval from object stores like S3, while advanced caching strategies ensure fast access to frequently used data. Dynamic autoscaling without compromise The decoupled architecture also enables smarter resource management by separating search and ingest workloads, allowing them to scale independently based on specific demands. This separation ensures that: Concurrent updates and searches no longer compete for resources. CPU cycles, memory, and I/O are allocated independently, ensuring consistent performance even during high-ingest operations. Ingest-heavy use cases benefit from isolated compute. Ensure fast and reliable search performance, even while indexing massive volumes of data. Vector search workflows thrive. Decoupling allows for compute-intensive indexing (like embedding generation) without impacting query speeds. Resources for ingest, search, and machine learning scale dynamically and independently to accommodate diverse workloads. There’s no need to overprovision for peak loads or worry about downtime during demand spikes. Read more about our dynamic and load-based ingest and search autoscaling. High-performance query execution Elasticsearch Serverless enhances query execution by building on Elasticsearch’s strengths as a vector database. Innovations in query performance and vector quantization ensure fast and efficient search experiences for modern use cases. 
Highlights include: Faster data retrieval via segment-level query parallelization, enabling multiple concurrent requests to fetch data from object storage and significantly reducing latency to ensure faster access even when data isn't cached locally Smarter caching through intelligent query results reuse and optimized data structures in Lucene that allow for caching only the utilized portion of indexes, Tailored Lucene index structures maximize performance for various data formats, ensuring that each data type is stored and retrieved in the most efficient manner. Advanced vector quantization significantly reduces the storage footprint and retrieval latency of high-dimensional data, making AI and vector search more scalable and cost-effective. This new architecture preserves Elasticsearch’s flexibility—supporting faceting, filtering, aggregations, and diverse data types—while simplifying operations and accelerating performance for modern search needs. For teams seeking a hands-off solution that adapts to changing needs, Elasticsearch Serverless offers all the power and versatility of Elasticsearch without the operational overhead. Whether you're a developer looking to integrate hybrid search, a data scientist working with high-cardinality datasets, or a full-stack engineer optimizing relevance with AI models, Elasticsearch Serverless empowers you to focus on delivering exceptional search experiences. Access to the newest search and AI features in Elasticsearch Serverless Elasticsearch Serverless is more than just a managed service—it’s a platform designed to accelerate development and optimize search experiences. It’s where you can access the latest search and generative AI features: Elastic AI Assistant : Quickly access documentation, guidance, and resources to simplify prototyping and implementation. ELSER Embedding Model : Enable semantic or hybrid search capabilities, opening new ways to query your data. Semantic Text Field Type: Generate vectors for text fields with ease. Better Binary Quantization ( BBQ ) : Optimize vector storage and memory usage without compromising accuracy or performance. Elastic Rerank and Reciprocal Rank Fusion (RRF) : Improve result relevance with simplified reranking and hybrid scoring capabilities. Playground and Developer Console : Experiment with new features, including Gen AI integrations, using a unified interface and API workflows. ES|QL, Elastic’s intuitive command language , fully compatible with Elasticsearch Serverless. Usage and Performance Transparency : Manage search speed and costs through the Cloud console with detailed performance insights. Get started with Elasticsearch Serverless Ready to start building? Elasticsearch Serverless is available now, and you can try it today with our free trial. Developers love Elasticsearch for its speed, relevance, and flexibility. With Elasticsearch Serverless, you’ll love it for its simplicity. Explore Elasticsearch Serverless today and experience search, reimagined. Learn about serverless pricing . The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. 
FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Elastic Cloud Serverless Ingestion September 20, 2024 Architecting the next-generation of Managed Intake Service APM Server has been the de facto service for ingesting data from Elastic APM agents and OTel agents. In this blog post, we will walk through our journey of redesigning the APM Server product to scale and evolve into a more generic ingest component for Elastic Observability while also improving the reliability and maintainability compared to the traditional APM Server. VR MR By: Vishal Raj and Marc Lopez Rubio Jump to Elasticsearch is the backbone of search experience Elasticsearch with serverless speed and simplicity Decoupled compute and storage Dynamic autoscaling without compromise High-performance query execution Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch Serverless is now generally available - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-serverless-now-ga","meta_description":"Elasticsearch Serverless is generally available. Learn how its architecture and features simplify building search applications."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Ingesting data from Snowflake to Elasticsearch using Meltano Ingest data from Snowflake to Elasticsearch with Meltano. Follow along to setup Meltano, connect to Snowflake & ingest data to Elasticsearch. How To DB By: Dmitrii Burlutskii On April 7, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. 
In the Search team At Elastic, we have been exploring different ETL tools and how we can leverage them to ship data into Elasticsearch and enable AI-powered search on the ingested data. Today, I’d like to share our story with the Meltano ecosystem and Meltano Elasticsearch loader . Meltano is a declarative code-first data integration engine that allows you to sync data between different storages. There are many extractors and loaders available at the hub.meltano.com . If you store your data in Snowflake and you want to build a search experience out-of-the-box for your customers, you might want to think about using Elasticsearch, where you can build a semantic search for your customers based on the data you have. Today, we will focus on syncing data from Snowflake to Elasticsearch. Requirements Snowflake credentials You will have received all the below credentials after signup , or you can get them from the Snowflake panel. Account username Account password Account Identifier (see here for instructions on how to get it) Snowflake dataset If you create a new Snowflake account you will have sample data to experiment with. However, I will be using one of the public air quality datasets that contains Nitrogen Dioxide (NO2) measurements. Elastic credentials Visit https://cloud.elastic.co and sign up. Click on Create deployment . In the pop-up, you can change or keep the default settings. Once you’re ready for deployment, click on Continue (or click on Open Kibana ). It will redirect you to the Kibana dashboard. Go to Stack Management -> Security -> API keys and generate a new API key. Steps to ingest data from Snowflake to Elasticsearch using Meltano 1. Install Meltano In my example, I will be using the Meltano Python package but you can also install it as a Docker container. Add the snowflake extractor Verify the extractor Add Elasticsearch loader 2. Configure the extractor and the loader There are multiple ways to configure Meltano extractors and loaders: Edit meltano.yml Using CLI commands like Using CLI interactive mode I will be using the interactive mode. To configure the Snowflake extractor run the following command and provide at least the Account Identifier, Username, Password, and Database. You should see the following screen where you can choose an option to configure. After you configured the extract you can test the connection. Simply run the following command: Configure the Elasticsearch loader and provide Host, Port, Schema, and the API key, If you want to change the index name you can run this command and change it: ie. the default index string defined as ecs-{{ stream_name }}-{{ current_timestamp_daily}} that results in ecs-animals-2022-12-25 where the stream name was animals. When everything is configured we can start syncing data. Once the sync starts you can go to Kibana and see that a new index is created and there are some indexed documents. You can view the documents by clicking on the index name. You should see your documents. 3. Use your index settings (or mapping) If we start syncing data, the loader will automatically create a new index with dynamic mapping, which means Elasticsearch will take care of the fields and their types in the index. We can change this behavior if we want to by creating an index in advance and applying the settings we need. Let’s try. Navigate to the Kibana -> DevTools and run the following commands: 3.1 Create a new pipeline This will drop all the documents with datavalue < 10 . 
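Steps 3.1 through 3.3 can be sketched in Python as follows. The blog runs the equivalent commands in Kibana Dev Tools, which are not reproduced in this text, so the pipeline and index names here are hypothetical, and wiring the pipeline in through index.default_pipeline is an assumption about what "Apply index settings" does; "datavalue" is the field mentioned above.

```python
# Hedged sketch of steps 3.1-3.3; names and the default_pipeline wiring are assumptions.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # placeholder connection

# 3.1 -- ingest pipeline that drops documents whose datavalue is below 10
es.ingest.put_pipeline(
    id="drop-low-no2-values",
    description="Drop NO2 measurements with datavalue < 10",
    processors=[{"drop": {"if": "ctx.datavalue != null && ctx.datavalue < 10"}}],
)

# 3.2 / 3.3 -- create the target index up front and attach the pipeline as its
# default, so documents loaded by Meltano pass through it automatically
es.indices.create(
    index="air-quality-no2",                               # hypothetical index name
    settings={"index.default_pipeline": "drop-low-no2-values"},
)
```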
3.2 Create a new index 3.3 Apply index settings 3.4 Change the index name in Meltano 4. Start a sync job When the job is done you can see that the index has fewer documents than the one we created before Conclusion We have successfully synced the data from Snowflake to Elastic Cloud. We let Meltano create a new index for us and take care of the index mapping and we synced data to the existing index with a predefined pipeline. I would like to highlight some key points I wrote down during my journey: Elasticsearch loader ( page on Meltano Hub ) It is not ready to process a big chunk of data. You need to adjust the default Elasticsearch configuration to make it more resilient. I’ve submitted a Pull Request to expose “request_timeout” and “retry_on_timeout” options that will help. It uses the 8.x branch of Elasticsearch Python client so you can make sure it supports the latest Elasticsearch features. It sends data synchronously (doesn’t use Python AsyncIO) so might be quite slow when you need to transfer a huge data volume. Meltano CLI It is just awesome. You don’t need a UI so everything can be configured in the terminal which gives engineers a lot of options for automation. You can simply run on-demand sync with one command. No other running services are required. Replication/Incremental sync If your pipeline requires data replication or incremental syncs, you can visit this page to read more. Also, I would like to mention that Meltano Hub is amazing. It is easy to navigate and find what you need. Also, you can easily compare different loaders or extractors by just looking at how many customers use them. Find more information in the following blog posts if you’re interested in building AI-based apps: Full text and semantic search capabilities on your data set. Connect your data with LLMs to build Question - Answer . Build a Chatbot that uses a pattern known as Retrieval-Augmented Generation (RAG). Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Requirements Snowflake credentials Snowflake dataset Elastic credentials Steps to ingest data from Snowflake to Elasticsearch using Meltano Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ingesting data from Snowflake to Elasticsearch using Meltano - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/data-ingestion-from-snowflake-to-elasticsearch-using-meltano","meta_description":"Ingest data from Snowflake to Elasticsearch with Meltano. Follow along to setup Meltano, connect to Snowflake & ingest data to Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ChatGPT and Elasticsearch: OpenAI meets private data Integrate Elasticsearch's search relevance with ChatGPT's question-answering capability to enhance your domain-specific knowledge base. Generative AI Python JV By: Jeff Vestal On June 21, 2023 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. NOTE: This blog has been revisited with an update incorporating new features Elastic has released since this was first published. Please check out the new blog here! Combine Elasticsearch's search relevance with OpenAI's ChatGPT's question-answering capabilities to query your data. In this blog, you'll learn how to connect ChatGPT to proprietary data stores using Elasticsearch and build question/answer capabilities for your data. What is ChatGPT? In recent months, there has been a surge of excitement around ChatGPT, a groundbreaking AI model created by OpenAI. But what exactly is ChatGPT? Based on the powerful GPT architecture, ChatGPT is designed to understand and generate human-like responses to text inputs. GPT stands for \"Generative Pre-trained Transformer.” The Transformer is a cutting-edge model architecture that has revolutionized the field of natural language processing (NLP). These models are pre-trained on vast amounts of data and are capable of understanding context, generating relevant responses, and even carrying on a conversation. To learn more about the history of transformer models and some NLP basics in the Elastic Stack, be sure to check out the great talk by Elastic ML Engineer Josh Devins . The primary goal of ChatGPT is to facilitate meaningful and engaging interactions between humans and machines. By leveraging the recent advancements in NLP, ChatGPT models can provide a wide range of applications, from chatbots and virtual assistants to content generation, code completion, and much more. These AI-powered tools have rapidly become an invaluable resource in countless industries, helping businesses streamline their processes and enhance their services. Limitations of ChatGPT & how to minimize them Despite the incredible potential of ChatGPT, there are certain limitations that users should be aware of. One notable constraint is the knowledge cutoff date. 
Currently, ChatGPT is trained on data up to September 2021, meaning it is unaware of events, developments, or changes that have occurred since then. Consequently, users should keep this limitation in mind while relying on ChatGPT for up-to-date information. This can lead to outdated or incorrect responses when discussing rapidly changing areas of knowledge such as software enhancements and capabilities or even world events. ChatGPT, while an impressive AI language model, can occasionally hallucinate in its responses, often exacerbated when it lacks access to relevant information. This overconfidence can result in incorrect answers or misleading information being provided to users. It is important to be aware of this limitation and approach the responses generated by ChatGPT with a degree of skepticism, cross-checking and verifying the information when necessary to ensure accuracy and reliability. Another limitation of ChatGPT is its lack of knowledge about domain-specific content. While it can generate coherent and contextually relevant responses based on the information it has been trained on, it is unable to access domain-specific data or provide personalized answers that depend on a user's unique knowledge base. For instance, it may not be able to provide insights into an organization’s proprietary software or internal documentation. Users should, therefore, exercise caution when seeking advice or answers on such topics from ChatGPT directly. One way to minimize these limitations is by providing ChatGPT access to specific documents relevant to your domain and questions, and enabling ChatGPT’s language understanding capabilities to generate tailored responses. This can be accomplished by connecting ChatGPT to a search engine like Elasticsearch. Elasticsearch — you know, for search! Elasticsearch is a scalable data store and vector database designed to deliver relevant document retrieval, ensuring that users can access the information they need quickly and accurately. Elasticsearch’s primary focus is on delivering the most relevant results to users, streamlining the search process, and enhancing user experience. Elasticsearch boasts a myriad of features to ensure top-notch search performance, including support for traditional keyword and text-based search ( BM25 ) and an AI-ready vector search with exact match and approximate kNN ( k-Nearest Neighbor ) search capabilities. These advanced features allow Elasticsearch to retrieve results that are not only relevant but also for queries that have been expressed using natural language. By leveraging traditional, vector, or hybrid search (BM25 + kNN), Elasticsearch can deliver results with unparalleled precision, helping users find the information they need with ease. One of the key strengths of Elasticsearch is its robust API, which enables seamless integration with other services to extend and enhance its capabilities. By integrating Elasticsearch with various third-party tools and platforms, users can create powerful and customized search solutions tailored to their specific requirements. This flexibility and extensibility makes Elasticsearch an ideal choice for businesses looking to improve their search capabilities and stay ahead in the competitive digital landscape. By working in tandem with advanced AI models like ChatGPT, Elasticsearch can provide the most relevant documents for ChatGPT to use in its response. 
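To make this concrete before walking through the setup, here is a hedged Python sketch of one way to wire the two together: a hybrid (BM25 + kNN) request that retrieves a single top document, whose body is then passed to gpt-3.5-turbo with an instruction to answer only from that document. Index and field names, boost values and the prompt wording are illustrative rather than the exact ones used by the original program.

```python
# Hedged sketch of the retrieve-then-prompt flow; names, boosts and prompt are illustrative.
import openai
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", basic_auth=("elastic", "..."))  # placeholders
openai.api_key = "..."

def ask_elastic_docs(question: str) -> str:
    resp = es.search(
        index="search-elastic-docs",                                   # crawler index
        query={"match": {"title": {"query": question, "boost": 1}}},   # BM25 on title
        knn={
            "field": "title-vector",                                   # dense_vector field
            "k": 1,
            "num_candidates": 20,
            "query_vector_builder": {
                "text_embedding": {
                    # model ID as imported via eland (assumed naming)
                    "model_id": "sentence-transformers__all-distilroberta-v1",
                    "model_text": question,
                }
            },
            "boost": 24,          # illustrative boost to align kNN and BM25 scores
        },
        size=1,                   # only the top-scored document
        source=["title", "body_content", "url"],
    )
    hit = resp["hits"]["hits"][0]["_source"]

    prompt = (
        f"Answer this question: {question}\n"
        f"Using only the information in this document: {hit['body_content'][:4000]}"
    )
    answer = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return answer["choices"][0]["message"]["content"] + f"\n\nSource: {hit['url']}"
```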
This synergy between Elasticsearch and ChatGPT ensures that users receive factual, contextually relevant, and up-to-date answers to their queries. In essence, the combination of Elasticsearch's retrieval prowess and ChatGPT's natural language understanding capabilities offers an unparalleled user experience, setting a new standard for information retrieval and AI-powered assistance. How to use ChatGPT with Elasticsearch Python interface accepts user questions. Generate a hybrid search request for Elasticsearch BM25 match on the title field kNN search on the title-vector field Boost kNN search results to align scores Set size=1 to return only the top scored document Search request is sent to Elasticsearch. Documentation body and original url are returned to python. API call is made to OpenAI ChatCompletion. Prompt: \"answer this question using only this document \" Generated response is returned to python. Python adds on original documentation source url to generated response and prints it to the screen for the user. The ElasticDoc ChatGPT process utilizes a Python interface to accept user questions and generate a hybrid search request for Elasticsearch, combining BM25 and kNN search approaches to find the most relevant document from the Elasticsearch Docs site, now indexed in Elasticsearch. However, you do not have to use hybrid search or even vector search. Elasticsearch provides the flexibility to use whichever search pattern best fits your needs and provides the most relevant results for your specific data sets. After retrieving the top result, the program crafts a prompt for OpenAI's ChatCompletion API, instructing it to answer the user's question using only the information from the selected document. This prompt is key to ensuring the ChatGPT model only uses information from the official documentation, lessening the chance of hallucinations. Finally, the program presents the API-generated response and a link to the source documentation to the user, offering a seamless and user-friendly experience that integrates front-end interaction, Elasticsearch querying, and OpenAI API usage for efficient question-answering. Note that while we are only returning the top-scored document for simplicity, the best practice would be to return multiple documents to provide more context to ChatGPT. The correct answer could be found in more than one documentation page, or if we were generating vectors for the full body text, those larger bodies of text may need to be chunked up and stored across multiple Elasticsearch documents. By leveraging Elasticsearch's ability to search across numerous vector fields in tandem with traditional search methods, you can significantly enhance your top document recall. Technical setup The technical requirements are fairly minimal, but it takes some steps to put all the pieces together. For this example, we will configure the Elasticsearch web crawler to ingest the Elastic documentation and generate vectors for the title on ingest. You can follow along to replicate this setup or use your own data. To follow along we will need: Elasticsearch cluster Eland Python library OpenAI API account Somewhere to run our python frontend and api backend Elastic Cloud setup The steps in this section assume you don’t currently have an Elasticsearch cluster running in Elastic Cloud. If you do you, can skip to the next section. Sign up If you don’t already have an Elasticsearch cluster, you can sign up for a free trial with Elastic Cloud . 
Create deployment After you sign up, you will be prompted to create your first deployment. Create a name for your deployment. You can accept the default cloud provider and region or click Edit Settings and choose another location. Click Create deployment. Shortly a new deployment will be provisioned for you and you will be logged in to Kibana. Back to the Cloud We need to do a couple of things back in the Cloud Console before we move on: Click on the Navigation Icon in the upper left and select Manage this deployment. Add a machine learning node. Back in the Cloud Console, click on Edit under your Deployment’s name in the left navigation bar. Scroll down to the Machine Learning instances box and click +Add Capacity. Under Size per zone, click and select 2 GB RAM. Scroll down and click on Save. In the pop-up summarizing the architecture changes, click Confirm. In a few moments, your deployment will now have the ability to run machine learning models! Reset Elasticsearch Deployment User and password: Click on Security on the left navigation under your deployment’s name. Click on Reset Password and confirm with Reset. (Note: as this is a new cluster nothing should be using this Elastic password.) Download the newly created password for the “elastic” user. (We will use this to load our model from Hugging Face and in our python program.) Copy the Elasticsearch Deployment Cloud ID. Click on your Deployment name to go to the overview page. On the right-hand side click the copy icon to copy your Cloud ID. (Save this for use later to connect to the Deployment.) Eland We next need to load an embedding model into Elasticsearch to generate vectors for our blog titles and later for our user’s search questions. We will be using the all-distilroberta-v1 model trained by SentenceTransformers and hosted on the Hugging Face model hub. This particular model isn’t required for this setup to work. It is good for general use as it was trained on very large data sets covering a wide range of topics. However, with vector search use cases, using a model fine-tuned to your particular data set will usually provide the best relevancy. To do this, we will use the Eland python library created by Elastic. The library provides a wide range of data science functions, but we will be using it as a bridge to load the model into Elasticsearch from the Hugging Face model hub so it can be deployed on machine learning nodes for inference use. Eland can either be run as part of a python script or on the command line. The repo also provides a Docker container for users looking to go that route. Today we will run Eland in a small python notebook , which can run in Google’s Colab in the web browser for free. Open the program link and click the “Open in Colab” button at the top to launch the notebook in colab. Set the variable hf_model_id to the model name. This model is set already in the example code but if you want to use a different model or just for future information: hf_model_id='sentence-transformers/all-distilroberta-v1' Copy model name from Hugging Face. The easiest way to do this is to click the copy icon to the right of the model name. Run the cloud auth section, and you will be prompted to enter: Cloud ID (you can find this in the Elastic Cloud Console) Elasticsearch Username (easiest will be to use the “Elastic” user created when the deployment was created) Elasticsearch User Password Run the remaining steps. This will download the model from Hugging face, chunk it up, and load it into Elasticsearch. 
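The notebook drives eland's Python API under the hood. Roughly, the pattern looks like the sketch below, which follows eland's documented usage; exact signatures can differ between eland versions, so treat this as an outline and check the current eland docs before relying on it. Credentials are placeholders.

```python
# Hedged sketch of an eland model import; verify signatures against your eland version.
from pathlib import Path
from elasticsearch import Elasticsearch
from eland.ml.pytorch import PyTorchModel
from eland.ml.pytorch.transformers import TransformerModel

es = Elasticsearch(cloud_id="...", basic_auth=("elastic", "..."))

hf_model_id = "sentence-transformers/all-distilroberta-v1"

# Download the model from the Hugging Face hub and trace it for Elasticsearch
tm = TransformerModel(model_id=hf_model_id, task_type="text_embedding")
tmp_path = Path("models")
tmp_path.mkdir(parents=True, exist_ok=True)
model_path, config, vocab_path = tm.save(str(tmp_path))

# Chunk the traced model up and upload it to the cluster
ptm = PyTorchModel(es, tm.elasticsearch_model_id())
ptm.import_model(model_path=model_path, config_path=None, vocab_path=vocab_path, config=config)

# Deploy (start) the model on the machine learning node
es.ml.start_trained_model_deployment(model_id=tm.elasticsearch_model_id())
```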
Deploy (start) the model onto the machine learning node. Elasticsearch index and web crawler Next up we will create a new Elasticsearch index to store our Elastic Documentation, configure the web crawler to automatically crawl and index those docs, as well as use an ingest pipeline to generate vectors for the doc titles. Note that you can use your proprietary data for this step, to create a question/answer experience tailored to your domain. Open Kibana from the Cloud Console if you don’t already have it open. In Kibana, Navigate to Enterprise Search -> Overview. Click Create an Elasticsearch Index. Using the Web Crawler as the ingestion method, enter elastic-docs as the index name. Then, click Create Index. Click on the “Pipelines” tab. Click Copy and customize in the Ingest Pipeline Box. Click Add Inference Pipeline in the Machine Learning Inference Pipelines box. Enter the name elastic-docs_title-vector for the New pipeline. Select the trained ML model you loaded in the Eland step above. Select title as the Source field. Click Continue, then click Continue again at the Test stage. Click Create Pipeline at the Review stage. Update mapping for dense_vector field. (Note: with Elasticsearch version 8.8+, this step should be automatic.) In the navigation menu, click on Dev Tools. You may have to click Dismiss on the flyout with documentation if this is your first time opening Dev Tools. In Dev Tools in the Console tab, update the mapping for our dense vector target field with the following code. You simply paste it in the code box and click the little arrow to the right of line 1. You should see the following response on the right half of the screen: This will allow us to run kNN search on the title field vectors later on. Configure web crawler to crawl Elastic Docs site: Click on the navigation menu one more time and click on Enterprise Search -> Overview. Under Content, click on Indices. Click on search-elastic-docs under Available indices. Click on the Manage Domains tab. Click “Add domain.” Enter https://www.elastic.co/guide/en , then click Validate Domain. After the checks run, click Add domain. Then click Crawl rules. Add the following crawl rules one at a time. Start with the bottom and work up. Rules are evaluated according to first match. Disallow Contains release-notes Allow Regex /guide/en/.*/current/.* Disallow Regex .* With all the rules in place, click Crawl at the top of the page. Then, click Crawl all domains on this index. Elasticsearch’s web crawler will now start crawling the documentation site, generating vectors for the title field, and indexing the documents and vectors. The first crawl will take some time to complete. In the meantime, we can set up the OpenAI API credentials and the Python backend. Connecting with OpenAI API To send documents and questions to ChatGPT, we need an OpenAI API account and key. If you don’t already have an account, you can create a free account and you will be given an initial amount of free credits. Go to https://platform.openai.com and click on Signup. You can go through the process to use an email address and password or login with Google or Microsoft. Once your account is created, you will need to create an API key: Click on API Keys . Click Create new secret key. Copy the new key and save it someplace safe as you won’t be able to view the key again. Python backend setup Clone or download the python program Github Link to code Install required python libraries. 
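Stepping back briefly to the mapping update performed in Dev Tools above: expressed through the Python client it would look roughly like the sketch below. The target field name and similarity are assumptions, since the original code block is not reproduced in this text; 768 dimensions matches all-distilroberta-v1.

```python
# Hedged reconstruction of the dense_vector mapping update; field name and similarity assumed.
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="...", basic_auth=("elastic", "..."))

es.indices.put_mapping(
    index="search-elastic-docs",
    properties={
        "title-vector": {                 # assumed target field of the inference pipeline
            "type": "dense_vector",
            "dims": 768,                  # all-distilroberta-v1 produces 768-dim vectors
            "index": True,
            "similarity": "dot_product",
        }
    },
)
```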
We are running the example program in Replit, which has isolated environments. If you are running this on a laptop or VM, best practice is to set up a virtual ENV for python . Run pip install -r requirements.txt Set authentication and connection environment variables (e.g., if running on the command line: export openai_api=”123456abcdefg789”) openai_api - OpenAI API Key cloud_id - Elastic Cloud Deployment ID cloud_user - Elasticsearch Cluster User cloud_pass - Elasticsearch User Password Run the streamlit program. More info about streamlit can be found in its docs . Streamlit has its own command to start: streamlit run elasticdocs_gpt.py This will start a web browser and the url will be printed to the command line. Sample chat responses With everything ingested and the front end up and running, you can start asking questions about the Elastic Documentations. Asking “Show me the API call for an inference processor” now returns an example API call and some information about the configuration settings. Asking for steps to add a new integration to Elastic Agent will return: As mentioned earlier, one of the risks of allowing ChatGPT to answer questions based purely on data it has been trained on is its tendency to hallucinate incorrect answers. One of the goals of this project is to provide ChatGPT with the data containing the correct information and let it craft an answer. So what happens when we give ChatGPT a document that does not contain the correct information? Say, asking it to tell you how to build a boat (which isn’t currently covered by Elastic’s documentation): When ChatGPT is unable to find an answer to the question in the document we provided, it falls back on our prompt instruction simply telling the user it is unable to answer the question. Elasticsearch’s robust retrieval + the power of ChatGPT In this example, we've demonstrated how integrating Elasticsearch's robust search retrieval capabilities with cutting-edge advancements in AI-generated responses from GPT models can elevate the user experience to a whole new level. The individual components can be tailored to suit your specific requirements and adjusted to provide the best results. While we used the Elastic web crawler to ingest public data, you're not limited to this approach. Feel free to experiment with alternative embedding models, especially those fine-tuned for your domain-specific data. You can try all of the capabilities discussed in this blog today! To build your own ElasticDocs GPT experience, sign up for an Elastic trial account , and then look at this sample code repo to get started. If you would like ideas to experiment with search relevance, here are two to try out: [BLOG] Deploy NLP text embeddings and vector search using Elasticsearch [BLOG] Implement image similarity search with Elastic In this blog post, we may have used third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. 
Elastic, Elasticsearch and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Jump to What is ChatGPT? Limitations of ChatGPT & how to minimize them Elasticsearch — you know, for search! How to use ChatGPT with Elasticsearch Technical setup Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ChatGPT and Elasticsearch: OpenAI meets private data - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/chatgpt-elasticsearch-openai-meets-private-data","meta_description":"Integrate Elasticsearch's search relevance with ChatGPT's question-answering capability to enhance your domain-specific knowledge base."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch retrievers architecture and use-cases Elasticsearch retrievers have gone through a significant revamp and are now generally available for all to use. Learn all about their c and use-cases. Search Relevance PB By: Panagiotis Bailis On November 14, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this blog post we'll take another deep dive with retrievers. We've already talked about them in previous blogs from their very introduction to semantic reranking using retrievers . 
Now, we're happy to announce that retrievers are becoming generally available with Elasticsearch 8.16.0, and in this blog post we'll take a technical tour on how we implemented them, as well as we'll get the chance to discuss the newly available capabilities! Elasticsearch retriever The main concept of a retriever remains the same as with their initial release; retrievers is a framework that provides the basic building blocks that can be stacked hierarchically to build multi-stage complex retrieval and ranking pipelines. E.g. of a simple standard retriever, which just bring backs all documents: Pretty straightforward, right? In addition to the standard retriever, which is essentially just a wrapper around the standard query search API element, we also support the following types: knn - return the top documents from a kNN (k Nearest Neighbor) search rrf - combine results from different retrievers based on the RRF (Reciprocal Rank Fusion) ranking formula text_similarity_reranker - rerank the top results of a nested retriever using a rerank type inference endpoint More detailed information along with the specific parameters for each retriever can also be found in the Elasticsearch documentation . Let's briefly go through some of the technical details first, which will help us understand the architecture and what has changed and why all these previous limitations have now been lifted! Technical drill down of retrievers One of the most important (and requested) things that we wanted to address was the ability to use any retriever, at any nesting level. Whether this means having 2 or more text_similarity_reranker stacked together, or an rrf retriever operating on top of another rrf along with a text_similarity_reranker , or any combination and nesting you can think of, we wanted to make sure that this would be something one could express with retrievers! To account for this, we have introduced some significant changes to the retriever execution plan. Up until now, retrievers were evaluated as part of the standard search execution flow, where (in a simplified scenario for illustration purposes) we reach out to the shards twice: once for querying the shards and bringing back from + size documents from each shard, and once for fetching all field data and perform any additional operations (e.g. highlighting) for the true top [from, from+size] results. This is a nice linear execution flow that is (relatively) easy to follow, but introduces some significant limitations if we want to execute multiple queries, operate on different results sets, etc. In order to work around this, we have moved to an eager evaluation of all sub-retrievers of a retriever pipeline at the very early stages of query execution. This means that, if needed, we are recursively rewriting any retriever query to a simpler form, the specifics of which depend on the retriever type. For non-compound retrievers we rewrite similar to how we do in a standard query, as they could still follow the linear execution plan. For compound retrievers, i.e. for retrievers that operate on top of other retriever(s), we flatten them to a single rank_window_size result set, which is essentially a tuple list that represents the top ranked documents for this retriever. 
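Since the article's own sample requests are not included in this text, here are hedged sketches of the retriever structures described above: a standard retriever that just brings back all documents, and an rrf retriever combining a lexical and a kNN sub-retriever. Index, field and vector values are placeholders; recent elasticsearch-py versions accept a retriever argument on search(), otherwise the same JSON can be sent in the request body.

```python
# Hedged retriever sketches; names and vectors are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")

# A simple standard retriever that just brings back all documents
es.search(index="my-index", retriever={"standard": {"query": {"match_all": {}}}})

# An rrf retriever combining a lexical and a kNN sub-retriever
es.search(
    index="my-index",
    retriever={
        "rrf": {
            "retrievers": [
                {"standard": {"query": {"match": {"title": "elasticsearch retrievers"}}}},
                {
                    "knn": {
                        "field": "title_vector",
                        "query_vector": [0.1, 0.2, 0.3],   # placeholder vector
                        "k": 10,
                        "num_candidates": 100,
                    }
                },
            ],
            "rank_window_size": 10,
        }
    },
)
```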
Let's see what this actually looks like, by working through the following (rather complex) retriever request: The rrf retriever above is a compound one, as it operates on the results of some other retrievers, so we'll try to rewrite it to a simpler, flattened, list of tuples, where each tuple specifies a document and the shard that it was found on. This rewrite will also enforce a strict ranking, so no different sort options are currently supported. Let's proceed now to identify all components and describe the process of how this will be evaluated: [1] top level rrf retriever; this is the parent of all sub-retrievers which will be rewritten and evaluated last, as we'd first need to know the top 10 (based on rank_window_size ) results from each of its sub-retrievers. [2] This knn retriever is the first child of the top level rrf retriever and uses an embedding service ( my-text-embedding-model ) to compute the actual query vector that will be used. This will be rewritten as the usual knn query by making an async request to the embedding service to compute the vector for the given model_text . [3] A standard retriever that is also part of the top-level's rrf retriever's children, which returns all documents matching topic: science query. [4] Last child of the top-level rrf retriever which is also an rrf retrievers that needs to be flattened. [5] [6] similar to [2] and [3], these are retrievers that are direct children of an rrf retriever, for which we will fetch the top 100 results (based on the rrf retriever's rank_window_size [4]) for each one, combine them using the rrf formula, and then rewrite to a flattened list of the true top 100 results. The updated execution flow for retrievers is now as follows: We'll start by rewriting all leaves that we can. This means that we'll rewrite the knn retrievers [2] and [6] to compute the query vector, and once we have that we can move up one level in the tree. At the next rewrite step, we are now ready to evaluate the nested rrf retriever [4], which we will eventually rewrite to a flattened RankDocsQuery query (i.e. a list of tuples). Finally, all inner rewritten steps for the top-level rrf retriever [1] will have taken place, so we should be ready to combine and rank the true top 10 results as requested. Even this top-level rrf retriever will rewrite itself to a flattened RankDocsQuery which will be later used to proceed with the standard linear search execution flow. Visualizing all the above, we have: Looking at the example above, we can see how a hierarchical retriever tree is asynchronously rewritten to just a simple RankDocsQuery . This simplification gives us the nice (and desired!) side effect of eventually executing a normal request with explicit ranking, and in addition to that we can also perform any complementary operations we choose. Playing with the (golden) retrievers! As we briefly mentioned above, with the rework in place, we can now support a plethora of additional search features! In this section we'll go through some examples and use-cases, but more can also be found in the documentation . We'll start with the most coveted one which is composability, i.e. the option to have any retriever at any level of the retriever tree. Composability In the following example, we want to perform a semantic query (using an embedding service like ELSER ), and then merge those results along with a knn query, using rrf . Finally, we'd want to rerank those using the text_similarity_reranker retriever using a reranker. 
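A hedged sketch of that kind of composed request is shown below: a text_similarity_reranker sitting on top of an rrf retriever that merges a semantic query with a kNN query. Field names, the inference endpoint ID and the query vector are placeholders rather than the article's exact example.

```python
# Hedged sketch of a composed retriever; names, endpoint ID and vector are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")

es.search(
    index="my-index",
    retriever={
        "text_similarity_reranker": {
            "retriever": {
                "rrf": {
                    "retrievers": [
                        # semantic retrieval, e.g. backed by an ELSER-style endpoint
                        {"standard": {"query": {"semantic": {"field": "semantic_content", "query": "how do retrievers work?"}}}},
                        # dense vector retrieval
                        {"knn": {"field": "content_vector", "query_vector": [0.1, 0.2, 0.3], "k": 20, "num_candidates": 100}},
                    ],
                    "rank_window_size": 50,
                }
            },
            "field": "content",
            "inference_id": "my-rerank-endpoint",      # a rerank-type inference endpoint
            "inference_text": "how do retrievers work?",
            "rank_window_size": 10,
        }
    },
)
```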
The retriever to express the above would look like this: Aggregations Recall that with the rework we discussed, we rewrite a compound retriever to just a RankDocsQuery (i.e. a flattened explicitly ranked result list). This however does not block us from computing aggregations, as we also keep track of the source queries that were part of a compound retriever. This means that we can fallback to the nested standard retrievers below, to properly compute aggregations for the topic field, based on the union of the results of the two nested retrievers. So in the example above, we'll compute a term aggregation for the topic field, where either the year field is greater than 2023, or the document has the topic elastic associated with it. Collapsing In addition to the aggregation option we discussed above, we can now also collapse results, as we'd do with a standard query request. In the following example, we compute the top 10 results of the rrf retriever, and then collapse them under the year field. The main difference with standard searches is that here we're collapsing just the top rank_window_size results, and not the ones within the nested retrievers. Pagination As is also specified in the docs compound retrievers also support pagination. There is a significant difference with standard queries where, similarly to collapse above, the rank_window_size parameter is the whole result set upon which we can perform navigation. This means that if from + size > rank_window_size then we would bring no results back (but we'd still return aggregations). In the example above, we would compute the top 10 results (as defined in rrf's rank_window_size ) from the combination of the two nested retrievers ( standard and knn ) and then we'd perform pagination by consulting the from and size parameters. So, in this case, we'd skip the first 2 results ( from ) and pick the next 2 ( size ). Consider now a different scenario, where, in the same query above, we would instead have from: 10 and size: 2 . Given that rank_window_size is 10, and that these would be all the results that we can paginate upon, requesting to get 2 results after skipping the first 10 would fall outside of the navigatable result set, so we'd get back empty results. Additional examples and a more detailed break-down can also be found in the documentation for the rrf retriever . Explain We know that with great power comes great responsibility. Given that we can now combine retrievers in arbitrary ways, it could be rather difficult to understand why a result was eventually returned first, and how to optimize our retrieval strategy. For this very specific reason, we have worked to ensure that the explain output of a retriever request (i.e. by specifying explain: true ) will convey all necessary information from all sub-retrievers, so that we can have a proper understanding of all the factors that contributed to the final ranking of a result. Taking the rather complex query in the Collapsing section, the explain for the first result looks like this: Still a bit verbose, but it conveys all necessary information on why a document is at a specific position. For the top-level rrf retriever, we have 2 details specified, one for each of its nested retrievers. The first one is a text_similarity_reranker retriever, where we can see on explain the weight for the rerank operation, and the second one is a knn query informing us of the doc's computed similarity with the query vector. 
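The collapsing and pagination behavior described above can be sketched as follows; index, field names and the query vector are placeholders. In both cases the rrf retriever first computes its top rank_window_size results, and collapse or from/size are then applied to that window only.

```python
# Hedged sketches of collapse and pagination on a compound retriever; names are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")

rrf_retriever = {
    "rrf": {
        "retrievers": [
            {"standard": {"query": {"match": {"topic": "science"}}}},
            {"knn": {"field": "content_vector", "query_vector": [0.1, 0.2, 0.3], "k": 10, "num_candidates": 100}},
        ],
        "rank_window_size": 10,   # the full navigable/collapsible result set
    }
}

# Collapse the top 10 rrf results under the year field
es.search(index="my-index", retriever=rrf_retriever, collapse={"field": "year"})

# Paginate within the same 10-result window: skip the first 2 results, take the next 2.
# With from=10 and size=2 this would return no hits, since from + size > rank_window_size.
es.search(index="my-index", retriever=rrf_retriever, from_=2, size=2)
```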
It might take a bit to familiarize with, but each retriever ensures to output all the information you might need to evaluate and optimize your search scenario! Conclusion That's all for now! We hope you stayed with us until now and you enjoyed this topic! We're really excited with the release of the retriever framework and all the new use-cases that we can now support! Retrievers were built in order to support from very simple searches, to advanced RAG and hybrid search scenarios! As mentioned above, watch this space and more will be available soon! Report an issue Related content Search Relevance May 28, 2025 Hybrid search revisited: introducing the linear retriever! Discover how the linear retriever enhances hybrid search by leveraging weighted scores and MinMax normalization for more precise and consistent rankings. Learn how to configure this new tool for optimized search pipelines and improve your results today. PB By: Panagiotis Bailis Search Relevance May 26, 2025 Creating Judgement Lists with Quepid Creating judgement lists in Quepid with a collaborative human rater process. DW By: Daniel Wrigley Search Relevance May 20, 2025 Cracking the code on search quality: The role of judgment lists Explore why a judgment list is essential, the different types of judgments, and the key factors that define search quality. DW By: Daniel Wrigley Search Relevance April 11, 2025 Enhancing relevance with sparse vectors Learn how to use sparse vectors in Elasticsearch to boost relevance and personalize search results with minimal complexity. VB By: Vincent Bosc Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Jump to Elasticsearch retriever Technical drill down of retrievers Playing with the (golden) retrievers! Composability Aggregations Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch retrievers architecture and use-cases - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-retrievers-ga-8.16.0","meta_description":"Elasticsearch retrievers are GA with 8.16.0. Learn all about their c, use-cases and how to implement them, including the rrf retriever."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elastic Jira connector tutorial part I Reviewing a use case for the Elastic Jira connector. We'll be indexing our Jira content into Elasticsearch to create a unified data source and do search with Document Level Security. 
Integrations Ingestion How To GL By: Gustavo Llermaly On January 15, 2025 Part of Series Jira connector tutorials Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In this article, we'll review a use case for the Elastic Jira native connector . We'll use a mock project where a bank is developing a money transfer app and needs to integrate the information in Jira into Elastic. The native connector allows us to get into our Elastic cluster information from tickets, tasks, and other documents, centralizing data and enabling advanced search features. The main benefits of using this connector are: Data from Jira is synchronized with Elasticsearch. Access to advanced search features. Document Level Security (DLS) matching source security. You can only search for what you're allowed to see in Jira. Steps Configuring Jira connector Indexing documents into Elasticsearch Querying data Document Level Security (DLS) Configuring Jira connector You'll first need to get an API token from Jira to connect to Elasticsearch. Go to this link to learn how to create it. Name it \"elastic-connector.\" It should look like this: Get the token and to your Kibana dashboard. Then, go to native connectors and select New Jira Cloud connector. Replace YOUR_KIBANA_URL with Kibana endpoint . Name the connector “bank” and click “Create and attach an index named bank” to create a new index with the same name. Done! Now we need to configure our Jira data. We'll keep \"Enable SSL\" off since we won't be using our own SSL certificates. You can see the details of each field in the official documentation . Activate Document Level Security (DLS) so you get your documents with the users and groups authorized to see them. Once the connector is correctly configured, you can continue to synchronize data as you can see below. It might take a couple of minutes to get the data from Jira. Full Content: indexes all Jira documents. Incremental Content: only indexes changes from the last Full Content Sync. Access Control: indexes Jira users in the security index to activate DLS. We can check the connector's Overview to see if the sync was successful. In the Documents tab, we can see exactly what data we got with the connector. The objects from this first sync are: Projects Issues Attachments Indexing documents into Elasticsearch We are not limited to searching across the connector documents. Elasticsearch allows you to search on many indices with a single query. For our example, we'll index additional documents into the galactic_documents index to see how search works with more than one datasource: Compliance Manual of the GBFF User Guide for the Galactic Banking App Technical Specifications Report But before indexing, we'll create optimized mappings for each field: With the mappings configured, we can now index: Querying data Now that we have both Jira objects and documents, we can search for them together. Querying \"galactic moon\" will get us both Jira objects and the documents we indexed: If a document is too long, you can add the option _source to the query to only include the fields that you need. If you just want to remove some fields, we'll cover that option in the second part of this series. 
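A minimal sketch of that multi-index query with `_source` filtering, using the Python client: it searches the connector index (`bank`) and the manually indexed `galactic_documents` index in one request and returns only the title of each hit. The searched field names are assumptions.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

# One request over both indices, trimmed down via _source filtering.
resp = es.search(
    index="bank,galactic_documents",
    query={
        "multi_match": {
            "query": "galactic moon",
            "fields": ["title", "summary", "body"],  # assumed field names
        }
    },
    source=["title"],  # only return the title of each hit
    size=10,
)

for hit in resp["hits"]["hits"]:
    print(hit["_index"], hit["_source"].get("title"))
```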
Document Level Security (DLS) We will now configure Document Level Security (DLS) to match Jira permissions to the ones in Elasticsearch so that when users search, they can only see what they are allowed to see in Jira. To begin, we'll go to the connector's Control Panel in Elastic Cloud and click on Access Control Sync. This sync will bring the access and permission info from the Jira users. To test this, I've made another Jira board to which the user \"Gustavo\" does not have access. Note: Do not forget to run content sync after creating the board.You can run one time syncs , or schedule based . Let's begin checking that the documents from the new board are there: We can effectively see the issues: However, since the user \"Gustavo\" does not have access, he should not be able to see them. Let's look for the user's document in the ACL filter index to see their permissions. Response: This index includes the user id and all of their Jira groups. We need to make a match between the content in the user's access control and the field _allowed_access_control in each document. We'll create an API Key for Gustavo using the command below. You must copy the query.template value from the previous step: Note that we're only giving access to the indices in this article through this option. The response for the creation of the API Key for Gustavo is this: You can use curl to test that we can run searches using the API KEY and it won't bring info from the Marketing board, since Gustavo does not have access to it. Response: We can see that Gustavo did not get any info since he did not have access. Now, let's test with the documents from the board that he is allowed to see: Response: Conclusion As you can see, integrating Elasticsearch with Jira has many benefits, like being able to get a unified search on all the projects you're working on as well as being able to run more advanced searches in more than one data source. The added DLS is a quick and easy way to guarantee that users will maintain the access they already had in the original sources. Check out the second part of this tutorial , where we'll review best practices and advanced configurations to escalate the connector. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Jump to Steps Configuring Jira connector Indexing documents into Elasticsearch Querying data Document Level Security (DLS) Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Jira connector tutorial part I - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-jira-connector-tutorial","meta_description":"Learn how to integrate Elasticsearch with Jira using the Elastic Jira connector and implement DLS through a practical use case."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Finding your best music friend with vectors: Spotify Wrapped, part 5 Understanding vectors has never been easier. Handcrafting vectors and figuring out various techniques to find your music friend in a heavily biased dataset. How To PK VB By: Philipp Kahr and Vincent Bosc On April 10, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part we talked about how to retrieve your Spotify data and visualize it. In the second part we talked about how to process the data and how to visualize it. In the third part we explored anomaly detection and how it helps us find interesting listening behavior. The fourth part uncovered relationships between the artists by using Kibana Graph. In this part, we talk about how to use vectors to find your music friend. Discover your musical friends with vectors A vector is a mathematical entity that has both magnitude (size) and direction. In this context, vectors are used to represent data, such as the number of songs listened to by a user for each artist. The magnitude corresponds to the count of songs played for an artist, while the direction is determined by the relative proportions of the counts for all artists within the vector. Although the direction is not explicitly set or visualized, it is implicitly defined by the values in the vector and their relationships to one another. The idea is to simply create a huge array where we do a key => value sorting approach. The key is the artist and the value is the count of listened to songs. This is a very simple approach and can be done with a few lines of code. We create this vector: Which is super interesting because it is now sorted by the artist name. This gives us zero values for all artists we didn't listen to, or which the user didn't even know existed. 
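A minimal sketch of how such a key => value vector could be assembled; toy listening events stand in for the real Spotify history, and the user and artist names are only illustrative.

```python
from collections import Counter

# Toy listening events: (user, artist) pairs standing in for the real Spotify history.
events = [
    ("philipp", "Fred Again.."), ("philipp", "Ariana Grande"), ("philipp", "Fred Again.."),
    ("karolina", "Dua Lipa"), ("karolina", "Ariana Grande"),
]

# Count plays per user and collect the full, alphabetically sorted artist space.
counts = {}
for user, artist in events:
    counts.setdefault(user, Counter())[artist] += 1
all_artists = sorted({artist for _, artist in events})

# One vector per user: a slot for every known artist (0 if never played), ordered by name.
vectors = {
    user: [user_counts.get(artist, 0) for artist in all_artists]
    for user, user_counts in counts.items()
}

print(all_artists)         # ['Ariana Grande', 'Dua Lipa', 'Fred Again..']
print(vectors["philipp"])  # [1, 0, 2]
```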
Finding your musical match then becomes a straightforward task of calculating the distance between two vectors and identifying the closest match. Several methods can be used for this, such as the dot product , euclidean distance , and cosine similarity . Each method behaves differently and may yield varying results. It is important to experiment and determine which approach best suits your needs. How does cosine similarity, Euclidian distance and dot product work? We will not delve into the mathematical details of each method, but we will provide a brief overview of how they work. To simplify, let’s break this down into just two dimensions: Ariana Grande and Taylor Swift. User A listens to 100 songs by Taylor Swift, user B listens to 300 songs by Ariana Grande, and user C falls in the middle, listening to 100 songs by Taylor Swift and 100 songs by Ariana Grande. The Cosine Similarity is better the smaller the angle and focuses on the direction of the vectors, ignoring their magnitude. In our case, user C will match with user A and user B equally because the angle between their vectors is the same (both are 45^\\circ). The Euclidian distance measures the direct distance between two points, with shorter distances indicating higher similarity. This method is sensitive to both direction and magnitude. In our case, user C is closer to user A than to user B because the difference in their positions results in a shorter distance. The dot product calculates similarity by summing the products of the corresponding entries of two vectors. This method is sensitive to both magnitude and alignment. For example, user A and user B result in a dot product of 0 because they have no overlap in preferences. User C matches more strongly with user B (300 × 100 = 30,000) than with user A (100 × 100 = 10,000) due to the larger magnitude of user B’s vector. This highlights the dot product’s sensitivity to scale, which can skew results when magnitudes differ significantly. In our specific use case, the magnitude of the vectors should not significantly impact the similarity results. This highlights the importance of applying normalization (more on that later) before using methods like Euclidean distance or dot product to ensure that comparisons are not skewed by differences in scale. Data distribution The distribution of our dataset is a crucial factor, as it will play a significant role later when we work on finding your best musical match. User Count of records Unique Artists Unique Titles Responsible for % of dataset philipp 202907 14183 24570 35% elisheva 140906 9872 23770 24% stefanie 70373 2647 5471 12% emil 53568 5663 14227 9% karolina 41232 7988 12427 7% iulia 39598 5114 8976 6% chris 23598 6124 8654 4% Summary: 7 572182 35473 77942 100% More details about the diversity of the dataset are discussed in the subheading Dataset Issues within the dense_vector section. The primary issue lies in the distribution of listened-to artists for each user. Each color represents a different user, and we can observe various listening styles: some users listen to a wide range of artists evenly distributed, while others focus on just a handful, a single artist, or a small group. These variations highlight the importance of considering user listening patterns when creating your vector. Using dense_vector type First of all, we created the vector above already, now we can store that in a field of dense_vector . We will auto create the dimensions we need in the Python code, based on our vector length. 
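A sketch of what that step looks like with the Python client: create an index whose `dense_vector` dimension is derived from the vector length, then index one document per user. The index name is an assumption, and the toy vector keeps the example short; with the real artist space this is exactly the call that produces the dimension error shown next.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

artists_vector = [1, 0, 2]  # a per-user vector as built above (toy length for illustration)

# Create the index with a dense_vector whose dimension matches the vector length.
es.indices.create(
    index="spotify-dense",  # assumed index name
    mappings={
        "properties": {
            "user": {"type": "keyword"},
            "artists": {
                "type": "dense_vector",
                "dims": len(artists_vector),  # with the real data this exceeds the 4096 limit
            },
        }
    },
)

es.index(index="spotify-dense", document={"user": "philipp", "artists": artists_vector})
```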
Whoops that errored in our case with this message: Error: BadRequestError(400, 'mapper_parsing_exception', 'The number of dimensions should be in the range [1, 4096] but was [33806]') : Ok, so that means our vector artists is too large. It is 33806 items long. Now, that is interesting, and we need to find a way to reduce that. This number 33806 represents the cardinality of artists. Cardinality is another term for uniqueness. It is the number of unique values in a dataset. In our case, it is the number of unique artists across all users. One of the easiest ways is to rework the vector. Let's focus on the top 1000 commonly used artists. This will reduce the vector size to exactly 1000. We can always increase it to 4096 and see if there is something else going on then. This method of aggregation gives us the top 1000 artists per user. However, this can lead to issues. For instance, if there are 7 users and none of the top 1000 artists overlap, we end up with a vector dimension of 7000. When testing this approach, we encountered the following error: Error: BadRequestError(400, 'mapper_parsing_exception', 'The number of dimensions should be in the range [1, 4096] but was [4456]') . This indicates that our vector dimensions were too large. To resolve this, there are several options. One straightforward approach is to reduce the top 1000 artists to 950, 900, 800, and so on until we fit within the 4096 dimension limit. Reducing the top n artists per user to fit within the 4,096-dimension limit may temporarily resolve the issue, but the risk of exceeding the limit will resurface each time new users are added, as their unique top artists will increase the overall vector dimensions. This makes it an unsustainable long-term solution for scaling the system. We already sense that we will need to find a different solution. Dataset issues We adjusted the aggregation method by switching from calculating the top 1000 artists per user to calculating the overall top 1000 artists and then splitting the results by user. This ensures the vector is exactly 1000 artists long. However, this adjustment does not address a significant issue in our dataset: it is heavily biased toward certain artists, and a single user can disproportionately influence the results. As shown earlier, Philipp contributes roughly 35% of all data, heavily skewing the results. This could result in a situation where smaller contributors, like Chris, have their top artists excluded from the top 1000 terms or even the 4096 terms in a larger vector. Additionally, outliers like Stefanie, who might listen repeatedly to a single artist, can further distort the results. To illustrate this, we converted the JSON response into a table for better readability. Artist Total Count User Casper 15100 14924 stefanie 170 philipp 4 emil 2 chris Taylor Swift 12961 9557 elisheva 2240 stefanie 664 iulia 409 philipp 53 karolina 23 chris 15 emil Ariana Grande 7247 3508 philipp 1873 elisheva 1525 iulia 210 stefanie 107 karolina 24 chris K.I.Z 6683 6653 stefanie 23 philipp 7 emil It is immediately apparent that there is an issue with the dataset. For example, Casper and K.I.Z, both German artists, appear in the top 5, but Casper is overwhelmingly influenced by Stefanie, who accounts for approximately 99% of all tracks listened to for this artist. This heavy bias places Casper at the top spot, even though it might not be representative across the dataset. 
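The per-artist, per-user breakdown shown in the table above can be produced with a nested terms aggregation along these lines. The index name and the `artist`/`user` field names are assumptions, and the fields are assumed to be mapped as keywords.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

resp = es.search(
    index="spotify-history",  # assumed raw listening-history index
    size=0,
    aggregations={
        "top_artists": {
            "terms": {"field": "artist", "size": 1000},  # overall top artists...
            "aggs": {
                "by_user": {"terms": {"field": "user", "size": 10}}  # ...split per user
            },
        }
    },
)

for bucket in resp["aggregations"]["top_artists"]["buckets"]:
    per_user = {b["key"]: b["doc_count"] for b in bucket["by_user"]["buckets"]}
    print(bucket["key"], bucket["doc_count"], per_user)
```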
To address this issue while still using the 4096 artists in a dense vector, we can apply some data manipulation techniques. For instance, we could consider using the diversified_sampler or methods like softmax to calculate the relative importance of each artist. However, if we aim to avoid heavy data manipulation, we can take a different approach by using a sparse_vector instead. Using a sparse_vector type We tried squeezing our vector where each position represented an artist into a dense_vector field, however it's not the best fit as you can tell. We are limited to 4096 artists and we end up with a large array that has a lot of null values. Philipp might never listen to Pink Floyd yet in the dense vector approach, Pink Floyd will take up one position with a 0. Essentially, we were using a dense vector format for data that is inherently sparse. Fortunately, Elasticsearch supports sparse vector representation through the sparse_vector type. Let’s explore how it works! Instead of creating one large array, we will create a key => value pair and store the artists name next to the listened count. This is a much more efficient way of storing the data and will allow us to store a higher cardinality. There is no real limit to how many key value pairs you can have inside the sparse_vector. At some point the performance will degrade, but that is a discussion for another day. Any null pairs will simply be skipped. What does a search look like? We take the entire content of artists and put that inside the query_vector and we use the sparse_vector query type and only retrieve the user and the score. Normalization Using a sparse_vector allows us to store data more efficiently and handle higher cardinality without hitting the dimension limit. The tradeoff, however, is that we are limited to using the dot product for similarity calculations, which means we cannot directly use methods such as cosine similarity or Euclidean distance. As we saw earlier, the dot product is heavily influenced by the amplitude of vectors. To minimize or avoid this effect, we will first need to normalize our data. We provide the full sparse vector to identify our “music best friend.” This straightforward approach has yielded some interesting results, as shown here. However, we are still encountering a similar issue as before: the influence of vector magnitudes. While the impact is less severe compared to the dense_vector approach, the distribution of the dataset still creates imbalances. For example, Philipp might match disproportionately with many users simply due to the vast number of artists he listens to. This raises an important question: does it matter if you listen to an artist 100, 500, 10,000, or 25,000 times? The answer is no—it’s the relative distribution of preferences that matters. To address this, we can normalize the data using a normalizing function like Softmax, which transforms raw values into probabilities. It exponentiates each value and divides it by the sum of the exponentials of all values, ensuring that all outputs are scaled between 0 and 1 and sum to 1. You can normalize directly in Elasticsearch using the normalize aggregation or programmatically in Python using Numpy . With this normalization step, each user is represented by a single document containing a list of artists and their normalized values. The resulting document in Elasticsearch looks like this: Finding your music match is rather easy. We take the entire document for the user Philipp since we want to match him against everyone else. 
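A sketch of that flow with the Python client: fetch the user's document, then feed its artist weights back in as the `query_vector` of a `sparse_vector` query, returning only the user and the score. The index name, document id, and field name are assumptions.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

# Fetch Philipp's document and reuse its normalized artist weights as the query vector.
philipp = es.get(index="spotify-sparse", id="philipp")["_source"]  # assumed index and doc id

resp = es.search(
    index="spotify-sparse",
    query={
        "sparse_vector": {
            "field": "artists",                  # the sparse_vector field
            "query_vector": philipp["artists"],  # {"artist name": normalized weight, ...}
        }
    },
    source=["user"],  # we only need the user and the score
)

for hit in resp["hits"]["hits"]:
    print(hit["_source"]["user"], hit["_score"])
```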
The search looks like this: The response is in JSON and contains the score and the user; we altered it to a table for better readability, then the score is multiplied by 1.000 to remove leading zeros. User Score philipp 0.36043773 karolina 0.050112092 stefanie 0.04934514 iulia 0.048445952 chris 0.039548675 elisheva 0.037409707 emil 0.036741032 On an untuned and out of the box softmax we see that Philipp's best friend is Karolina with a score of 0.050... followed relatively closely by Stefanie with 0.049... . Emil is furthest away from Philipp's taste. After comparing the data for Karolina and Philipp ( using the dashboard from the second blog ), this seems a bit odd. Let's explore how the score is calculated. The issue is that in untuned softmax, the top artist can get a value near 1 and the second artist is already on 0.001..., which emphasises your top artist even more. This is important because the dot product calculation used to identify your closest match works like this: When we calculate the dot product we do 1 * 0.33 = 0.33 , which boosts my compatibility with Karolina a lot. When Philipp is not matching on the top artist of anyone else with a higher value than 0.33, Karolina is my best friend, even though we might have barely anything else in common. To illustrate this here is a table of our top 5 artists, side by side. The number represents the spot in the top artists. Artist Karolina Philipp Fred Again .. 1 Ariana Grande 2 Harry Styles 3 Too Many Zooz 4 Kraftklub 5 Dua Lipa 1 15 David Guetta 2 126 Calvin Harris 3 32 Jax Jones 4 378 Ed Sheeran 5 119 We can observe that Philipp overlaps with Karolina's top 5 artists. Even though they range from place 15, 32, 119, 126, 378 for Philipp, any value that Karolina has is multiplied by Philipp's ranking. In this case, the order of Karolina's top artists weighs more than Philipp's. There are a few ways to fix softmax by adjusting temperature and smoothness . Just trialing out some numbers for temperature and smoothness, we end up with this result (score multiplied by 1.000 to remove leading zeros). A higher temperature describes how sharply softmax assigns the probabilities, this distributes the data more evenly, whilst a lower temperature emphasises a few dominant values, with a sharp decline. User Score philipp 3.59 stefanie 0.50 iulia 0.484 karolina 0.481 chris 0.395 elisheva 0.374 emil 0.367 Adding the temperature and smoothness altered the result. Instead of Karolina being Philipp's best match, it moved to Stefanie. It's interesting to see how adjusting the method of calculating the importance of an artist heavily impacts the search. There are many other options available for building the values for the artists. We can look at the total percentage of an artist represented in a dataset per user. This could lead to better distribution of values than softmax and ensure that the dot product, like described above with Karolina and Philipp for Dua Lipa, wouldn't be that significant anymore. One other option would be to take the total listening time into consideration and not just the count of songs, or their percentage. This would help with artists that publish longer songs that are above ~5-6 minutes. One Fred Again.. a song might be around 2:30 and that would allow Philipp to listen to twice as many songs as someone else. The listened_to_ms is in milliseconds and we end up with a similar discussion around, if a sum() is the correct approach, similar to count of songs played. 
It is an absolute number, where the higher it gets, the less importance the higher number should get accounted for. We could account for listening completion, there is a listened_to_pct and we could pre-filter the data to only songs that our users finish to at least 80%. Why bother with songs that are skipped in the first few seconds or minutes? The listening percentage punishes people that listen to a lot of songs from random artists using the daily recommended playlists, whilst it emphasises those who like to listen to the same artists over and over again. There are many many opportunities to tweak and alter the dataset to get the best results. All of them take time and have different drawbacks. Conclusion In this blog we took you with us on our journey to identify your music friend. We started off with a limited know-how of Elasticsearch and thought that dense vectors are the answers, and that lead to looking into our dataset and diverting to sparse vectors. Along the way we looked into a few optimisations on the search quality and how to reduce any sort of bias. And then we figured out a way that works best for us and that is the sparse vector with the percentages. Sparse vectors are what powers ELSER as well; instead of artists, it is words. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Discover your musical friends with vectors How does cosine similarity, Euclidian distance and dot product work? Data distribution Using dense_vector type Dataset issues Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Finding your best music friend with vectors: Spotify Wrapped, part 5 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vectors-spotify-wrapped-part-05","meta_description":"Understanding vectors has never been easier. Learn how to use use vectors to through a practical example."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog When hybrid search truly shines Demonstrating when hybrid search is better than lexical or semantic search on their own. Generative AI Vector Database How To GL By: Gustavo Llermaly On January 1, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, we are going to explore hybrid search by examples, and show when it truly shines against using lexical or semantic search techniques alone. What is hybrid search? Hybrid search is a technique that combines different search approaches such as traditional lexical term matching, and semantic search . Lexical search is good when users know the exact words. This approach will find the relevant documents, and sort them in a way that makes sense by using TF-IDF which means: The more common across the dataset the term you are searching is, the less it contributes to the score; and the more common it is within a certain document the more it contributes to the score. But, what if the words in the query are not present in the documents? Sometimes the user is not looking for something in concrete, but for a concept . They may not be looking for a specific restaurant, but for \"a nice place to eat with family\". For this kind of queries, semantic search is useful because it takes into consideration the context of the search query and brings similar documents. You can expect to get more related documents back than with the previous approach, but in return, this approach struggles with precision, especially with numbers. Hybrid search gives us the best of both worlds by blending the precision of term-matching together with the context-aware matching of semantic search. You can read a deep dive on hybrid search on this article , and more about lexical and semantic search differences in this one . Let's create an example using real estate units. The query will be: quiet home in Pinewood with 2 rooms , with quiet place being the semantic component of the query while Pinewood with 2 rooms will be the textual or lexical portion. Configuring ELSER We are going to use ELSER as our model provider. Start by creating the inference endpoint: If this is your first time using ELSER, you may encounter a 502 Bad Gateway error as the model loads in the background.You can check the status of the model in Machine Learning > Trained Models in Kibana.Once is deployed, you can proceed to the next step. Configuring index For the index, we are going to use text fields, and semantic_text for the semantic field. We are going to copy the descriptions, because we want to use them for both match and semantic queries. Indexing data Querying data Let's start by the classic match query, that will search by the content of the title and description: This is the first result: It is not bad. It managed to capture the neighborhood Pinewood , and also the 2 bedroom requirement, however, this is not a quiet place at all. 
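For reference, the lexical query described above might look something like the sketch below, matching against the title and description fields with the Python client. The index name is an assumption, and the real article may structure the request differently.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="...")  # hypothetical connection

# Lexical leg of the search: plain term matching on title and description.
resp = es.search(
    index="homes",  # assumed index name
    query={
        "multi_match": {
            "query": "quiet home in Pinewood with 2 rooms",
            "fields": ["title", "description"],
        }
    },
    size=3,
)

for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```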
Now, a pure semantic query: This is the first result: Now the results considered the quiet home piece by relating it to things like \"secluded and private\", but this one is a 3 bedroom and we are looking for 2. Let's run a hybrid search now. We will use RRF (Reciprocal rank fusion) to achieve this purpose and combine the two previous queries. The RRF algorithm will blend the scores of both queries for us. This is the first result: Now the results considered both being a quiet place, but also having 2 bedrooms. Evaluating results For the evaluation, we are going to use the Ranking Evaluation API which allows us to automate the process of running queries and then checking the position of the relevant results. You can choose between different evaluation metrics. For this example I will pick Mean reciprocal ranking (MRR) which takes into consideration the result position and reduces the score as the position gets lower by 1/position#. For this scenario, we are going to test our 3 queries ( multi_match , semantic , hybrid ) against the initial question: quiet home 2 bedroom in Pinewood Expecting the following apartment to be in the first position as it meets all the criteria. Retired apartment in a serene neighborhood, perfect for those seeking a retreat. This well-maintained residence offers two bedrooms with abundant natural light and silence.\" We can configure as many queries as we need, and put on ratings the id of the documents we expect to be in the first positions: As you can see on the image, the query got a score of 1 for hybrid search (1st position), and 0.5 on the other ones, meaning the expected result was returned on the second position. Conclusion Full-text search techniques–which find terms and sort the results by term frequency–and semantic search–which will search by semantic proximity–are powerful in different scenarios. On the one hand, text search shines when users are specific with what they want to search, for example providing the exact SKU for an article or words present on a technical manual. On the other hand, semantic search is useful when users are looking for concepts or ideas not explicitly defined in the documents. Combining both approaches with hybrid search, gives you both full-text search capabilities as well as adding semantically related documents, which can be useful in specific scenarios that require keyword matching and contextual understanding. This dual approach enhances search accuracy and relevance, making it ideal for complex queries and diverse content types. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. 
JR By: Jeffrey Rengifo Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Jump to What is hybrid search? Configuring ELSER Configuring index Indexing data Querying data Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"When hybrid search truly shines - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-hybrid-search","meta_description":"Exploring hybrid search and demonstrating when hybrid search is better than lexical or semantic search on their own."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Perform text queries with the Elasticsearch Go client Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example. Go How To CR LS By: Carly Richmond and Laurent Saint-Félix On October 31, 2023 Part of Series Using the Elasticsearch Go client for keyword search, vector search & hybrid search Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Building software in any programming language, including Go, is committing to a lifetime of learning. Throughout her university and working career, Carly has needed to adapt to being a polyglot and dabble in many programming languages, including Python, C, JavaScript, TypeScript, and Java. But that wasn't enough! So recently she started playing with Go too! Just like animals, programming languages, and one of your friendly authors, search has undergone an evolution of different practices that can be difficult to decide between for your own search use case. In this blog, we'll share an overview of traditional keyword search along with an example using Elasticsearch and the Elasticsearch Go client . Prerequisites To follow with this example, ensure the following prerequisites are met: Installation of Go version 1.21 or later Create your own Go repo using the recommended structure and package management covered in the Go documentation Creation of your own Elasticsearch cluster, populated with a set of rodent-based pages including for our friendly Gopher , from Wikipedia: Connecting to Elasticsearch In our examples, we will make use of the Typed API offered by the Go client. 
Establish a secure connection for any query requires configuring the client using either: Cloud ID and API key if making use of Elastic Cloud. Cluster URL, username, password and the certificate. Connecting to our cluster located on Elastic Cloud would look like this: The client connection can then be used for searching, as shown later. Keyword search Keyword search is the foundational search type that we have been familiar with since the inception of Archie , the first documented internet search engine written in 1990. A central component of keyword search is the translation of documents into an inverted index. Exactly like the index found at the back of a textbook, an inverted index contains a mapping between a list of tokens and their location in each document. The below diagram shows the key stages of generating the index: As shown above, the generation of tokens in Elasticsearch comprises three key stages: Stripping of unnecessary characters via zero or more char_filters . In our example we are stripping out HTML elements within the body_content field via the html_strip filter. Splitting the tokens from the content with the standard tokenizer , which will split by spacing and key punctuation. Removing unwanted tokens or transforming tokens from the output stream of the tokenizer using zero or more filter options, such as the lowercase token filter or stemmers such as the snowball stemmer to transform tokens back to their language root. Searching in Elasticsearch with Go When querying with the Go client, we specify the index we want to search and pass in the query and other options, just like in the below example: In the above example, we perform a standard match query to find any document in our index that contains the specified string passed into our function. Note we pass a new empty context to the search execution via Do(context.Background()) . Furthermore, any errors returned by Elasticsearch are output to the err attribute for logging and error handling. Results are returned in res.Hits.Hits with the _Source attribute containing the document itself in a JSON format. To convert this source to a Go-friendly struct, we need to unmarshal the JSON response using the Go encoding/json package , as shown in the below example: Searching and unmarshalling the query gopher will return the Wikipedia page for Gopher as expected: However, if we ask What do Gophers eat? we don't quite get the results we want: A simple keyword search allows results returned to your Go application in a performant way that works in a way we are familiar with from the applications we use. It also works great for exact term matches that are relevant for scenarios such as looking for a particular company or term. However, as we see above, it struggles to identify context and semantics due to the vocabulary mismatch problem. Furthermore, support for non-text file formats such as images and audio is challenging. Conclusion Here we've discussed how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client . Given Go is widely used for infrastructure scripting and building web servers, it's useful to know how to search in Go. Check out the GitHub repo for all the code in this series. Follow on to part 2 to gain an overview of vector search and how to perform vector search in Go. Until then, happy gopher hunting! 
Resources Elasticsearch Guide Elasticsearch Go client Understanding Analysis in Elasticsearch (Analyzers) by Bo Andersen | #CodingExplained Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Prerequisites Connecting to Elasticsearch Keyword search Searching in Elasticsearch with Go Conclusion Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Perform text queries with the Elasticsearch Go client - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/perform-text-queries-with-the-elasticsearch-go-client","meta_description":"Learn how to perform traditional text queries in Elasticsearch using the Elasticsearch Go client through a practical example."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ChatGPT and Elasticsearch revisited: Building a chatbot using RAG Learn how to create a chatbot using ChatGPT and Elasticsearch, utilizing all of the newest RAG features. Generative AI Python JV By: Jeff Vestal On August 19, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Follow up to the blog ChatGPT and Elasticsearch: OpenAI meets private data . 
In this blog, you will learn how to: Create an Elasticsearch Serverless project Create an Inference Endpoint to generate embeddings with ELSER Use a Semantic Text field for auto-chunking and calling the Inference Endpoint Use the Open Crawler to crawl blogs Connect to an LLM using Elastic’s Playground to test prompts and context settings for a RAG chat application. If you want to jump right into the code, you can view the accompanying Jupyter Notebook here . ChatGPT and Elasticsearch (April 2023) A lot has changed since I wrote the initial ChatGPT and Elasticsearch: OpenAI meets private data . Most people were just playing around with ChatGPT, if they had tried it at all. And every booth at every tech conference didn’t feature the letters “AI” (whether it is a useful fit or not). Updates in Elasticsearch (August 2024) Since then, Elastic has embraced being a full featured vector database and is putting a lot of engineering effort into making it the best vector database option for anyone building a search application. So as not to spend several pages talking about all the enhancements to Elasticsearch, here is a non-exhaustive list in no particular order: ELSER - The Elastic Learned Sparse Encoder Elastic Serverless Service was built and is in public beta Elasticsearch open Inference API Embeddings Chat completion Semantic rerankers Semantic_text type - Simplify semantic search Automatic chunking Playground - Visually experiment with RAG application building in Elasticsearch Retrievers Open web crawler With all that change and more, the original blog needs a rewrite. So let’s get started. Updated flow: ChatGPT, Elasticsearch & RAG The plan for this updated flow will be: Setup Create a new Elasticsearch serverless search project Create an embedding inference API using ELSER Configure an index template with a semantic_text field Create a new LLM connector Configure a chat completion inference service using our LLM connector Ingest and Test Crawl the Elastic Labs sites (Search, Observability, Security) with the Elastic Open Web Crawler. Use Playground to test prompts using our indexed Labs content Configure and deploy our App Export the generated code from Playground to an application using FastAPI as the backend and React as the front end. Run it locally Optionally deploy our chatbot to Google Cloud Run Setup Elasticsearch Serverless Project We will be using an Elastic serverless project for our chatbot. Serverless removes much of the complexity of running an Elasticsearch cluster and lets you focus on actually using and gaining value from your data. Read more about the architecture of Serverless here . If you don’t have an Elastic Cloud account, you can create a free two-week trial at elastic.co (Serverless pricing available here ). If you already have one, you can simply log in. Once logged in, you will need to create a cloud API key . NOTE: In the steps below, I will show the relevant parts of Python code. For the sake of brevity, I’m not going to show complete code that will import required libraries, wait for steps to complete, catch errors, etc. For more robust code you can run, please see the accompanying Jypyter notebook ! Create Serverless Project We will use our newly created API key to perform the next setup steps. First off, create a new Elasticsearch project. 
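A hedged sketch of what that project-creation call might look like, reconstructed from the parameter descriptions that follow. The endpoint path, the region value, and the exact payload shape are assumptions; check the current Elastic Cloud Serverless API documentation before relying on them.

```python
import requests

API_KEY = "YOUR_CLOUD_API_KEY"  # the cloud API key created earlier

# Assumed Serverless projects endpoint -- verify the current path in the Elastic Cloud API docs.
url = "https://api.elastic-cloud.com/api/v1/serverless/projects/elasticsearch"

project_data = {
    "name": "rag-chatbot",         # name we want for the project
    "region_id": "aws-us-east-1",  # region to deploy to (illustrative value)
    "optimized_for": "vector",     # configuration type, as described below
}

resp = requests.post(
    url,
    headers={"Authorization": f"ApiKey {API_KEY}", "Content-Type": "application/json"},
    json=project_data,
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # the response includes the new project's endpoint and credentials
```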
url - This is the standard Serverless endpoint for Elastic Cloud project_data - Your Elasticsearch Serverless project settings name - Name we want for the project region_id - Region to deploy optimized_for - Configuration type - We are using vector which isn’t strictly required for the ELSER model but can be suitable if you select a dense vector model such as e5. Create Elasticsearch Python client One nice thing about creating a programmatic project is that you will get back the connection information and credentials you need to interact with it! ELSER Embedding API Once the project is created, which usually takes less than a few minutes, we can prepare it to handle our labs’ data. The first step is to configure the inference API for embedding . We will be using the Elastic Learned Sparse Encoder (ELSER). Command to create the inference endpoint Specify this endpoint will be for generating sparse embeddings model_config - Settings we want to use for deploying our semantic reranking model service - Use the pre-defined elser inference service service_settings.num_allocations - Deploy the model with 8 allocations service_settings.num_threads - Deploy with one thread per allocation inference_id - The name you want to give to you inference endpoint task_type - Specifies this endpoint will be for generating sparse embeddings This single command will trigger Elasticsearch to perform a couple of tasks: It will download the ELSER model. It will deploy (start) the ELSER model with eight allocations and one thread per allocation. It will create an inference API we use in our field mapping in the next step. Index Mapping With our ELSER API created, we will create our index template. index_patterns - The pattern of indices we want this template to apply to. body - The main content of a web page the crawler collects will be written to type - It is a text field copy_to - We need to copy that text to our semantic text field for semantic processing semantic_body is our semantic text field This field will automatically handle chunking of long text and generating embeddings which we will later use for semantic search inference_id specifies the name of the inference endpoint we created above, allowing us to generate embeddings from our ELSER model headings - Heading tags from the html id - crawl id for this document meta_description - value of the description meta tag from the html title is the title of the web page the content is from Other fields will be indexed but auto-mapped. The ones we are focused on pre-defining in the template will not need to be both keyword and text type, which is defined automatically otherwise. Most importantly, for this guide, we must define our semantic_text field and set a source field to copy from with copy_to . In this case, we are interested in performing semantic search on the body of the text, which the crawler indexes into the body . Crawl All the Labs! We can now install and configure the crawler to crawl the Elastic * Labs. We will loosely follow the excellent guide from the Open Crawler released for tech-preview Search Labs blog. The steps below will use docker and run on a MacBook Pro. To run this with a different setup, consult the Open Crawler Github readme . Clone the repo Open the command line tool of your choice. I’ll be using Iterm2. Clone the crawler repo to your machine. Build the crawler container Run the following command to build and run the crawler. 
Configure the crawler Create a new YAML in your favorite editor (vim): We want to crawl all the documents on the three labs’ sites, but since blogs and tutorials on those sites tend to link out to other parts of elastic.co, we need to set a couple of runs to restrict the scope. We will allow crawling the three paths for our site and then deny anything else. Paste the following in the file and save Copy the configuration into the Docker container: Validate the domain Ensure the config file has no issues by running: Start the crawler When you first run the crawler, processing all the articles on the three lab sites may take several minutes. Confirm articles have been indexed We will confirm two ways. First, we will look at a sample document to ensure that ELSER embeddings have been generated. We just want to look at any doc so we can search without any arguments: Ensure you get results and then check that the field body contains text and semantic_body.inference.chunks.0.embeddings contains tokens. We can check we are gathering data from each of the three sites with a terms aggregation: You should see results that start with one of our three site paths. To the Playground! With our data ingested, chunked, and inference, we can start working on the backend application code that will interact with the LLM for our RAG app. LLM Connection We need to configure a connection for Playground to make API calls to an LLM. As of this writing, Playground supports chat completion connections to OpenAI, AWS Bedrock, and Google Gemini. More connections are planned, so check the docs for the latest list. When you first enter the Playground UI, click on “Connect to an LLM” Since I used OpenAI for the original blog, we’ll stick with that. The great thing about the Playground is that you can switch connections to a different service, and the Playground code will generate code specifically to that service’s API specification. You only need to select which one you want to use today. In this step, you must fill out the fields depending on which LLM you wish to use. As mentioned above, since Playground will abstract away the API differences, you can use whichever supported LLM service works for you, and the rest of the steps in this guide will work the same. If you don’t have an Azure OpenAI account or OpenAI API account, you can get one here (OpenAI now requires a $5 minimum to fund the API account). Once you have completed that, hit “Save,” and you will get confirmation that the connector has been added. After that, you just need to select the indices we will use in our app. You can select multiple, but since all our crawler data is going into elastic-labs, you can choose that one. Click “Add data sources” and you can start using Playground! Select the “restaurant_reviews” index created earlier. Playing in the Playground After adding your data source you will be in the Playground UI. To keep getting started as simple as possible, we will stick with all the default settings other than the prompt. However, for more details on Playground components and how to use them, check out the Playground: Experiment with RAG applications with Elasticsearch in minutes blog and the Playground documentation . Experimenting with different settings to fit your particular data and application needs is an important part of setting up a RAG-backed application. 
The defaults we will be using are: Querying the semantic_body chunks Using the three nearest semantic chunks as context to pass to the LLM Creating a more detailed prompt The default prompt in Playground is simply a placeholder. Prompt engineering continues to develop as LLMs become more capable. Exploring the ever-changing world of prompt engineering is a blog, but there are a few basic concepts to remember when creating a system prompt: Be detailed when describing the app or service the LLM response is part of. This includes what data will be provided and who will consume the responses. Provide example questions and responses. This technique, called few-shot-prompting , helps the LLM structure its responses. Clearly state how the LLM should behave. Specify the Desired Output Format. Test and Iterate on Prompts. With this in mind, we can create a more detailed system prompt: Feel free to to test out different prompts and context settings to see what results you feel are best for your particular data. For more examples on advanced techiques, check out the Prompt section on the two part blog Advanced RAG Techniques . Again, see the Playground blog post for more details on the various settings you can tweak. Export the Code Behind the scenes, Playground generates all the backend chat code we need to perform semantic search, parse the relevant contextual fields, and make a chat completion call to the LLM. No coding work from us required! In the upper right corner click on the “View Code” button to expand the code flyout You will see the generated python code with all the settings your configured as well as the the functions to make a semantic call to Elasticsearch, parse the results, built the complete prompt, make the call to the LLM, and parse those results. Click the copy icon to copy the code. You can now incorporate the code into your own chat application! Wrapup A lot has changed since the first iteration of this blog over a year ago, and we covered a lot in this blog. You started from a cloud API key, created an Elasticsearch Serverless project, generated a cloud API key, configured the Open Web Crawler, crawled three Elastic Lab sites, chunked the long text, generated embeddings, tested out the optimal chat settings for a RAG application, and exported the code! Where’s the UI, Vestal? Be on the lookout for part two where we will integrate the playground code into a python backend with a React frontend. We will also look at deploying the full chat application. For a complete set of code for everything above, see the accompanying Jypyter notebook Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. 
JR By: Jeffrey Rengifo Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Jump to ChatGPT and Elasticsearch (April 2023) Updates in Elasticsearch (August 2024) Updated flow: ChatGPT, Elasticsearch & RAG Setup Elasticsearch Serverless Project Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"ChatGPT and Elasticsearch revisited: Building a chatbot using RAG - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/chatgpt-elasticsearch-rag-enhancements","meta_description":"Learn how to build a chatbot using ChatGPT, Elasticsearch, and the newest RAG features."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Anomaly detection population jobs: Spotify Wrapped, part 3 Anomaly detection can be a daunting task at first, but in this blog, we'll dive into it and figure out how the different jobs can help us find unusual patterns in our Spotify Wrapped data. How To PK By: Philipp Kahr On March 24, 2025 Part of Series The Spotify Wrapped series Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the first part , we talked about how to get your Spotify Wrapped data and how to visualize it. In the second part , we talked about how to process the data and how to visualize it. In this third part, we will talk about how to detect anomalies in your Spotify Wrapped data. What is an anomaly detection job? An anomaly detection job in Elastic tries to find unusual patterns in your data. It uses machine learning to learn the normal behavior of your data and then tries to find data points that do not fit this normal behavior. This can be useful for finding unusual patterns in your data, like a sudden increase in the number of songs you listened to or a sudden change in the average duration of the songs you listened to. There are many different types of anomaly detection jobs: Single metric Multi-metric Population Categorization Rare Geo Advanced Now, we will take a look at a few of those. Population job A population job is a type of anomaly detection job that tries to find unusual patterns in the distribution of your data. The distribution is defined by the population. Let's create that job together! 1. Go to the Machine Learning app in Kibana. 2. Click on \"Create job.\" 3. 
Select the Spotify-History data view (if you do not have that one yet, don't forget to import the dashboard and data view from here ). 4. Select Population. 5. Select a proper timeframe, for me that is 1st January 2017 and 31st December 2024. 6. Select the field you want to define your population on. Let's use artist . 7. For the Add Metric , we want to use sum(listened_to_ms) . This will sum up the total time you listened to a specific artist. 8. In the right bottom for influencers, add the title . This will give us more information when an anomaly occurs. 9. Bucket Span so that is a big point to discuss. In my case, I do not think there is any sense in looking at something different than a daily pattern. You might even want to check out weekly patterns. It really depends on a lot of factors. Going lower than 1 day could be a bit too fine, and therefore, you'll get not optimal results. 10. Click on next, give it a valid name and click next until it says create job . Ensure that the toggle for Start immediately is turned on. Click create job . It will start immediately and probably look like this: Perfect, let's dive into the details. Click View Results —this page will be very interesting for our analysis Interpreting that is simple. Everything red is a higher scoring anomaly and everything blue or lighter shaded is less high. We immediately spot that on the right-hand side, the band Kraftklub has some bars in red, orange, red, and shading out to blue. When focusing on Kraftklub by doing artist: Kraftklub in the search bar on top, it immediately tells me: September 30th, 2022 Actual 3.15 hours instead of 3.75 minutes . For me, that means that I regularly listen to roughly one song of Kraftklub per day, on this day I listened to over 3 hours of Kraftklub. That is clearly an anomaly. What could have triggered such a listening behavior? The concert was a bit far off, it was on the 19th of November, 2022. Maybe a new album that came out? We can actually spot that by clicking on the anomaly and selecting the View in Discover from the dropdown. Once we are in Discover, we can do additional visualizations and breakdowns to investigate this further. We will pull up the title , album and spotify_metadata.album.release_date to see if a new album came out on that day. We can immediately see that on 22nd September 2022, the album KARGO was released. 8 days later, it appears that I took an interest and started listening to it. What else can we find? Maybe something seasonal? Let's zoom into Fred Again.. which I listen to a lot (as you can tell from blog number two) . There are roughly 10 days back to back as an anomaly. on average, I listened roughly an hour per day to Fred Again. I know that Fred Again.. probably didn't release an album during that time. ES|QL will help us in figuring out more details. When switching to ES|QL, the time picker value will be kept, but any filters in the search bar will be removed. The first thing we need to do is to add that back. The next thing I want to know is how many albums I listened to and whether any were released near those days. We perform a simple count to get the count of records. The values allow us to retrieve the value of the document and not perform any aggregation on it, and we split those up by the album name. I cannot spot any release date near the anomaly days. 
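(A sketch of the kind of ES|QL investigation described above, run through the Python client rather than Discover. The index name `spotify-history` is an assumption, the Discover time filter is omitted, and the field names follow the ones used in this post.)

```python
from elasticsearch import Elasticsearch

# Assumed placeholders: endpoint, API key and index name.
es = Elasticsearch("https://<your-endpoint>:443", api_key="<api-key>")

query = """
FROM spotify-history
| WHERE artist == "Fred Again.."
| STATS listens = COUNT(*),
        release_dates = VALUES(spotify_metadata.album.release_date)
  BY album
| SORT listens DESC
"""

resp = es.esql.query(query=query)
column_names = [col["name"] for col in resp["columns"]]
for row in resp["values"]:
    # One row per album: listen count plus the album release dates seen in the data.
    print(dict(zip(column_names, row)))
```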
My \"head date math\" is not always on point, so let's add a difference in days from the release date to the first listen date (during this anomaly) as it is quite clear that an album release did not trigger this anomaly. Single metric A single metric job is a type of anomaly detection job that tries to find unusual patterns in a single metric. The metric is defined by the field you select. So, what could be an interesting single metric? Let's use the listened_to_pct . This tells us how much of a song I complete before I skip to the next one. This is quite intriguing—let’s see if there are certain days when I skip more than others. 1. Go to the Machine Learning app in Kibana. 2. Click on \"Create job.\" 3. Select the Spotify-History data view. 4. Select Single Metric. 5. Select a proper timeframe, for me that is from 1st January 2017 to 31st December 2024. Now it gets tricky, do we use mean(), high_mean(), low_mean() ? Well, it depends on what we want to anomaly on. Mean will give you anomalies for low values such as 0, as well for high. High mean on the other hand is more on the high side, meaning that if the listening completion drops to 0 for a couple of days, it won't trigger an anomaly. Mean and low mean would. High mean is often useful when you want to detect spikes in your data. You don't want an anomaly if your service that is processing data is fast, you don't care if it finishes in 1 ms. But If it takes 10 ms, you want an anomaly. In this case, I guess we should try mean() and see where it takes us. 6. Don't forget to set the bucket to 1 day. 7. Click on next , give it a valid name and click next until it says create job . Ensure that the toggle for Start immediately is turned on. Click create job . It will start immedaitely. Here are the results: It's fascinating to see that on 18th August 2024, I only listened to songs for ~3% of their total duration on average. Usually, I listen to nearly 70% of the song before pressing the next button. All in all, I would say that mean() is a good choice for this metric. Multi metric Now, I want to figure out if I have a single song that spikes within an artist. To do that, we can leverage a multi metric job. 1. Go to the Machine Learning app in Kibana. 2. Click on \"Create job.\" 3. Select the Spotify-History data view. 4. Select Multi Metric. 5. Select a proper timeframe, for me that is from 1st January 2017 to 31st December 2024. 6. Select the distinct count(title) and split by artist , add artist, title, album to the influencers. 7. Don't forget the bucket span to 1d. It might give you a warning about high memory usage because instead of modeling everything in one large model, it now creates a single model for each artist. This can be quite memory intensive, so be careful. Go into the same anomaly detection table view and pick any album at random. I chose Live in Paris from Meute . At first glance, that's super interesting and it shows how accurate anomaly detection can be. I have the song You & Me from the album Live in Paris in my liked songs as well as roughly 10 other songs from different albums. I actively listened to the Live in Paris album on the 27th, 28th, 29th and 30th of December 2024. Conclusion In this blog, we dived into the rabbit hole of anomaly detection and what that can entail. It might be daunting at first, but it's not that complicated once you get a hang of it and it can also provide really good and quick insights. 
The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. In this blog post, we may have used or referred to third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. Elastic, Elasticsearch, ESRE, Elasticsearch Relevance Engine and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is an anomaly detection job? Population job Single metric Multi metric Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Anomaly detection population jobs: Spotify Wrapped, part 3 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-anomaly-detection-jobs","meta_description":"Learn about anomaly detection jobs Elasticsearch, like population, single metric & multi metric jobs, and how to use them to uncover unusual patterns in your data."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) Microsoft and Elastic partner to add Elasticsearch (preview) as an officially supported vector database and retrieval augmentation technology for Azure OpenAI On Your Data, enabling users to build chat experiences with advanced AI models grounded by enterprise data. Generative AI AT By: Aditya Tripathi On March 26, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Microsoft and Elastic are thrilled to announce that Elasticsearch, the world's most downloaded vector database is an officially supported vector store and retrieval augmented search technology for Azure OpenAI Service On Your Data in public preview. The groundbreaking feature empowers you to leverage the power of OpenAI models, such as GPT-4, and incorporates the advanced capabilities of RAG (Retrieval Augmented Generation) model, directly on your data with enterprise-grade security on Azure. Read the announcement from Microsoft here . Azure OpenAI Service On Your Data makes conversational experiences come alive for your employees, customers and users. With the addition of Elasticsearch vector database and vector search technology, LLMs are enriched by your business data, and conversations deliver superior quality responses out-of-the-box. All of this adds up to helping you better understand your data, and make more informed decisions. Build powerful conversational chat experiences, fast Business users, such as users on e-commerce teams, product managers, and others can add documents from an Elasticsearch index to build a conversational chat experience very quickly. All it takes is a few simple steps to configure the chat experience with parameters such as message history, and you're good to go! Customers can realize benefits pretty much right away.. Quickly roll out conversational experiences to your users, customers, or employees--backed by context from your business data Common use cases include offering internal knowledge search, users self-service, or chatbots that help process common business workflows How Elasticsearch vector database works with On Your Data The new native experience within Azure OpenAI Studio makes adding an Elastic index a simple matter. Developers can pick Elasticsearch as their chosen vector database option from the drop-down menu.. You can bring your existing Elasticsearch indexes to On Your Data—whether those indexes live on Azure or on-prem. Just select Elasticsearch as your data source, add your Elastic endpoint and API key, add an Elastic index, and you're all set! With the Elasticsearch vector database running in the background, users get all the Elastic advantages you'd expect. 
Precision of BM25 (text) search, the semantic understanding of vector search, and the best of both worlds with hybrid search Document and field level security, so users can only access information they're entitled to based on their permissions Filters, facets, and aggregations that add a real boost to how quickly relevant context is pulled from your organisation's data, and sent to an LLM Choice of leveraging a range of large language model providers, including Azure OpenAI, Hugging Face, or other 3rd party models Elastic on Microsoft Azure: a proven combination Elastic is a proud winner of the Worldwide Microsoft Partner of the Year award for Commercial Marketplace. Elastic and Microsoft customers have been using Elasticsearch and Azure OpenAI to build futuristic search experiences, that leverage the best of AI and machine learning, today . Ali Dalloul, VP, Azure AI Customer eXperience Engineering had this to say about the collaboration, \"By harnessing the power of Azure Cloud and OpenAI, Elastic is driving the development of AI-driven solutions that redefine customer experiences. This partnership is more than just a collaboration; it's a feedback loop of innovation, benefiting customers, Elastic, and Microsoft, while empowering the broader partner ecosystem. We're delighted to offer customers Elasticsearch's strong vector database and retrieval augmentation capabilities to store and search vector embeddings for On Your Data.\" \"This really helps customers connect data wherever it lives. We are happy to open the spectrum of building conversational AI solutions, agnostic to location, including Elasticsearch. We are excited to see how developers build upon this integration.\" Adds Pavan Li, Principal Product Manager of Azure OpenAI Service On Your Data. Elastic's clear strengths in hybrid search--combining BM25/text search with vector search for semantic relevance, was an important differentiator. With the backing of the open source Apache Lucene community, Elastic's vector database has already been widely adopted by large companies for enterprise scale use cases. Try On Your Data with Elasticsearch vector database today Unlock the insights with conversational AI, using Elasticsearch and Azure OpenAI On Your Data today! Visit Azure OpenAI Studio to build your first conversational copilot Connect Elasticsearch with OpenAI models Read more on the Microsoft Tech Community blog Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. 
JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to Build powerful conversational chat experiences, fast How Elasticsearch vector database works with On Your Data Elastic on Microsoft Azure: a proven combination Try On Your Data with Elasticsearch vector database today Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Introducing Elasticsearch vector database to Azure OpenAI Service On Your Data (preview) - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/azure-openai-on-your-data-elasticsearch-vector-database","meta_description":"Microsoft and Elastic partner to add Elasticsearch (preview) as an officially supported vector database and retrieval augmentation technology for Azure OpenAI On Your Data, enabling users to build chat experiences with advanced AI models grounded by enterprise data."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Advanced integration tests with real Elasticsearch Mastering advanced Elasticsearch integration testing: Faster, smarter, and optimized. How To PP By: Piotr Przybyl On January 31, 2025 Part of Series Integration tests using Elasticsearch Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In the previous post on integration testing , we covered shortening the execution time of integration tests relying on real Elasticsearch by changing the approach to data initialization strategies. In this installment, we're about to shorten the test suite duration even further, this time by applying advanced techniques to the Docker container running Elasticsearch and Elasticsearch itself. Please note that the techniques described below can often be cherry-picked: you can choose what makes the most sense in your specific case. Here be dragons: The trade-offs Before we delve into the ins and outs of various approaches in pursuit of performance, it's important to understand that not every optimization should always be applied. While they tend to improve things, they can also make the setup more obscure, especially to an untrained eye. 
In other words, in the following sections, we're not going to change anything within the tests; only the \"infrastructure code around\" is going to be redesigned. These changes can make the code more difficult to understand for less-experienced team members. Using the techniques described below is not rocket science, but some caution is advised, and experience is recommended. Snapshots When we left off our demo code, we were still initializing Elasticsearch with data for every test . This approach has some advantages, especially if our dataset differs between test cases, e.g., we index somewhat different documents sometimes. However, if all our test cases can rely on the same dataset, we can use the snapshot-and-restore approach. It's helpful to understand how snapshot and restore work in Elasticsearch, which is explained in the official documentation . In our approach, instead of handling this via the CLI or the DevOps method, we will integrate it into the setup code around our tests. This ensures smooth test execution on developer machines as well as in CI/CD. The idea is quite simple: instead of deleting indices and recreating them from scratch before each test, we: Create a snapshot in the container's local file system (if it doesn't already exist, as this will become necessary later). Restore the snapshot before each test. Prepare snapshot location One important thing to note – which makes Elasticsearch different from many relational databases – is that before we send a request to create a snapshot, we first need to register a location where the snapshots can be stored, the so-called repository. There are many storage options available (which is very handy for cloud deployments); in our case, it's enough to keep them in a local directory inside the container. Note: The /tmp/... location used here is suitable only for volatile integration tests and should never be used in a production environment. In production, always store snapshots in a location that is safe and reliable for backups. To avoid the temptation of storing backups in an unsafe location, we first add this to our test: Next, we configure the ElasticsearchContainer to ensure it can use this location as a backup location: Change the setup Now we're ready to append the following logic to our @BeforeAll method: And our @BeforeEach method should start with: Checking if the snapshot exists can be done by verifying that the REPO_LOCATION directory exists and contains some files: The setupDataInContainer() method has minor changes: it's no longer called in @BeforeEach (we execute it on demand when needed), and the DELETE books request can be removed (as it is no longer necessary). To create a snapshot, we first need to register a snapshot location and then store any number of snapshots there (although we'll keep only one, as the tests don't require more): Once the snapshot is created, we can restore it before each test as follows: Please note the following: Before restoring an index, it cannot exist, so we must delete it first. If you need to delete multiple indices, you can do so in a single curl call, e.g., \"https://localhost:9200/indexA,indexB\" . To chain several commands in a container, you don't need to wrap them in separate execInContainer calls; running a simple script can improve readability (and reduce some network round-trips). In the example project, this technique shortened my build time to 26 seconds. 
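(For readers following along outside of Java: the snapshot dance described above boils down to four REST calls, sketched here in Python. The host, credentials, repository and snapshot names are assumptions; in the actual test suite the same calls are issued with curl via execInContainer against the Testcontainers-managed Elasticsearch.)

```python
import requests

# Assumed placeholders: host, password, repository and snapshot names.
ES = "https://localhost:9200"
AUTH = ("elastic", "changeme")          # whatever password your test container uses
REPO, SNAPSHOT = "test-repo", "books-snapshot"

s = requests.Session()
s.auth = AUTH
s.verify = False                        # self-signed certificate inside the test container

# 1. Register a file-system repository at the configured path.repo location (once per container).
s.put(f"{ES}/_snapshot/{REPO}", json={"type": "fs", "settings": {"location": "/tmp/bkp"}})

# 2. Take a snapshot of the prepared index (once, e.g. in @BeforeAll).
s.put(f"{ES}/_snapshot/{REPO}/{SNAPSHOT}?wait_for_completion=true", json={"indices": "books"})

# 3. Restore before each test: the index must not exist, so delete it first (e.g. in @BeforeEach).
s.delete(f"{ES}/books")
s.post(f"{ES}/_snapshot/{REPO}/{SNAPSHOT}/_restore?wait_for_completion=true",
       json={"indices": "books"})
```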
While this might not seem like a significant gain at first glance, the approach is a universal technique that can be applied before, or even instead of, switching to _bulk ingestion (discussed in the previous post). In other words, you can prepare data for your tests in @BeforeAll in any way and then make a snapshot of it to use in @BeforeEach . If you want to maximize efficiency, you can even copy the snapshot back to the testing machine using elasticsearch.copyFileFromContainer(...) , allowing it to serve as a form of cache that is only purged when you need to update the dataset (e.g., for new features to test). For a complete example, check out the tag snapshots . RAM the data Sometimes, our test cases are noticeably data-heavy, which can negatively impact performance, especially if the underlying storage is slow. If your tests need to read and write large amounts of data, and the SSD or even hard drive is painfully slow, you can instruct the container to keep the data in RAM – provided you have enough memory available. This is essentially a one-liner, requiring the addition of .withTmpFs(Map.of(\"/usr/share/elasticsearch/data\", \"rw\")) to your container definition. The container setup will look like this: The slower your storage is, the more significant the performance improvement will be, as Elasticsearch will now write to and read from a temporary file system in RAM. Note: As the name implies, this is a temporary file system, meaning it is not persistent. Therefore, this solution is suitable only for tests. Do not use this in production, as it could lead to data loss. To assess how much this solution can improve performance on your hardware, you can try the tag tmpfs . More work, same time The size of a product's codebase grows the most during the active development phase. Then, when it moves into a maintenance phase (if applicable), it usually involves just bug fixes. However, the size of the test base grows continuously, as both features and bugs need to be covered by tests to prevent regressions. Ideally, a bug fix should always be accompanied by a test to prevent the bug from reappearing. This means that even when development is not particularly active, the number of tests will keep growing. The approach described in this section provides hints on how to manage a growing test base without significantly increasing test suite duration, provided sufficient resources are available to enable parallelization. Let's assume, for simplicity, that the number of test cases in our example has doubled (rather than writing additional tests, we will copy the existing ones for this demo). In the simplest approach, we could add three more @Test methods to the BookSearcherIntTest class. We can then observe CPU and memory consumption by using, in a somewhat unorthodox way, one of Java's profilers: Java Flight Recorder. Since we added it to our POM , after running the tests, we can open recording-1.jfr in the main directory. The results may look like this in Environment -> Processes : As you can see, running six tests in a single class doubled the time required. Additionally, the predominant color in the CPU usage chart above is... no color at all, as CPU utilization barely reaches 20% during peak moments. Underutilizing your CPU is wasteful when you’re paying for usage time (whether to cloud providers or in terms of your own wall clock time to get meaningful feedback). Chances are, the CPU you’re using has more than one core. 
The optimization here is to split the workload into two parts, which should roughly halve the duration. To achieve this, we move the newly added tests into another class called BookSearcherAnotherIntTest and instruct Maven to run two forks for testing using -DforkCount=2 . The full command becomes: With this change, and using JFR and Java Mission Control, we observe the following: Here, the CPU is utilized much more effectively. This example should not be interpreted with a focus on exact numbers. Instead, what matters is the general trend, which applies not only to Java: Check whether your CPU is being properly utilized during tests. If not, try to parallelize your tests as much as possible (though other resources might sometimes limit you). Keep in mind that different environments may require different parallelization factors (e.g., -DforkCount=N in Maven). It’s better to avoid hardcoding these factors in the build script and instead tune them per project and environment: This can be skipped for developer machines if only a single test class is being run. A lower number might suffice for less powerful CI environments. A higher number might work well for more powerful CI setups. For Java, it’s important to avoid having one large class and instead divide tests into smaller classes as much as it makes sense. Different parallelization techniques and parameters apply to other technology stacks, but the overarching goal remains to fully utilize your hardware resources. To refine things further, avoid duplicating setup code across test classes. Keep the tests themselves separate from infrastructure/setup code. For instance, configuration elements like the image version declaration should be maintained in one place. In Testcontainers for Java, we can use (or slightly repurpose) inheritance to ensure that the class containing infrastructure code is loaded (and executed) before the tests. The structure would look like this: For a complete demo, refer again to the example project on GitHub. Reuse - Start once and once only The final technique described in this post is particularly useful for developer machines. It may not be suitable for traditional CIs (e.g., Jenkins hosted in-house) and is generally unnecessary for ephemeral CI environments (like cloud-based CIs, where build machines are single-use and decommissioned after each build). This technique relies on a preview feature of Testcontainers, known as reuse . Typically, containers are cleaned up automatically after the test suite finishes. This default behavior is highly convenient, especially in long-running CIs, as it ensures no leftover containers regardless of the test results. However, in certain scenarios, we can keep a container running between tests so that subsequent tests don’t waste time starting it again. This approach is especially beneficial for developers working on a feature or bug fix over an extended period (sometimes days), where the same test (class) is run repeatedly. How to enable reuse Enabling reuse is a two-step process: 1. Mark the container as reusable when declaring it: 2. Opt-in to enable the reuse feature in the environments where it makes sense (e.g., on your development machine). The simplest and most persistent way to do this on a developer workstation is by ensuring that the configuration file in your $HOME directory has the proper content. In ~/.testcontainers.properties , include the following line: That’s all! On first use, tests won’t be any faster because the container still needs to start. 
However, after the initial test: Running docker ps will show the container still running (this is now a feature, not a bug). Subsequent tests will be faster. Note: Once reuse is enabled, stopping the containers manually becomes your responsibility. Leveraging reuse with snapshots or init data The reuse feature works particularly well in combination with techniques like copying initialization data files to the container only once or using snapshots. With reuse enabled, there’s no need to recreate snapshots for subsequent tests, saving even more time. All the pieces of optimization start falling into place. Reuse forked containers While reuse works well in many scenarios, issues arise when combining reuse with multiple forks during the second run. This can result in errors or gibberish output related to containers or Elasticsearch being in an improper state. If you wish to use both improvements simultaneously (e.g., running many integration tests on a powerful workstation before submitting a PR), you’ll need to make an additional adjustment. The problem The issue may manifest itself in errors like the following: This happens due to how Testcontainers identifies containers for reuse. When both forks start and no Elasticsearch containers are running, each fork initializes its own container. Upon restarting, however, each fork looks for a reusable container and finds one. Because all containers look identical to Testcontainers, both forks may select the same container. This results in a race condition, where more than one fork tries to use the same Elasticsearch instance. For example, one fork may be reinstating a snapshot while the other is attempting to do the same, leading to errors like the one above. The solution To resolve this, we need to introduce differentiation between containers and ensure that forks select containers deterministically based on these differences. Step 1: Update pom.xml Modify the Surefire configuration in your pom.xml to include the following: This adds a unique identifier ( fork_${surefire.forkNumber} ) for each fork as an environment variable. Step 2: Modify container declaration Adjust the Elasticsearch container declaration in your code to include a label based on the fork identifier: The effect These changes ensure that each fork creates and uses its own container. The containers are slightly different due to the unique labels, allowing Testcontainers to assign them deterministically to specific forks. This approach eliminates the race condition, as no two forks will attempt to reuse the same container. Importantly, the functionality of Elasticsearch within the containers remains identical, and tests can be distributed between the forks dynamically without affecting the outcome. Was it really worth it? As warned at the beginning of this post, the improvements introduced here should be applied with caution, as they make the setup code of our tests less intuitive. What are the benefits? We started this post with three integration tests taking around 25 seconds on my machine. After applying all the improvements together and doubling the number of actual tests to six, the execution time on my laptop dropped to 8 seconds. Doubled the tests; shortened the build by two-thirds. It's up to you to decide if it makes sense for your case. ;-) It doesn't stop here This miniseries on testing with real Elasticsearch ends here. In part one we discussed when it makes sense to mock Elasticsearch index and when it's a better idea to go for integration tests. 
In part two , we have addressed the most common mistakes that make your integration tests slow. This part three goes the extra mile to make integration tests run even faster, in seconds instead of minutes. There are more ways to optimize your experience and reduce costs associated with integration tests of systems using Elasticsearch. Don’t hesitate to explore these possibilities and experiment with your tech stack. If your case involves any of the techniques mentioned above, or if you have any questions, feel free to reach out on our Discuss forums or community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Here be dragons: The trade-offs Snapshots Prepare snapshot location Change the setup RAM the data Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Advanced integration tests with real Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-improve-performance-integration-tests","meta_description":"Here's how to master advanced Elasticsearch integration testing: We'll explain how to make integration tests run faster, in seconds instead of minutes."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Ingest geospatial data into Elasticsearch with Kibana for use in ES|QL How to use Kibana and the csv ingest processor to ingest geospatial data into Elasticsearch for use with search in Elasticsearch Query Language (ES|QL). Elasticsearch has powerful geospatial search features, which are now coming to ES|QL for dramatically improved ease of use and OGC familiarity. But to use these features, we need Geospatial data. 
How To CT By: Craig Taverner On October 25, 2024 Part of Series Elasticsearch geospatial search Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. We recently published a blog describing how to use the new geospatial search features in ES|QL , Elasticsearch's new, powerful piped query language . To use these features, you need to have geospatial data in Elasticsearch. So in this blog, we'll show you how to ingest geospatial data, and how to use it in ES|QL queries. Importing geospatial data using Kibana The data we used for the examples in the previous blog were based on data we use internally for integration tests. For your convenience we've, included it here in the form of a few CSV files that can easily be imported using Kibana. The data is a mix of airports, cities, and city boundaries. You can download the data from: airports.csv This contains a merger of three datasets: Airports (names, locations and related data) from Natural Earth City locations from SimpleMaps Airport elevations from The global airport database airport_city_boundaries.csv This contains a merger of airport and city names from above with one new source: City boundaries from OpenStreetMap As you can guess, we spent some time combining these data sources into the two files above, with the goal of being able to test the geospatial features of ES|QL. This might not be quite the same as your specific data needs, but hopefully this gives you an idea of what is possible. In particular we want to demonstrate a few interesting things: Importing data with geospatial fields together with other indexable data Importing both geo_point and geo_shape data and using them together in queries Importing data into two indexes that can be joined using a spatial relationship Creating an ingest pipeline for facilitating future imports (beyond Kibana) Some examples of ingest processors, like csv , convert and split While we'll be discussing working with CSV data in this blog, it is important to understand there are several ways to add geo data using Kibana . Within the Map application you can upload delimited data like CSV, GeoJSON and ESRI ShapeFiles and you can also draw shapes directly in the Map. For this blog we'll focus on importing CSV files from the Kibana home page. Importing the airports The first file, airports.csv , has some interesting quirks we need to deal with. Firstly the columns have additional whitespace separating them, not typical of CSV files. Secondly, the type field is a multi-value field, which we need to split into separate fields. Finally, some fields are not strings, and need to be converted to the right type. All this can be done using Kibana's CSV import facility. Start at the Kibana home-page. There is a section called \"Get started by adding integrations\", which has a link called \"Upload a file\": Click on this link, and you will be taken to the \"Upload file\" page. Here you can drag and drop the airports.csv file, and Kibana will analyse the file and present you with a preview of the data. It should have automatically detected the delimiter as a comma, and the first row as the header row. 
However, it probably did not trim the extra whitespace between the columns, nor determined the types of the fields, assuming all fields are either text or keyword . We need to fix this. Click Override settings and check the checkbox for Should trim fields , and Apply to close the settings. Now we need to fix the types of the fields. This is available on the next page, so go ahead and click Import . First choose an index name, and then select Advanced to get to the field mappings and ingest processor page. Here we need to make changes to both the field mappings for the index, as well as the ingest pipeline for importing the data. Firstly, while Kibana likely auto-detected the scalerank field as long , it mistakenly perceived the location and city_location fields as keyword . Edit them to geo_point , ending up with mappings that look something like: You have some flexibility here, but note that what type you choose will affect how the field is indexed and what kind of queries are possible. For example, if you leave location as keyword you cannot perform any geospatial search queries on it. Similarly, if you leave elevation as text you cannot perform numerical range queries on it. Now it's time to fix the ingest pipeline. If Kibana auto-detected scalerank as long above, it will also have added a processor to convert the field to a long . We need to add a similar processor for the elevation field, this time converting it to double . Edit the pipeline to ensure you have this conversion in place. Before saving this, we want one more conversion, to split the type field into multiple fields. Add a split processor to the pipeline, with the following configuration: The final ingest pipeline should look like: Note that we did not add a convert processor for the location and city_location fields. This is because the geo_point type in the field mapping already understands the WKT format of the data in these fields. The geo_point type can understand a range of formats, including WKT, GeoJSON, and more . If we had, for example, two columns in the CSV file for latitude and longitude , we would have needed to add either a script or a set processor to combine these into a single geo_point field (eg. \"set\": {\"field\": \"location\", \"value\": \"{{lat}},{{lon}}\"} ). We are now ready to import the file. Click Import and the data will be imported into the index with the mappings and ingest pipeline we just defined. If there are any errors ingesting the data, Kibana will report them here, so you can either edit the source data or the ingest pipeline and try again. Notice that a new ingest-pipeline has been created. This can be viewed by going to the Stack Management section of Kibana, and selecting Ingest pipelines . Here you can see the pipeline we just created, and edit it if necessary. In fact the Ingest pipelines section can be used for creating and testing ingest pipelines, a very useful feature if you plan to do even more complex ingests. If you want to explore this data immediately skip down to the later sections, but if you want to import the city boundaries as well, continue reading. Importing the city boundaries The city boundaries file available at airport_city_boundaries.csv is a bit simpler to import than the previous example. It contains a city_boundary field that is a WKT representation of the city boundary as a POLYGON , and a city_location field that is a geo_point representation of the city location. 
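(A quick aside before walking through the city boundaries import: if you would rather manage the airports ingest pipeline described above through the API instead of the Kibana wizard, it corresponds roughly to the following. The pipeline name and the ":" separator are assumptions; use whatever delimiter your type column actually contains. Kibana otherwise generates this pipeline for you during the import.)

```python
from elasticsearch import Elasticsearch

# Assumed placeholders: endpoint, API key and pipeline name.
es = Elasticsearch("https://<your-endpoint>:443", api_key="<api-key>")

es.ingest.put_pipeline(
    id="airports-csv-pipeline",
    description="Convert and split fields from airports.csv",
    processors=[
        # scalerank was auto-detected as long by Kibana
        {"convert": {"field": "scalerank", "type": "long", "ignore_missing": True}},
        # elevation needs the same treatment, as a double
        {"convert": {"field": "elevation", "type": "double", "ignore_missing": True}},
        # type is multi-valued; split it on whatever delimiter the column uses
        {"split": {"field": "type", "separator": ":", "ignore_missing": True}},
    ],
)
```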
We can import this data in a similar way to the airports data, but with a few differences: We needed to select the override setting Has header row since that was not autodetected We did not need to trim fields, as the data was already clean of extra whitespace We did not need to edit the ingest pipeline because all types were either string or spatial types We did, however, have to edit the field mappings to set the city_boundary field to geo_shape and the city_location field to geo_point Our final field mappings looked like: As with the airports.csv import before, simply click Import to import the data into the index. The data will be imported with the mappings we edited and ingest pipeline that Kibana defined. Exploring geospatial data with dev-tools In Kibana it is usual to explore the indexed data with \"Discover\". However, if your intention is to write your own app using ES|QL queries, it might be more interesting to try access the raw Elasticsearch API. Kibana has a convenient console for experimenting with writing queries. This is called the Dev Tools console, and can be found in the Kibana side-bar. This console talks directly to the Elasticsearch cluster, and can be used to run queries, create indexes, and more. Try the following: This should provide the following results: distance abbrev name location country city elevation 273418.05776847183 HAM Hamburg POINT (10.005647830925 53.6320011640866) Germany Norderstedt 17.0 337534.653466062 TXL Berlin-Tegel Int'l POINT (13.2903090925074 52.5544287044101) Germany Hohen Neuendorf 38.0 483713.15032266214 OSL Oslo Gardermoen POINT (11.0991032762581 60.1935783171386) Norway Oslo 208.0 522538.03148094116 BMA Bromma POINT (17.9456175406145 59.3555902065112) Sweden Stockholm 15.0 522538.03148094116 ARN Arlanda POINT (17.9307299016916 59.6511203397372) Sweden Stockholm 38.0 624274.8274399083 DUS Düsseldorf Int'l POINT (6.76494446612174 51.2781820420774) Germany Düsseldorf 45.0 633388.6966435644 PRG Ruzyn POINT (14.2674849854076 50.1076511703671) Czechia Prague 381.0 635911.1873311149 AMS Schiphol POINT (4.76437693232812 52.3089323889822) Netherlands Hoofddorp -3.0 670864.137958866 FRA Frankfurt Int'l POINT (8.57182286907608 50.0506770895207) Germany Frankfurt 111.0 683239.2529970079 WAW Okecie Int'l POINT (20.9727263383587 52.171026749259) Poland Piaseczno 111.0 Visualizing geospatial data with Kibana Maps Kibana Maps is a powerful tool for visualizing geospatial data. It can be used to create maps with multiple layers, each layer representing a different dataset. The data can be filtered, aggregated, and styled in various ways. In this section, we will show you how to create a map in Kibana Maps using the data we imported in the previous section. In the Kibana menu, navigate to Analytics -> Maps to open a new map view. Click on Add Layer and select Documents , choosing the data view airports and then editing the layer style to color the markers using the elevation field, so we can easily see how high each airport is. Click 'Keep changes' to save the Map: Now add a second layer, this time selecting the airport_city_boundaries data view. This time, we will use the city_boundary field to style the layer, and set the fill color to a light blue. This will show the city boundaries on the map. Make sure to reorder the layers to ensure that the airport markers are on top. Spatial joins ES|QL does not support JOIN commands, but you can achieve a special case of a join using the ENRICH command . 
This command operates akin to a 'left join' in SQL, allowing you to enrich results from one index with data from another index based on a spatial relationship between the two datasets. For example, let's enrich the results from a table of airports with additional information about the city they serve by finding the city boundary that contains the airport location, and then perform some statistics on the results: If you run this query without first preparing the enrich index, you will get an error message like: This is because, as we mentioned before, ES|QL does not support true JOIN commands. One important reason for this is that Elasticsearch is a distributed system, and joins are expensive operations that can be difficult to scale. However, the ENRICH command can be quite efficient, because it makes use of specially prepared enrich indexes that are duplicated across the cluster, enabling local joins to be performed on each node. To better understand this, let's focus on the ENRICH command in the query above: This command instructs Elasticsearch to enrich the results retrieved from the airports index, and perform an intersects join between the city_location field of the original index, and the city_boundary field of the airport_city_boundaries index, which we used in a few examples earlier. But some of this information is not clearly visible in this query. What we do see is the name of an enrich policy city_boundaries , and the missing information is encapsulated within that policy definition. Here we can see that it will perform a geo_match query ( intersects is the default), the field to match against is city_boundary , and the enrich_fields are the fields we want to add to the original document. One of those fields, the region was actually used as the grouping key for the STATS command, something we could not have done without this 'left join' capability. For more information on enrich policies, see the enrich documentation . The enrich indexes and policies in Elasticsearch were originally designed for enriching data at index time, using data from another prepared enrich index. In ES|QL, however, the ENRICH command works at query time, and does not require the use of ingest pipelines. This effectively makes it quite similar to an SQL LEFT JOIN , except youn cannot join any two indexes, only a normal index on the left with a specially prepared enrich index on the right. In either case, whether for ingest pipelines or use in ES|QL, it is necessary to perform a few preparatory steps to set up the enrich index and policy. We already imported the airport_city_boundaries index above, but this is not directly usable as an enrich index in the ENRICH command. We first need to perform two steps: Create the enrich policy described above to define the source index, the field in the source index to match against, and the fields to return once matched. Execute this policy to create the enrich index. This will build a special internal index, by reading the original source index into a more efficient data structure which is copied across the cluster. The enrich policy can be created using the following command: And the policy can be executed using the following command: Note that if you ever change the contents of the airport_city_boundaries index, you will need to re-execute this policy to see the changes reflected in the enrich index. 
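(For reference, the two commands referred to above look roughly like this when issued from the Python client; the REST equivalents are a PUT to `/_enrich/policy/city_boundaries` followed by a POST to its `/_execute` endpoint. The endpoint and API key are placeholders, and the `enrich_fields` list here is trimmed to the field used in this post.)

```python
from elasticsearch import Elasticsearch

# Assumed placeholders: endpoint and API key.
es = Elasticsearch("https://<your-endpoint>:443", api_key="<api-key>")

# 1. Define the geo_match enrich policy over the city boundaries index.
es.enrich.put_policy(
    name="city_boundaries",
    geo_match={
        "indices": "airport_city_boundaries",
        "match_field": "city_boundary",
        "enrich_fields": ["region"],  # add any other fields you want copied onto the airports
    },
)

# 2. Build the internal enrich index (re-run this whenever the source index changes).
es.enrich.execute_policy(name="city_boundaries")
```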
Now let's run the original ES|QL query again: This returns the top 5 regions with the most airports, along with the centroid of all the airports that have matching regions, and the range in length of the WKT representation of the city boundaries within those regions: centroid count region POINT (-12.139086859300733 31.024386116624648) 126 null POINT (-83.10398317873478 42.300230911932886) 3 Detroit POINT (39.74537850357592 47.21613017376512) 3 городской округ Батайск POINT (-156.80986787192523 20.476673701778054) 3 Hawaii POINT (-73.94515332765877 40.70366442203522) 3 City of New York POINT (-83.10398317873478 42.300230911932886) 3 Detroit POINT (-76.66873019188643 24.306286952923983) 2 New Providence POINT (-3.0252167768776417 51.39245774131268) 2 Cardiff POINT (-115.40993484668434 32.73126147687435) 2 Municipio de Mexicali POINT (41.790108773857355 50.302146775648) 2 Центральный район POINT (-73.88902732171118 45.57078813901171) 2 Montréal You may also notice that the most commonly found region was null . What could this imply? Recall that I likened this command to a 'left join' in SQL, meaning if no matching city boundary is found for an airport, the airport is still returned but with null values for the fields from the airport_city_boundaries index. It turns out there were 125 airports that found no matching city_boundary , and one airport with a match where the region field was null . This lead to a count of 126 airports with no region in the results. If your use case requires that all airports can be matched to a city boundary, that would require sourcing additional data to fill in the gaps. It would be necessary to determine two things: which records in the airport_city_boundaries index do not have city_boundary fields which records in the airports index do not match using the ENRICH command (ie. do not intersect) Using ES|QL for geospatial data in Kibana Maps Kibana has added support for Spatial ES|QL in the Maps application. This means that you can now use ES|QL to search for geospatial data in Elasticsearch, and visualize the results on a map. There is a new layer option in the add layers menu, called \"ES|QL\". Like all of the geospatial features described so far, this is in \"technical preview\". Selecting this option allows you to add a layer to the map based on the results of an ES|QL query. For example, you could add a layer to the map that shows all the airports in the world. Or you could add a layer that shows the polygons from the airport_city_boundaries index, or even better, how about that complex ENRICH query above that generates statistics for how many airports are in each region? What's next The previous Geospatial search blog focused on the use of functions like ST_INTERSECTS to perform searching, available in Elasticsearch since 8.14. And this blog shows you how to import the data we used for those searches. However, Elasticsearch 8.15 came with a particularly interesting function: ST_DISTANCE which can be used to perform efficient spatial distance searches, and this will be the topic of the next blog! Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
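Returning to the two checks listed above (boundary documents with no city_boundary, and airports whose location intersects no boundary), here is one hedged way to approach them; index, policy and field names follow the examples in this article, and the second check cannot distinguish a missing boundary from a matched boundary whose region is null:

```python
from elasticsearch import Elasticsearch

# Hypothetical connection details -- substitute your own endpoint and API key.
client = Elasticsearch("https://my-deployment.es.example.com:443", api_key="<api-key>")

# Check 1: boundary documents that are missing the city_boundary field.
missing = client.count(
    index="airport_city_boundaries",
    query={"bool": {"must_not": {"exists": {"field": "city_boundary"}}}},
)
print("boundary docs without city_boundary:", missing["count"])

# Check 2: airports whose enrich lookup yields no region.
esql = """
FROM airports
| ENRICH city_boundaries ON city_location WITH region
| WHERE region IS NULL
| KEEP abbrev, name, city
"""
resp = client.esql.query(query=esql)
print(len(resp["values"]), "airports without a matching region")
```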
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Importing geospatial data using Kibana Importing the airports Importing the city boundaries Exploring geospatial data with dev-tools Visualizing geospatial data with Kibana Maps Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Ingest geospatial data into Elasticsearch with Kibana for use in ES|QL - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/geospatial-data-ingest-for-esql","meta_description":"Here's how to ingest geospatial data into Elasticsearch using Kibana. This blog also covers how to use ES|QL to search and visualize geospatial data. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Build RAG quickly with minimal code in Elastic 8.15 Learn how to build an end-to-end RAG pipeline with the S3 Connector, semantic_text datatype, and Elastic Playground. How To HC By: Han Xiang Choong On September 4, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Elastic 8.15 is out, and Semantic Search is easier than ever to pull off. We're going to cover how to accomplish all of these tasks in 15 minutes: Store your documents in some data storage service like an AWS S3 Bucket Set up an Elastic S3 Connector Upload an embedding model using the eland library , set-up an inference API in Elastic Connect that to an index that uses the semantic_text datatype Add your inference API to that index Configure and sync content with the S3 Connector Use the Elastic Playground immediately You will need: An Elastic Cloud Deployment updated to Elastic 8.15 An S3 bucket An LLM API service (Anthropic, Azure, OpenAI, Gemini) And that's it! Let's get this done. 
Collecting data To follow along with this specific demo, I've uploaded a zip file containing the data used here . It's the first 60 or so pages of the Silmarillion , each as a separate pdf file. I'm going through a Lord of the Rings kick at the moment. Feel free to download it and upload it to your S3 bucket! Splitting the document into individual pages is sometimes necessary for large documents, as the native Elastic S3 Connector will not ingest content from files over 10MB in size. I use this Python script for splitting a PDF into individual pages: Setting up the S3 connector The connector can ingest a huge variety of data types . Here, we're sticking to an S3 bucket loaded with pdf pages. My S3 Bucket I'll just hop on my Elastic Cloud deployment, go to Search->Content->Connectors, and make a new connector called aws-connector, with all the default settings. Then I'll open up the configuration and add the name of my bucket, and the secret key and access key tagged to my AWS user. Elastic Cloud S3 Connector Configuration Run a quick sync to verify that everything is working okay. Synchronization will ingest every uningested file in your data source, extract its content, and store it as a unique document within your index. Each document will contain its original filename. Data source documents with the same filenames as existing indexed documents won't be reingested, so have no fear! Synchronization can also be regularly scheduled. The method is described in the documentation. If everything is working fine, assuming my AWS credentials and permissions are all in order, the data's going to go into an index called aws-connector. First successful sync of our S3 connector Looks like it's all good. Let's grab our embedding model! Uploading an embedding model Eland is a Python Elasticsearch client which makes it easy to convert numpy, pandas, and scikit-learn functions to Elasticsearch powered equivalents. For our purposes, it will be our method of uploading models from HuggingFace, for deployment in our Elasticsearch cluster. You can install eland like so: Now get to a bash editor and make this little .sh script, filling out each parameter appropriately: MODEL_ID refers to a model taken from huggingface. I'm choosing all-MiniLM-L6-v2 mainly because it is very good, but also very small, and easily runnable on a CPU. Run the bash script, and once done, your model should appear in your Elastic deployment under Machine Learning -> Model Management -> Trained Models. Deploy the model you just uploaded with eland Just click the circled play button to deploy the model, and you're done. Setting up your semantic_text index Time to set up semantic search. Navigate to Management -> Dev Tools, and delete your index because it does not have the semantic_text datatype enabled. Check the model_id of your uploaded model with: Now create an inference endpoint called minilm-l6, and pass it the correct model_id. Let's not worry about num_allocations and num_threads, because this isn't production and minilm-l6 is not a big-boy. Now recreate the aws-connector index. Set the \"body\" property as type \"semantic_text\", and add the id of your new inference endpoint. Get back to your connector and run another full-content sync (For real this time!). The incoming documents are going to be automatically chunked into blocks of 250 words, with an overlap of 100 words. You don't have to do anything explicitly. Now that's convenient! Sync your S3 connector for real this time! And it's done. 
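The page-splitting script mentioned near the top of this article is not included in this extract. A minimal equivalent, assuming the pypdf library (the original may have used a different one), could look like this:

```python
from pathlib import Path

from pypdf import PdfReader, PdfWriter


def split_pdf(source: str, out_dir: str) -> None:
    """Write every page of `source` out as its own single-page PDF."""
    reader = PdfReader(source)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for i, page in enumerate(reader.pages, start=1):
        writer = PdfWriter()
        writer.add_page(page)
        with open(out / f"{Path(source).stem}_page_{i:03d}.pdf", "wb") as f:
            writer.write(f)


split_pdf("silmarillion.pdf", "silmarillion_pages")
```

Each single-page file stays comfortably under the connector's 10MB-per-file limit.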
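For the Dev Tools steps just described, a rough Python equivalent is sketched below. The model_id shown is the usual eland naming for all-MiniLM-L6-v2, and a raw request to the _inference API is used because older client versions may not ship a dedicated inference helper; both are assumptions to verify against your own deployment.

```python
from elasticsearch import Elasticsearch

# Hypothetical connection details -- substitute your own endpoint and API key.
client = Elasticsearch("https://my-deployment.es.example.com:443", api_key="<api-key>")

# Create the inference endpoint backed by the model uploaded with eland.
client.perform_request(
    "PUT",
    "/_inference/text_embedding/minilm-l6",
    headers={"accept": "application/json", "content-type": "application/json"},
    body={
        "service": "elasticsearch",
        "service_settings": {
            # Check Trained Models in Kibana for the exact id of your upload.
            "model_id": "sentence-transformers__all-minilm-l6-v2",
            "num_allocations": 1,
            "num_threads": 1,
        },
    },
)

# Recreate the connector index with `body` mapped as semantic_text,
# pointing at the inference endpoint created above.
client.indices.create(
    index="aws-connector",
    mappings={
        "properties": {
            "body": {"type": "semantic_text", "inference_id": "minilm-l6"},
        }
    },
)
```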
Check out your aws-connector index, there'll be 140 documents in there, each of which is now an embedded chunk: Index full of chunked documents Do RAG with the Elastic Playground Scurry over to Search -> Build -> Playground and add an LLM connector of your choice. I'm using Azure OpenAI: Set your endpoint and API key Now let's set up a chat experience. Click Add Data Sources and select aws-connector: Set up your chat experience Check out the query tab of your new chat experience. Assuming everything was properly set up, it will automatically be set to this hybrid search query, with the model_id minilm-l6. Default hybrid search query Let's ask a question! We'll take three documents for the context, and add my special RAG prompt: Add a prompt and select the number of search results for context Query: Describe the fall from Grace of Melkor We'll use a relatively open-ended RAG query. To be answered satisfactorily, it will need to draw information from multiple parts of the text. This will be a good indicator of whether RAG is working as expected. Well I'm convinced. It even has citations! One more for good luck: Query: Who were the greatest students of Aule the Smith? This particular query is nothing too difficult, I'm simply looking for a reference to a very specific quote from the text. Let's see how it does! Well, that's correct. Looks like RAG is working just fine. Conclusion That was incredibly convenient and painless — hot damn! We're truly living in the future. I can definitely work with this. I hope you're as excited to try it as I am to show it off. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Collecting data Setting up the S3 connector Uploading an embedding model Setting up your semantic_text index Do RAG with the Elastic Playground Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
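Playground writes the hybrid query for you, and the exact query it generates is not reproduced here. If you want to poke at the semantic_text field directly, outside Playground, a minimal query (assuming the semantic query type available in 8.15 and the index and field names used above) looks like this:

```python
from elasticsearch import Elasticsearch

# Hypothetical connection details -- substitute your own endpoint and API key.
client = Elasticsearch("https://my-deployment.es.example.com:443", api_key="<api-key>")

resp = client.search(
    index="aws-connector",
    size=3,
    query={"semantic": {"field": "body", "query": "Describe the fall from grace of Melkor"}},
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_id"])
```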
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Build RAG quickly with minimal code in Elastic 8.15 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/build-rag-in-elastic-815","meta_description":"Learn how to easily build a RAG pipeline with the S3 Connector, semantic_text datatype and Elastic Playground."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Serverless semantic search with ELSER in Python: Exploring Summer Olympic games history This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will load previous olympic games data set and then use the ELSER model to perform semantic searches. How To EK By: Essodjolo Kahanam On July 26, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will create a serverless Elasticsearch project, load previous olympic games data set into an index, generate inferred data (in a sparse vector field) using the inference processor along with ELSER model, and finally search for historical olympic competition information in a natural language expression, thanks to text expansion query . The tools and the data set For this project we will use an Elasticsearch serverless project, and the serverless Python client (elasticsearch_serverless) for interactions with Elasticsearch. To create a serverless project, simply follow the get started with serverless guide. More information on serverless including pricing can be found here . When setting up a serverless project, be sure to select the option for Elasticsearch and the general purpose option for working this tutorial. The data set used is that of summer olympic games competitors from 1896 to 2020, obtained from Kaggle ( Athletes_summer_games.csv ). It contains information about the competition year, the type of competition, the name of the participant, whether they won a medal or not and which medal eventually, along with other information. For the data set manipulation, we will use Eland , a Python client and toolkit for DataFrames and machine learning in Elasticsearch. Finally the natural language processing (NLP) model used is Elastic Learned Sparse EncodeR ( ELSER ), a retrieval model trained by Elastic that allows to retrieve more relevant search results through semantic search. Before following the steps below, please make sure you have installed the serverless Python client and Eland. Please note the versions I used below. If you are not using the same versions, you might need to adjust the code to any eventual syntax change in the versions you are using. Download and deploy ELSER model We will use the Python client to download and deploy the ELSER model. Before doing that, let's first confirm that we can connect to our serverless project. The URL and API key below are read from environment variables; you need to use the appropriate values in your case, or use whichever method you prefer for reading credentials. 
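The connection snippet itself is not preserved in this extract; a minimal version, with environment variable names chosen here purely for illustration, would be:

```python
import os

from elasticsearch_serverless import Elasticsearch

# ELASTICSEARCH_URL / ELASTIC_API_KEY are illustrative names -- use whatever
# mechanism you prefer for providing the serverless project URL and API key.
client = Elasticsearch(
    os.environ["ELASTICSEARCH_URL"],
    api_key=os.environ["ELASTIC_API_KEY"],
)

# A successful call returns basic project information.
print(client.info())
```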
If everything is properly configured, you should get an output like below: Now that we've confirmed that the Python client is successfully connecting to the serverless Elasticsearch project, let’s download and deploy the ELSER model. We will check if the model was previously deployed and delete it in order to perform a fresh install. Also, as the deploy phase could take a few minutes, we will continuously check the model configuration information to make sure that the model definition is present before moving to the next phase. For more information check the Get trained models API. Once we get the confirmation that the model is downloaded and ready to be deployed, we can go ahead and start ELSER. It can take a little while to fully be ready to be deployed. Load the data set into Elasticsearch using Eland eland.csv_to_eland allows reading a comma-separated values (csv) file into a data frame stored in an Elasticsearch index. We will use it to load the Olympics data ( Athletes_summer_games.csv ) into Elasticsearch. The es_type_overrides allows to override default mappings. After executing the lines above, the data will be written in the index elser-olympic-games . You can also retrieve the resulting dataframe ( eland.DataFrame ) into a variable for further manipulations. Create an ingest pipeline for inference based on ELSER The next step in our journey to explore past Olympic competition data using semantic search is to create an ingest pipeline containing an inference processor that runs the ELSER model. A set of fields has been selected and concatenated into a single field on which the inference processor will work. Depending on your use case, you might want to use another strategy. The concatenation is done using the script processor. The inference processor uses the previously deployed ELSER model, taking as input the concatenated field, and storing the output in a sparse vector type field (see following point). Preparing the index This is the last stage before being able to query past Olympic competition data using natural language expressions. We will update the previously created index’s mapping adding a sparse vector type field. Update the mapping: add a sparse vector field We will update the index mapping by adding a field that will hold the concatenated data, and a sparse vector field that will hold the inferred information computed by the inference processor using the ELSER model. Populate the sparse vector field We will run an update by query to call the previously created ingest pipeline in order to populate the sparse vector field in each document. The request will take a few moments depending on the number of documents, and the number of allocations and threads per allocation used for deploying ELSER. Once this step is completed, we can now start exploring past olympic data set using semantic search. Let's explore the Olympic data set using semantic search Now we will use text expansion queries to retrieve information about past Olympic game competitions using natural language expressions. Before going to the demonstration, let's create a function to retrieve and format the search results. The function above will receive a question about past Olympic games competition winners, performing a semantic search using Elastic’s text expansion query. The retrieved results are formatted and printed. Notice that we force the existence of medals in the query, as we are only interested in the winners. 
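The helper function itself is not included in this extract. A sketch of what it could look like is below; the sparse vector field name, the .elser_model_2 id, and the CSV column names (Name, Team, Event, Year, Medal) are assumptions to adapt to your own index:

```python
import os

from elasticsearch_serverless import Elasticsearch

client = Elasticsearch(os.environ["ELASTICSEARCH_URL"], api_key=os.environ["ELASTIC_API_KEY"])


def print_winners(question: str) -> None:
    """Semantic search for podium finishers matching a natural-language question."""
    resp = client.search(
        index="elser-olympic-games",
        size=3,  # we expect gold, silver and bronze
        query={
            "bool": {
                "must": {
                    "text_expansion": {
                        "content_embedding": {  # sparse vector field populated by the ingest pipeline
                            "model_id": ".elser_model_2",
                            "model_text": question,
                        }
                    }
                },
                "filter": {"exists": {"field": "Medal"}},  # winners only
            }
        },
    )
    for hit in resp["hits"]["hits"]:
        src = hit["_source"]
        print(src.get("Medal"), "-", src.get("Name"), "-", src.get("Team"), "-", src.get("Event"), src.get("Year"))
```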
We also limited the size of the result to 3 as we expect three winners (gold, silver, bronze). Again, based on your use case, you might not necessarily do the same thing. 🏌️‍♂️ “Who won the Golf competition in 1900?” Request: Output: 🏃‍♀️ “2004 Women's Marathon winners” Request: Output: 🏹 “Women archery winners of 1908” Request: Output: 🚴‍♂️ “Who won the individual cycling competition in 1972?” Request: Output: Conclusion This blog showed how you can perform semantic search with the Elastic Learned Sparse EncodeR (ELSER) NLP model, in Python programming language using Serverless. You will want to make sure you turn off severless after running this tutorial to avoid any extra charges. To go further, feel free to check out our Elasticsearch Relevance Engine (ESRE) Engineer course where you can learn how to leverage the Elasticsearch Relevance Engine (ESRE) and large language models (LLMs) to build advanced RAG (Retrieval-Augmented Generation) applications that combine the storage, processing, and search features of Elasticsearch with the generative power of an LLM. The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to The tools and the data set Download and deploy ELSER model Load the data set into Elasticsearch using Eland Create an ingest pipeline for inference based on ELSER Preparing the index Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Serverless semantic search with ELSER in Python: Exploring Summer Olympic games history - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/serverless-semantic-search-with-elser-in-python","meta_description":"This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will load previous olympic games data set and then use the ELSER model to perform semantic searches."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Text similarity search with vector fields This post explores how text embeddings and Elasticsearch’s new dense_vector type could be used to support similarity search. Vector Database JT By: Julie Tibshirani On October 6, 2022 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. From its beginnings as a recipe search engine , Elasticsearch was designed to provide fast and powerful full-text search. Given these roots, improving text search has been an important motivation for our ongoing work with vectors. In Elasticsearch 7.0, we introduced experimental field types for high-dimensional vectors, and now the 7.3 release brings support for using these vectors in document scoring. This post focuses on a particular technique called text similarity search. In this type of search, a user enters a short free-text query, and documents are ranked based on their similarity to the query. Text similarity can be useful in a variety of use cases: Question-answering: Given a collection of frequently asked questions, find questions that are similar to the one the user has entered. Article search: In a collection of research articles, return articles with a title that’s closely related to the user’s query. Image search: In a dataset of captioned images, find images whose caption is similar to the user’s description. A straightforward approach to similarity search would be to rank documents based on how many words they share with the query. But a document may be similar to the query even if they have very few words in common — a more robust notion of similarity would take into account its syntactic and semantic content as well. The natural language processing (NLP) community has developed a technique called text embedding that encodes words and sentences as numeric vectors. These vector representations are designed to capture the linguistic content of the text, and can be used to assess similarity between a query and a document. This post explores how text embeddings and Elasticsearch’s dense_vector type could be used to support similarity search. We’ll first give an overview of embedding techniques, then step through a simple prototype of similarity search using Elasticsearch. Note: Using text embeddings in search is a complex and evolving area. This blog is not a recommendation for a particular architecture or implementation. Start here to learn how you can enhance your search experience with the power of vector search . What are text embeddings? Let's take a closer look at different types of text embeddings, and how they compare to traditional search approaches. Word embeddings A word embedding model represents a word as a dense numeric vector. These vectors aim to capture semantic properties of the word — words whose vectors are close together should be similar in terms of semantic meaning. 
In a good embedding, directions in the vector space are tied to different aspects of the word’s meaning. As an example, the vector for \"Canada\" might be close to \"France\" in one direction, and close to \"Toronto\" in another. The NLP and search communities have been interested in vector representations of words for quite some time. There was a resurgence of interest in word embeddings in the past few years, when many traditional tasks were being revisited using neural networks. Some successful word embedding algorithms were developed, including word2vec and GloVe . These approaches make use of large text collections, and examine the context each word appears in to determine its vector representation: The word2vec Skip-gram model trains a neural network to predict the context words around a word in a sentence. The internal weights of the network give the word embeddings. In GloVe, the similarity of words depends on how frequently they appear with other context words. The algorithm trains a simple linear model on word co-occurrence counts. Many research groups distribute models that have been pre-trained on large text corpora like Wikipedia or Common Crawl, making them convenient to download and plug into downstream tasks. Although pre-trained versions are sometimes used directly, it can be helpful to adjust the model to fit the specific target dataset and task. This is often accomplished by running a 'fine-tuning' step on the pre-trained model. Word embeddings have proven quite robust and effective, and it is now common practice to use embeddings in place of individual tokens in NLP tasks like machine translation and sentiment classification. Sentence embeddings More recently, researchers have started to focus on embedding techniques that represent not only words, but longer sections of text. Most current approaches are based on complex neural network architectures, and sometimes incorporate labelled data during training to aid in capturing semantic information. Once trained, the models are able to take a sentence and produce a vector for each word in context, as well as a vector for the entire sentence. Similarly to word embedding, pre-trained versions of many models are available, allowing users to skip the expensive training process. While the training process can be very resource-intensive, invoking the model is much more lightweight — sentence embedding models are typically fast enough to be used as part of real-time applications. Some common sentence embedding techniques include InferSent , Universal Sentence Encoder , ELMo , and BERT . Improving word and sentence embeddings is an active area of research, and it’s likely that additional strong models will be introduced. Comparison to traditional search approaches In traditional information retrieval, a common way to represent text as a numeric vector is to assign one dimension for each word in the vocabulary. The vector for a piece of text is then based on the number of times each term in the vocabulary appears. This way of representing text is often referred to as \"bag of words,\" because we simply count word occurrences without regard to sentence structure. Text embeddings differ from traditional vector representations in some important ways: The encoded vectors are dense and relatively low-dimensional, often ranging from 100 to 1,000 dimensions. In contrast, bag of words vectors are sparse and can comprise 50,000+ dimensions. 
Embedding algorithms encode the text into a lower-dimensional space as part of modeling its semantic meaning. Ideally, synonymous words and phrases end up with a similar representation in the new vector space. Sentence embeddings can take the order of words into account when determining the vector representation. For example the phrase \"tune in\" may be mapped as a very different vector than \"in tune\". In practice, sentence embeddings often don’t generalize well to large sections of text. They are not commonly used to represent text longer than a short paragraph. Using embeddings for similarity search Let’s suppose we had a large collection of questions and answers. A user can ask a question, and we want to retrieve the most similar question in our collection to help them find an answer. We could use text embeddings to allow for retrieving similar questions: During indexing, each question is run through a sentence embedding model to produce a numeric vector. When a user enters a query, it is run through the same sentence embedding model to produce a vector. To rank the responses, we calculate the vector similarity between each question and the query vector. When comparing embedding vectors, it is common to use cosine similarity . This repository gives a simple example of how this could be accomplished in Elasticsearch. The main script indexes ~20,000 questions from the StackOverflow dataset , then allows the user to enter free-text queries against the dataset. We’ll soon walk through each part of the script in detail, but first let’s look at some example results. In many cases, the method is able to capture similarity even when there was not strong word overlap between the query and indexed question: \"zipping up files\" returns \"Compressing / Decompressing Folders & Files\" \"determine if something is an IP\" returns \"How do you tell whether a string is an IP or a hostname\" \"translate bytes to doubles\" returns \"Convert Bytes to Floating Point Numbers in Python\" Implementation details The script begins by downloading and creating the embedding model in TensorFlow. We chose Google’s Universal Sentence Encoder, but it’s possible to use many other embedding methods. The script uses the embedding model as-is, without any additional training or fine-tuning. Next, we create the Elasticsearch index, which includes mappings for the question title, tags, and also the question title encoded as a vector: In the mapping for dense_vector, we’re required to specify the number of dimensions the vectors will contain. When indexing a title_vector field, Elasticsearch will check that it has the same number of dimensions as specified in the mapping. To index documents, we run the question title through the embedding model to obtain a numeric array. This array is added to the document in the title_vector field. When a user enters a query, the text is first run through the same embedding model and stored in the parameter query_vector. As of 7.3, Elasticsearch provides a cosineSimilarity function in its native scripting language. So to rank questions based on their similarity to the user’s query, we use a script_score query: We make sure to pass the query vector as a script parameter to avoid recompiling the script () on every new query. Since Elasticsearch does not allow negative scores, it's necessary to add one to the cosine similarity. | Note: this blog post originally used a different syntax for vector functions that was available in Elasticsearch 7.3, but was deprecated in 7.6. 
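The mapping and query bodies from this walkthrough are not reproduced in this extract. A condensed sketch of both, in Python, is shown below; 512 dimensions matches the Universal Sentence Encoder output, and the placeholder vector stands in for the embedding model call:

```python
from elasticsearch import Elasticsearch

# Hypothetical connection details -- substitute your own endpoint and API key.
client = Elasticsearch("https://my-deployment.es.example.com:443", api_key="<api-key>")

# Index holding the question title, tags, and the title encoded as a dense vector.
client.indices.create(
    index="posts",
    mappings={
        "properties": {
            "title": {"type": "text"},
            "tags": {"type": "keyword"},
            "title_vector": {"type": "dense_vector", "dims": 512},
        }
    },
)

# In the real prototype this vector comes from running the user's query
# through the same sentence embedding model used at index time.
query_vector = [0.0] * 512

resp = client.search(
    index="posts",
    query={
        "script_score": {
            "query": {"match_all": {}},
            "script": {
                # cosineSimilarity ranges over [-1, 1]; add 1.0 because
                # Elasticsearch does not allow negative scores.
                "source": "cosineSimilarity(params.query_vector, 'title_vector') + 1.0",
                "params": {"query_vector": query_vector},
            },
        }
    },
)
```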
| Important limitations The script_score query is designed to wrap a restrictive query, and modify the scores of the documents it returns. However, we’ve provided a match_all query, which means the script will be run over all documents in the index. This is a current limitation of vector similarity in Elasticsearch — vectors can be used for scoring documents, but not in the initial retrieval step. Support for retrieval based on vector similarity is an important area of ongoing work . To avoid scanning over all documents and to maintain fast performance, the match_all query can be replaced with a more selective query. The right query to use for retrieval is likely to depend on the specific use case. While we saw some encouraging examples above, it’s important to note that the results can also be noisy and unintuitive. For example, \"zipping up files\" also assigns high scores to \"Partial .csproj Files\" and \"How to avoid .pyc files?\". And when the method returns surprising results, it is not always clear how to debug the issue — the meaning of each vector component is often opaque and doesn’t correspond to an interpretable concept. With traditional scoring techniques based on word overlap, it is often easier to answer the question \"why is this document ranked highly?\" As mentioned earlier, this prototype is meant as an example of how embedding models could be used with vector fields, and not as a production-ready solution. When developing a new search strategy, it is critical to test how the approach performs on your own data, making sure to compare against a strong baseline like a match query. It may be necessary to make major changes to the strategy before it achieves solid results, including fine-tuning the embedding model for the target dataset, or trying different ways of incorporating embeddings such as word-level query expansion. Conclusions Embedding techniques provide a powerful way to capture the linguistic content of a piece of text. By indexing embeddings and scoring based on vector distance, we can compare documents using a notion of similarity that goes beyond their word-level overlap. We’re looking forward to introducing more functionality based around the vector field type. Using vectors for search is a nuanced and developing area — as always, we would love to hear about your use cases and experiences on Github and the Discuss forums ! Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. 
TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to What are text embeddings? Word embeddings Sentence embeddings Comparison to traditional search approaches Using embeddings for similarity search Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Text similarity search with vector fields - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/text-similarity-search-with-vectors-in-elasticsearch","meta_description":"This post explores how text embeddings and Elasticsearch’s new dense_vector type could be used to support similarity search."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Optimizing vector distance computations with the Foreign Function & Memory (FFM) API Learn how to optimize vector distance computations using the Foreign Function & Memory (FFM) API to achieve faster performance. Lucene Vector Database CH By: Chris Hegarty On February 23, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. At the heart of any vector database are the distance functions that determine how close two vectors are. These distance functions are executed many times, both during indexing and searching. When merging segments or navigating the graph for nearest neighbors, much of the execution time is spent comparing vectors for similarity. Micro optimizing these distance functions is time well spent, we're already benefiting from similar previous optimizations, e.g. see SIMD , FMA . With the recent support for scalar quantization in both Lucene and Elasticsearch, we're now more than ever leaning on the byte variants of these distance functions. We know from previous experience that there's still the potential for significant performance improvements in these variants. Current state of play: The Panama Vector API When we leveraged the Panama Vector API to accelerate the distance functions in Lucene , much of the focus was on the float (32-bit) variants. We were quite happy with the performance improvements we managed to achieve for these. However, the improvements for the byte (8-bit) variants was a little disappointing - and believe me, we tried! The fundamental problem with the byte variants is that they do not take full advantage of the most optimal SIMD instructions available on the CPU. 
When doing arithmetic operations in Java, the narrowest type is int (32-bit). The JVM automatically sign-extends byte values to values of type int . Consider this simple scalar dot product implementation: The multiplication of elements from a and b is performed as if a and b are of type int , whose value is the byte value loaded from the appropriate array index sign-extended to int . Our SIMD-ized implementation must be equivalent, so we need to be careful to ensure that overflows when multiplying large byte values are not lost. We do this by explicitly widening the loaded byte values to short (16-bit), since we know that all signed byte values when multiplied will fit without loss into signed short . We then need a further widen to int (32-bit) when accumulating. Here's an excerpt from the inner loop body of Lucene's 128-bit dot product code: Visualizing this we can see that we're only processing 4 elements at a time. e.g. This is all fine, even with these explicit widening conversations, we get some nice speed up through the extra data parallelism of the arithmetic operations, just not as much as we know is possible. The reason we know that there is potential left is that each widening halves the number of lanes, which effectively halves the number of arithmetic operations. The explicit widening conversations are not being optimized by the JVM's C2 JIT compiler. Additionally, we're only accessing the lower half of the data - accessing anything other than the lower half just does not result in good machine code. This is where we're leaving potential performance \"on the table\". For now, this is as good as we can do in Java. Longer term, the Panama Vector API and/or C2 JIT compiler should provide better support for such operations, but for now, at least, this is as good as we can do. Or is it? Introducing the Foreign Function & Memory (FFM) API OpenJDK's project Panama has several different strands, we've already seen the Panama Vector API in action, but the flagship of the project is the Foreign Function & Memory API (FFM). The FFM API offers a low overhead for interacting with code and memory outside the Java runtime. The JVM is an amazing piece of engineering, abstracting away much of the differences between architectures and platforms, but sometimes it's not always possible for it to make the best tradeoffs, which is understandable. FFM can rescue us when the JVM cannot easily do so, by allowing the programmer to take things into her own hands if she doesn't like the tradeoff that's been made. This is one such area, where the tradeoff of the Panama Vector API is not the right one for byte sized vectors. FFM usage example We're already leveraging the foreign memory support in Lucene to mediate safer access to mapped off-heap index data. Why not use the foreign invocation support to call already optimized distance computation functions? Since our distance computation functions are tiny, and for some set of deployments and architectures for which we already know the optimal set of CPU instructions, why not just write the small block of native code that we want. Then invoke it through the foreign invocation API. Going foreign Elastic Cloud has a profile that is optimized for vector search. This profile targets the ARM architecture, so let's take a look at how we might optimize for this. Let's write our distance function, say dot product, in C with some ARM Neon intrinsics. Again, we'll focus on the inner body of the loop. 
Here's what that looks like: We load 16 8-bit values from our a and b vectors into va8 and vb8 , respectively. We then multiply the lower half and store the result in va16 - this result holds 8 16-bit values and the operation implicitly handles the widening. Similar with the higher half. Finally, since we operated on the full original 16 values, it's faster to use to two accumulators to store the results. The vpadalq_s16 add and accumulate intrinsic knows how to widen implicitly as it accumulates into 4 32-bit values. In summary, we've operated on all 16 byte values per loop iteration. Nice! The disassembly for this is very clean and mirrors the above instrinsics. Neon SIMD on ARM has arithmetic instructions that offer the semantics we want without having to do the extra explicit widening. The C instrinsics expose these instructions for use in a way that we can leverage. The operations on registers densely packed with values is much cleaner than what we can do with the Panama Vector API. Back in Java-land The last piece of the puzzle is a small \"shim\" layer in Java that uses the FFM API to link to our foreign code. Our vector data is off-heap, we map it with a MemorySegment , and determine offsets and memory addresses based on the vector dimensions. The dot product method looks like this: We have a little more work to do here since this is now platform-specific Java code, so we only execute it on aarch64 platforms, falling back to an alternative implementation on other platforms. So is it actually faster than the Panama Vector code? Performance improvements with FFM API Micro benchmarks of the above dot product for signed byte values show a performance improvement of approximately 6 times, than that of the Panama Vector code. And this includes the overhead of the foreign call. The primary reason for the speedup is that we're able to pack the full 128-bit register with values and operate on all of them without explicitly moving or widening the data. Macro benchmarks, SO_Dense_Vector with scalar quantization enabled, shows significant improvements in merge times, approximately 3 times faster - the experiment only plugged in the optimized dot product for segment merges. We expect search benchmarks to show improvement too. Summary Recent advancements in Java, namely the FFM API, allows to interoperate with native code in a much more performant and straightforward way than was previously possible. Significant performance benefits can be had by providing micro-optimized platform-specific vector distance functions that are called through FFM. We're looking forward to a future version of Elasticsearch where scalar quantized vectors can take advantage of this performance improvement. And of course, we're giving a lot of thought to how this relates to Lucene and even the Panama Vector API, to determine how these can be improved too. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. 
OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Integrations Vector Database March 19, 2025 Exploring GPU-accelerated Vector Search in Elasticsearch with NVIDIA Powered by NVIDIA cuVS, the collaboration looks to provide developers with GPU-acceleration for vector search in Elasticsearch. CH HM By: Chris Hegarty and Hemant Malik Jump to Current state of play: The Panama Vector API Introducing the Foreign Function & Memory (FFM) API FFM usage example Going foreign Back in Java-land Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Optimizing vector distance computations with the Foreign Function & Memory (FFM) API - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/vector-similarity-computations-ludicrous-speed","meta_description":"Learn how to optimize vector distance computations using the Foreign Function & Memory (FFM) API to achieve faster performance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Open Crawler now in beta The Open Crawler is now in beta. This latest version 0.2 update also comes with several new features. Ingestion NF By: Navarone Feekery On September 17, 2024 Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. We have released version 0.2 of Open Crawler, which has also been promoted to beta ! Open Crawler was initially released (version 0.1 ) in June 2024 as a tech-preview . Since then, we've been iterating on the product and have added several new features. To get access to these changes you can use the latest Docker artifact , or download the source code directly from the GitHub repository . Follow the setup instructions in our documentation to get started. What's new in the Open Crawler? A list of every change can be found in our changelog in the Open Crawler repository. In this blog we will go over only new features, and configuration format changes Open Crawler features Feature Description Extraction rules Allows for the extraction of HTML content using CSS and XPath selectors, and URL content using regex. Binary content extraction Allows for the extraction of binary content from file types supported by the Apache Tika project. 
Crawl rules Used to enable or disable certain URL patterns from being crawled and ingested. Purge crawls Deletes outdated documents from the index at the end of a crawl job. Scheduling Recurrent crawls can be scheduled based on a cron expression. Configuration changes in the Open Crawler Among the new features, the config.yml file format has changed for a few fields, so existing configuration files will not work between 0.1 and 0.2 . Notably, the configuration field domain_allowlist has been changed to domains , and seed_urls is now a subset of domains instead of a top-level field. This change was made so new features like extraction rules and crawl rules could be applied to specific domains, while allowing a single crawler to configure multiple domains for a crawl job. Make sure to reference the updated config.yml.example file to fix your configuration. Here is an example for migrating a 0.1 configuration to 0.2 : Showcase 1: crawl rules We're very excited to bring the crawl rules feature to Open Crawler. This is an existing feature in the Elastic Crawler . The biggest difference for crawl rules between Open Crawler and Elastic Crawler is the way these rules are configured. Elastic Crawler is configured using Kibana, while Open Crawler has crawl rules defined for a domain in the crawler.yml config file. Crawling only specific endpoints When determining if a URL is crawlable, Open Crawler will execute crawl rules in order from top to bottom. In this example below, we want to crawl only the content of https://www.elastic.co/search-labs . Because this has links to other URLs within the https://www.elastic.co domain, it's not enough to limit just the seed_urls to this entry point. Using crawl rules, we need two more rules: An allow rule for everything under (and including) the /search-labs URL pattern A deny everything rule to catch all other URLs In this example we are using a regex pattern for the deny rule. If I want to add another URL to ingest to this configuration (for example, /security-labs ), I need to: Add it as a seed_url Add it to the crawl_rules above the deny all rule Through this manner of configuration, you can be very specific about what webpages Crawler will ingest. If you have debug logs enabled , each denied URL will show up in the logs like this: Here's an actual example from my crawl results: Crawling everything except a specific endpoint This pattern is much easier to implement, as Crawler will crawl everything by default. All that is needed is to add a deny rule for URL pattern that you want to exclude. In this example, I want to crawl the entire https://www.elastic.co website, except for anything under /search-labs . Because I want to crawl everything , seed_urls is not needed for this configuration. Now if I run a crawl, Crawler will not ingest webpages with URLs that begins with /search-labs . Showcase 2: extraction rules Extraction rules are another much-asked-for feature for Open Crawler. Like crawl rules, extraction rules function almost the same as they do for Elastic Crawler , except for how they are configured. Extraction rules are configured under extraction_rulesets , which belong to a single item from domains . Getting the CSS selector For this example, I want to extract the authors' names for each blog article in /search-labs and assign it to the field authors . Without extraction rules, each blog's Elasticsearch document will have the author names buried in the body field. 
Using my browser developer tools (in my case, the Firefox dev tools), I can visit the webpage and use the selector tool to find what CSS selectors an HTML element has. I can now see that the authors are stored in an
element with a few different classes, but most eye-catching is the class .author-name . Now, to test that using the selector .author-name is enough to fetch only the author name from this field, I can use the dev tools HTML search feature . Unfortunately, I can see that using only this class name returns 11 results for this blog post. After some investigation, I found that this is because the \"Recommended articles\" section at the bottom of a page also uses the .author-name class. To remedy this, we need a more restrictive selector. Examining the HTML code directly, I can see that the side-bar containing the author name that I want to extract is nested a few levels under a class called .sticky . This class refers to the sidebar that contains the author name I want to extract. We can combine these selectors into a single selector .sticky .author-name that will only search for .author-name classes that are nested within .sticky classes. We can then test this in the same HTML search bar as before, and ta-da ! Only one hit -- we've found our CSS selector! Configuring the extraction rules Now we can add the CSS selector from the previous step. We also need to define the url_filters for this rule. This will determine which endpoints the extraction rule is executed against. All articles for search labs fall under the format https://www.elastic.co/search-labs/blog/ , so this can be achieved with a simple regex pattern: /search-labs/blog/.+$ . /search-labs/blog/ asserts the start of the URL .+ matches any character except line breaks $ marks the end of the string This stops sub-URLs like https://www.elastic.co/search-labs/blog// from having this extraction rule In this example we will also utilize crawl rules, to avoid crawling the entire https://www.elastic.co website. After completing a crawl with the above configuration, I can check for the new author field in the ingested documents. I can do this using a _search query to find articles written by the author Sebastien Guilloux . And we have a single hit! Showcase 3: combining it all with Semantic Text Jeff Vestal wrote a fantastic article combining Open Crawler with Semantic Text search, among other cool RAG things. Read up on that here . Comparing with Elastic Crawler We now maintain a feature comparison table on the Open Crawler repository to compare the features available for Open Crawler vs Elastic Crawler. Open Crawler next steps The next release will bring Open Crawler to version 1.0 , and will also promote it to GA (generally available). We don't have a release date planned for this version yet. We do have a general idea of some features we want to include: Extraction using data attributes and meta tags Full HTML extraction Send event logs to Elasticsearch This list is not exhaustive, and depending on user feedback we will include other features in the 1.0 GA release. If there are other features you would like to see included, feel free to create an enhancement issue directly on the Open Crawler repository. Feedback like this will help us prioritize what to include in the next release. Report an issue Related content Integrations Ingestion +1 March 7, 2025 Ingesting data with BigQuery Learn how to index and search Google BigQuery data in Elasticsearch using Python. JR By: Jeffrey Rengifo Integrations Ingestion +1 February 19, 2025 Elasticsearch autocomplete search Exploring different approaches to handling autocomplete, from basic to advanced, including search as you type, query time, completion suggester, and index time. 
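The verification query is not shown in this extract; one way to run it from Python is below. The crawler's output index name here is made up, so use whatever output_index your crawler configuration points at:

```python
from elasticsearch import Elasticsearch

# Hypothetical connection details -- substitute your own endpoint and API key.
client = Elasticsearch("https://my-deployment.es.example.com:443", api_key="<api-key>")

# Find crawled articles whose extracted `authors` field matches a specific author.
resp = client.search(
    index="search-labs-crawl",  # hypothetical output index for this crawl
    query={"match": {"authors": "Sebastien Guilloux"}},
)
print(resp["hits"]["total"]["value"], "hit(s)")
for hit in resp["hits"]["hits"]:
    print(hit["_source"].get("url"), "-", hit["_source"].get("authors"))
```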
AK By: Amit Khandelwal Integrations Ingestion +1 February 18, 2025 Exploring CLIP alternatives Analyzing alternatives to the CLIP model for image-to-image, and text-to-image search. JR TM By: Jeffrey Rengifo and Tomás Murúa Ingestion How To February 4, 2025 How to ingest data to Elasticsearch through Logstash A step-by-step guide to integrating Logstash with Elasticsearch for efficient data ingestion, indexing, and search. AL By: Andre Luiz Integrations Ingestion +1 February 3, 2025 Elastic Playground: Using Elastic connectors to chat with your data Learn how to use Elastic connectors and Playground to chat with your data. We'll start by using connectors to search for information in different sources. JR TM By: Jeffrey Rengifo and Tomás Murúa Jump to What's new in the Open Crawler? Open Crawler features Configuration changes in the Open Crawler Showcase 1: crawl rules Crawling only specific endpoints Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Open Crawler now in beta - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-open-crawler-beta-release","meta_description":"Elastic's Open Crawler version 0.2 is in beta. Explore the Open Crawler features, configuration changes and its crawl & extraction rules."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Scalar quantization 101 Understand what scalar quantization is, how it works and its benefits. This guide also covers the math behind quantization and examples. Lucene ML Research BT By: Benjamin Trent On October 25, 2023 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to scalar quantization Most embedding models output f l o a t 32 float32 f l o a t 32 vector values. While this provides the highest fidelity, it is wasteful given the information that is actually important in the vector. Within a given data set, embeddings never require all 2 billion options for each individual dimension. This is especially true on higher dimensional vectors (e.g. 386 dimensions and higher). Quantization allows for vectors to be encoded in a lossy manner, thus reducing fidelity slightly with huge space savings. Understanding buckets in scalar quantization Scalar quantization takes each vector dimension and buckets them into some smaller data type. For the rest of the blog, we will assume quantizing f l o a t 32 float32 f l o a t 32 values into i n t 8 int8 in t 8 . To bucket values accurately, it isn't as simple as rounding the floating point values to the nearest integer. 
Many models output vectors that have dimensions continuously on the range [-1.0, 1.0]. So, two different vector values 0.123 and 0.321 could both be rounded down to 0. Ultimately, a vector would only use 2 of its 255 available buckets in int8, losing too much information. Figure 1: Illustration of quantization goals, bucketing continuous values from -1.0 to 1.0 into discrete int8 values. The math behind the numerical transformation isn't too complicated. Since we can calculate the minimum and maximum values for the floating point range, we can use min-max normalization and then linearly shift the values: int8 ≈ 127/(max - min) × (float32 - min) and float32 ≈ (max - min)/127 × int8 + min. Figure 2: Equations for transforming between int8 and float32. Note, these are lossy transformations and not exact. In the following examples, we are only using positive values within int8. This aligns with the Lucene implementation.
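To make that transformation concrete, here is a minimal NumPy sketch of the two equations above. It is purely illustrative (the function names and the clamp to [0, 127] are assumptions of this sketch, not Lucene's implementation), but it shows how each float32 component is mapped to a small integer and approximately recovered.

```python
import numpy as np

def quantize(vec: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Map float32 components in [lo, hi] onto the 0..127 bucket range (illustrative sketch)."""
    alpha = (hi - lo) / 127.0
    return np.clip(np.round((vec - lo) / alpha), 0, 127).astype(np.int8)

def dequantize(q: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Approximately reconstruct the original float32 values (lossy)."""
    alpha = (hi - lo) / 127.0
    return q.astype(np.float32) * alpha + lo

v = np.array([-0.75, 0.10, 0.86], dtype=np.float32)
q = quantize(v, lo=-1.0, hi=1.0)           # e.g. [16, 70, 118]
v_approx = dequantize(q, lo=-1.0, hi=1.0)  # close to v, but not exact
```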
The role of statistics in scalar quantization A quantile is a slice of a distribution that contains a certain percentage of the values. So, for example, it may be that 99% of our floating point values are between [-0.75, 0.86] instead of the true minimum and maximum values of [-1.0, 1.0]. Any values less than -0.75 or greater than 0.86 are considered outliers. If you include outliers when attempting to quantize results, you will have fewer available buckets for your most common values. And fewer buckets can mean less accuracy and thus greater loss of information. Figure 3: Illustration of the 99% confidence interval and the individual quantile values. 99% of all values fall within the range [-0.75, 0.86]. This is all well and good, but now that we know how to quantize values, how can we actually calculate distances between two quantized vectors? Is it as simple as a regular dot_product? The role of algebra in scalar quantization We are still missing one vital piece: how do we calculate the distance between two quantized vectors? While we haven't shied away from math yet in this blog, we are about to do a bunch more. Time to break out your pencils and try to remember polynomials and basic algebra. The basic requirement for dot_product and cosine similarity is being able to multiply floating point values together and sum up their results. We already know how to transform between float32 and int8 values, so what does multiplication look like with our transformations? float32_i × float32'_i ≈ ((max - min)/127 × int8_i + min) × ((max - min)/127 × int8'_i + min). We can then expand this multiplication and, to simplify, substitute α for (max - min)/127: α² × int8_i × int8'_i + α × int8_i × min + α × int8'_i × min + min². What makes this even more interesting is that only one part of this equation requires both values at the same time. However, dot_product isn't just two floats being multiplied, but all the floats for each dimension of the vector. With the vector dimension count dim in hand, all of the following can be pre-calculated at query time and storage time: dim × α², which is just dim × ((max - min)/127)², can be stored as a single float value. Σ_{i=0}^{dim-1} min × α × int8_i and Σ_{i=0}^{dim-1} min × α × int8'_i can be pre-calculated and stored as a single float value or calculated once at query time. dim × min² can be pre-calculated and stored as a single float value. Putting it all together: dim × α² × dotProduct(int8, int8') + Σ_{i=0}^{dim-1} min × α × int8_i + Σ_{i=0}^{dim-1} min × α × int8'_i + dim × min². The only calculation required at query time is dotProduct(int8, int8'), which is then combined with the pre-calculated values. Ensuring accuracy in quantization So, how is this accurate at all? Aren't we losing information by quantizing? Yes, we are, but quantization takes advantage of the fact that we don't need all the information. For learned embedding models, the distributions of the various dimensions usually don't have fat tails. This means they are localized and fairly consistent. Additionally, the error introduced per dimension via quantization is independent, meaning the error cancels out for our typical vector operations like dot_product. Conclusion Whew, that was a ton to cover. But now you have a good grasp of the technical benefits of quantization, the math behind it, and how you can calculate the distances between vectors while accounting for the linear transformation. Look next at how we implemented this in Lucene and some of the unique challenges and benefits available there. Report an issue Related content Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we've been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs.
TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Jump to Introduction to scalar quantization Understanding buckets in scalar quantization The role of statistics in scalar quantization The role of algebra in scalar quantization Ensuring accuracy in quantization Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Scalar quantization 101 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/scalar-quantization-101","meta_description":"Understand what scalar quantization is, how it works and its benefits. This guide also covers the math behind quantization and examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Apache Lucene 10 is out! Improvements to Lucene's hardware efficiency & more Apache Lucene 10 has been released, with a focus on hardware efficiency! Check out the main release highlights. Lucene AG By: Adrien Grand On October 14, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Apache Lucene 10 is finally out! With more than 2,000 commits from 185 unique contributors since Lucene 9.0 - which was released in December 2021 , almost 3 years ago - a lot has been going on. To be fair, a majority of these changes have been delivered in 9.x minor releases. However, the most ambitious changes usually need a major version, such as the introduction of multi-dimensional points in Lucene 6.0, dynamic pruning in 8.0, or vector search in 9.0. In 10.0, the area of focus for Lucene has been hardware efficiency, ie. making Lucene better at taking advantage of modern hardware. Let me guide you through the main release highlights. 
Lucene 10 release highlights More search parallelism For many years now, Lucene has had the ability to parallelize search execution, by creating groups of segments, searching each group in a different thread, and combining results in the end. One downside of this approach is that it couples the index geometry - how the index is organized into segments - and search parallelism. For instance, an index that has been force-merged down to a single segment can no longer take advantage of multiple execution threads to be searched. Quite disappointing when modern CPUs commonly have tens of cores! To overcome this limitation, Lucene's query evaluation logic now allows splitting the index into logical partitions, that no longer need to be aligned with segments. For instance, an index that has been force-merged down to a single segment could still be sliced into 10 logical partitions that have one tenth of the documents of this segment each. This change will help increase search parallelism, especially on machines that have many cores and/or on indexes that have few segments on their highest tier. This change doesn't work nicely with queries that have a high cost for creating a Scorer yet - such as range queries and prefix queries, but we're hoping to lift this limitation in upcoming minor releases. Better I/O parallelism Until now, Lucene would use synchronous I/O and perform at most one I/O operation at a time per search thread. For indexes that significantly exceed the size of the page cache, this could lead to queries being bound on I/O latency, while the host is still very far from maxing out IOPS. Frustrating! To overcome this, Lucene's Directory abstraction introduced a new IndexInput#prefetch API, to let the OS know about regions of files that it is about to read. The OS can then parallelize retrieving pages that intersect with these regions, within a single OS thread. For instance, a BooleanQuery with TermQuery clauses would now perform the I/O of terms dictionary lookups in parallel and then retrieve the first couple pages of each postings list in parallel, within a single execution thread. MMapDirectory , Lucene's default Directory implementation, implements this prefetch API using madvise 's MADV_WILLNEED advice on Linux and Mac OS. We are very excited about this change, which has already proved to help on fast local NVMe disks, and will further help on storage systems that have worse latencies while retaining good parallelism such as network-attached disks (GCP persistent storage, Amazon EBS, Azure managed disks) or even object storage (GCP Cloud storage, Amazon S3, Azure blob storage). Better CPU efficiency and storage efficiency with sparse indexing Lucene 10 introduces support for sparse indexing , sometimes called primary-key indexing or zone indexing in other data stores. The idea is simple: if your data is stored on disk in sorted order , then you can organize it into blocks, record the minimum and maximum values per block, and your queries will be able to take advantage of this information to skip blocks that don't intersect with the query, or to fully match blocks that are contained by the query. Only blocks that partially intersect with the query will need further inspection, and the challenge consists of picking the best index sort that will minimize the number of such blocks. Lucene's sparse indexes are currently implemented via 4 levels of blocks that have 4k, 32k, 256k and 2M docs each respectively. 
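As a rough illustration of the idea (not Lucene's actual data structures), a sparse or zone index can be sketched as per-block minimum and maximum values over sort-ordered data, with each block either skipped, fully matched, or inspected for a given range query. The Block type and function below are assumptions of this sketch only.

```python
from dataclasses import dataclass

@dataclass
class Block:
    min_val: int   # minimum of the indexed field within the block
    max_val: int   # maximum of the indexed field within the block

def zone_map_decisions(blocks: list[Block], lo: int, hi: int) -> list[str]:
    """Decide per block whether a range query [lo, hi] can skip it, match it fully, or must inspect it."""
    decisions = []
    for b in blocks:
        if b.max_val < lo or b.min_val > hi:
            decisions.append("skip")        # no overlap with the query range
        elif lo <= b.min_val and b.max_val <= hi:
            decisions.append("match all")   # the whole block falls inside the range
        else:
            decisions.append("inspect")     # partial overlap, check documents individually
    return decisions
```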
When done right, this form of indexing is extremely space-efficient (only a few bytes per block) and CPU-efficient (can make a decision about whether thousands of documents match or not with only a few CPU instructions). The downside is that the index can only be stored in a single order on disk, so not all fields can benefit from it. Typically, the index would be sorted on the main dimensions of the data. For instance, for an e-commerce catalog containing products, these dimensions could be the category and the brand of the products. Conclusion Note that some hardware-efficiency-related changes have also been released in 9.x minor releases. In particular, it's worth highlighting that: Lucene now takes advantage of explicit vectorization when comparing vectors and decoding postings , Lucene's concurent search execution logic performs work stealing in order to reduce the overhead of forking tasks, Lucene's postings format has been updated to have a more sequential access pattern, Lucene now passes a MADV_RANDOM advice when opening files that have a random-access pattern. We are pretty excited about this new Lucene release and the hardware-efficiency focus. In case you are curious to learn more about these improvements, we will be writing more detailed blogs about them in the coming weeks. Stay tuned. Report an issue Related content Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Lucene Vector Database February 27, 2025 Filtered HNSW search, fast mode Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation. BT By: Benjamin Trent Lucene February 7, 2025 Concurrency bugs in Lucene: How to fix optimistic concurrency failures Thanks to Fray, a deterministic concurrency testing framework from CMU’s PASTA Lab, we tracked down a tricky Lucene bug and squashed it BT AL By: Benjamin Trent and Ao Li Vector Database Lucene +1 January 7, 2025 Early termination in HNSW for faster approximate KNN search Learn how HNSW can be made faster for KNN search, using smart early termination strategies. TT By: Tommaso Teofili Lucene Vector Database January 6, 2025 Optimized Scalar Quantization: Improving Better Binary Quantization (BBQ) Here we explain optimized scalar quantization in Elasticsearch and how we used it to improve Better Binary Quantization (BBQ). BT By: Benjamin Trent Jump to Lucene 10 release highlights More search parallelism Better I/O parallelism Better CPU efficiency and storage efficiency with sparse indexing Conclusion Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Apache Lucene 10 is out! Improvements to Lucene's hardware efficiency & more - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/apache-lucene-10-release-highlights","meta_description":"Apache Lucene 10 is here! Discover Lucene 10 release highlights: search parallelism, better I/O performance, and sparse indexing for better CPU and storage efficiency. "} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch .NET client evolution: From NEST to Elastic.Clients.Elasticsearch Learn about the evolution of the Elasticsearch .NET client and the transition from NEST to Elastic.Clients.Elasticsearch. .NET How To FB By: Florian Bernd On April 16, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction to .NET client and NEST In the .NET world, integration with Elasticsearch has long been facilitated by the NEST library, which serves as a robust interface for developers to interact with Elasticsearch's powerful search and analytics capabilities. NEST , born out of a need for a native .NET client for Elasticsearch, quickly gained popularity among developers for its rich feature set and seamless integration capabilities. For nearly 14 years and only 8 months after Elasticsearch's first commit NEST has been faithfully tracking Elasticsearch releases. Transitioning from NEST to Elastic.Clients.Elasticsearch As Elasticsearch evolved, maintaining NEST 's complex codebase became increasingly difficult. We recognized the need for a more sustainable approach to client development and went on a journey to redesign the .NET client from the ground up. It took us almost a year to release a first beta version and another year to get close to supporting every single server endpoint. One of the most difficult decisions was to reduce the scope of the library in order to prioritize maintainability instead. Given the size of the Elasticsearch API surface today, it is no longer practical to maintain over 450 endpoints and nearly 3000 types (requests, responses, queries, aggregations, etc.) by hand. To ensure consistent, accurate, and timely alignment between language clients and Elasticsearch, the 8.x clients, and many of the associated types are now automatically code-generated from a shared specification . This is a common solution to maintaining alignment between client and server among SDKs and libraries, such as those for Azure, AWS and the Google Cloud Platform. The Elasticsearch specification was created over 8 years ago by exporting the type mappings from NEST and through the hard work of the clients team we can now use the same specification to create a new .NET client (and clients for multiple other languages like Java, Go, etc.). With the release of version 8.13, the deprecation of NEST was officially announced. As Elasticsearch transitions to Elastic.Clients.Elasticsearch , NEST will gradually phase out, reaching its end-of-life at the close of the year. Developers are strongly encouraged to commence migration efforts early to ensure a smooth transition and mitigate any potential disruptions. 
Embracing Elastic.Clients.Elasticsearch not only ensures compatibility with the latest server features but also future-proofs applications against deprecated functionality. Elastic.Clients.Elasticsearch: features and changes overview Switching to the v8 client Elastic.Clients.Elasticsearch enables access to all the new features of Elasticsearch 8 and also brings numerous modernizations to the library itself but also implies a reduction in convenience features compared to its predecessor. Some of the new core features include the query language ES|QL , modern machine learning (ML) capabilities and improved diagnostics in the form of OpenTelemetry-compatible activities. Starting with version 8.13, Elastic.Clients.Elasticsearch supports almost all server features of Elasticsearch 8. An important breaking change, for example, is related to aggregations. In NEST , the fluent API usage looks like this: while the v8 client requires the following syntax: Migrating from NEST v7 to .NET client v8 A comprehensive migration guide is available here: Migration guide: From NEST v7 to .NET Client v8 . Additional resources Elastic.Clients.Elasticsearch v8 Client on GitHub Elastic.Clients.Elasticsearch v8 Client on NuGet Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Introduction to .NET client and NEST Transitioning from NEST to Elastic.Clients.Elasticsearch Elastic.Clients.Elasticsearch: features and changes overview Migrating from NEST v7 to .NET client v8 Additional resources Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Elasticsearch .NET client evolution: From NEST to Elastic.Clients.Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/net-client-evolution","meta_description":"Learn about the evolution of the Elasticsearch .NET client and the transition from NEST to Elastic.Clients.Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog RaBitQ binary quantization 101 Understand the most critical components of RaBitQ binary quantization, how it works and its benefits. This guide also covers the math behind the quantization and examples. Vector Database Lucene ML Research JW By: John Wagster On October 22, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Introduction As we have discussed previously in Scalar Quantization 101 most embedding models output f l o a t 32 float32 f l o a t 32 vector values, which is often excessive to represent the vector space. Scalar quantization techniques greatly reduce the space needed to represent these vectors. We've also previously talked about Bit Vectors In Elasticsearch and how binary quantization can often be unacceptably lossy. With binary quantization techniques such as those presented in the RaBitQ paper we can address the problems associated with naively quantizing into a bit vector and maintain the quality associated with scalar quantization by more thoughtfully subdividing the space and retaining residuals of the transformation. These newer techniques allow for better optimizations and generally better results over other similar techniques like product quantization (PQ) in distance calculations and a 32x level of compression that typically is not possible with scalar quantization. Here we'll walk through some of the core aspects of binary quantization and leave the mathematical details to the RaBitQ paper . Building the Bit Vectors Because we can more efficiently pre-compute some aspects of the distance computation, we treat the indexing and query construction separately. To start with let's walk through indexing three very simple 2 dimensional vectors, v 1 v_1 v 1 ​ , v 2 v_2 v 2 ​ , and v 3 v_3 v 3 ​ , to see how they are transformed and stored for efficient distance computations at query time. v 1 = [ 0.56 , 0.82 ] v_1 = [0.56, 0.82] v 1 ​ = [ 0.56 , 0.82 ] v 2 = [ 1.23 , 0.71 ] v_2 = [1.23, 0.71] v 2 ​ = [ 1.23 , 0.71 ] v 3 = [ - 3.28 , 2.13 ] v_3 = [\\text{-}3.28, 2.13] v 3 ​ = [ - 3.28 , 2.13 ] Our objective is to transform these vectors into much smaller representations that allow: A reasonable proxy for estimating distance rapidly Some guarantees about how vectors are distributed in the space for better control over the total number of data vectors needed to recall the true nearest neighbors We can achieve this by: Shifting each vector to within a hyper-sphere, in our case a 2d circle, the unit circle Snapping each vector to a single representative point within each region of the circle Retaining corrective factors to better approximate the distance between each vector and the query vector Let's unpack that step by step. Find a Representative Centroid In order to partition each dimension we need to pick a pivot point. For simplicity, we'll select one point to use to transform all of our data vectors. Let's continue with our vectors, v 1 v_1 v 1 ​ , v 2 v_2 v 2 ​ , and v 3 v_3 v 3 ​ . 
We pick their centroid as the pivot point. v 1 = [ 0.56 , 0.82 ] v 2 = [ 1.23 , 0.71 ] v 3 = [ - 3.28 , 2.13 ] c = [ ( 0.56 + 1.23 + - 3.28 ) / 3 , ( 0.82 + 0.71 + 2.13 ) / 3 ] c = [ - 0.49 , 1.22 ] v_1 = [0.56, 0.82] \\newline v_2 = [1.23, 0.71] \\newline v_3 = [\\text{-}3.28, 2.13] \\newline ~\\\\ c = [(0.56 + 1.23 + \\text{-}3.28) / 3, (0.82 + 0.71 + 2.13) / 3] \\newline ~\\\\ c = [\\text{-}0.49, 1.22] v 1 ​ = [ 0.56 , 0.82 ] v 2 ​ = [ 1.23 , 0.71 ] v 3 ​ = [ - 3.28 , 2.13 ] c = [( 0.56 + 1.23 + - 3.28 ) /3 , ( 0.82 + 0.71 + 2.13 ) /3 ] c = [ - 0.49 , 1.22 ] Here's all of those points graphed together: Figure 1: graph of the example vectors and the derived centroid of those three vectors. Each residual vector is then normalized . We'll call these v c 1 v_{c1} v c 1 ​ , v c 2 v_{c2} v c 2 ​ , and v c 3 v_{c3} v c 3 ​ . v c 1 = ( v 1 − c ) / ∥ v 1 − c ∥ v c 2 = ( v 2 − c ) / ∥ v 2 − c ∥ v c 3 = ( v 3 − c ) / ∥ v 3 − c ∥ v_{c1} = (v_1 - c) / \\|v_1 - c\\| \\newline v_{c2} = (v_2 - c) / \\|v_2 - c\\| \\newline v_{c3} = (v_3 - c) / \\|v_3 - c\\| \\newline v c 1 ​ = ( v 1 ​ − c ) /∥ v 1 ​ − c ∥ v c 2 ​ = ( v 2 ​ − c ) /∥ v 2 ​ − c ∥ v c 3 ​ = ( v 3 ​ − c ) /∥ v 3 ​ − c ∥ Let's do the math for one of the vectors together: v c 1 = ( v 1 − c ) / ∥ v 1 − c ∥ v 1 − c = [ 0.56 , 0.82 ] − [ - 0.49 , 1.22 ] = [ 1.05 , - 0.39 ] ∥ v 1 − c ∥ = 1.13 v c 1 = ( v 1 − c ) / ∥ v 1 − c ∥ = ( [ 1.05 , - 0.39 ] ) / ∥ [ 1.05 , - 0.39 ] ∥ = ( [ 1.05 , - 0.39 ] ) / 1.13 = [ 0.94 , - 0.35 ] v_{c1} = (v_1 - c) / \\|v_1 - c\\| \\newline ~\\\\ \\begin{align*} v_1 - c &= [0.56, 0.82] - [\\text{-}0.49, 1.22] \\newline &= [1.05, \\text{-}0.39] \\end{align*} \\newline ~\\\\ \\|v_1 - c\\| = 1.13 \\newline ~\\\\ \\begin{align*} v_{c1} &= (v_1 - c) / \\|v_1 - c\\| \\newline &= ([1.05, \\text{-}0.39]) / \\|[1.05, \\text{-}0.39]\\| \\newline &= ([1.05, \\text{-}0.39]) / 1.13 \\newline &= [0.94, \\text{-}0.35] \\end{align*} \\newline v c 1 ​ = ( v 1 ​ − c ) /∥ v 1 ​ − c ∥ v 1 ​ − c ​ = [ 0.56 , 0.82 ] − [ - 0.49 , 1.22 ] = [ 1.05 , - 0.39 ] ​ ∥ v 1 ​ − c ∥ = 1.13 v c 1 ​ ​ = ( v 1 ​ − c ) /∥ v 1 ​ − c ∥ = ([ 1.05 , - 0.39 ]) /∥ [ 1.05 , - 0.39 ] ∥ = ([ 1.05 , - 0.39 ]) /1.13 = [ 0.94 , - 0.35 ] ​ And for each of the remaining vectors: v c 2 = [ 0.96 , - 0.28 ] v c 3 = [ - 0.95 , 0.31 ] v_{c2} = [0.96, \\text{-}0.28] \\newline v_{c3} = [\\text{-}0.95, 0.31] v c 2 ​ = [ 0.96 , - 0.28 ] v c 3 ​ = [ - 0.95 , 0.31 ] Let's see that transformation and normalization all together: Figure 2: Animation of the example vectors and the derived centroid transformed within the unit circle. As you may be able to see our points now sit on the unit circle around the centroid as if we had placed the centroid at 0,0. 1 Bit, 1 Bit Only Please With our data vectors centered and normalized, we can apply standard binary quantization encoding each transformed vector component with a 0 if it is negative and 1 if it is positive. In our 2 dimensional example this splits our unit circle into four quadrants and the binary vectors corresponding to v 1 c v_{1c} v 1 c ​ , v c 2 v_{c2} v c 2 ​ , and v c 3 v_{c3} v c 3 ​ become r 1 = [ 1 , 0 ] , r 2 = [ 1 , 0 ] , r 3 = [ 0 , 1 ] r_1 = [1, 0], r_2 = [1, 0], r_3 = [0, 1] r 1 ​ = [ 1 , 0 ] , r 2 ​ = [ 1 , 0 ] , r 3 ​ = [ 0 , 1 ] , respectively. 
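Pulling the centering, normalization, and one-bit encoding together, here is a small NumPy sketch of the steps so far, using the same three example vectors. It is an illustration of the idea only, not the Lucene or RaBitQ implementation.

```python
import numpy as np

# The three example data vectors and their centroid.
vectors = np.array([[0.56, 0.82], [1.23, 0.71], [-3.28, 2.13]], dtype=np.float32)
centroid = vectors.mean(axis=0)                            # roughly [-0.49, 1.22]

# Shift each vector by the centroid and normalize it onto the unit circle.
residuals = vectors - centroid
norms = np.linalg.norm(residuals, axis=1, keepdims=True)   # roughly [1.13, 1.79, 2.92]
centered = residuals / norms                               # v_c1, v_c2, v_c3

# One bit per dimension: 1 for positive components, 0 for negative ones.
bits = (centered > 0).astype(np.uint8)                     # r_1=[1,0], r_2=[1,0], r_3=[0,1]
```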
We finish quantizing each data vector by snapping it to a representative point within each region specifically picking a point equidistant from each axis on the unit circle: ± 1 d \\pm \\frac{1}{\\sqrt{d}} ± d ​ 1 ​ We'll denote each quantized vector as v ‾ 1 \\overline{v}_1 v 1 ​ , v ‾ 2 \\overline{v}_2 v 2 ​ , and v ‾ 3 \\overline{v}_3 v 3 ​ . So for instance if we snap v c 1 v_{c1} v c 1 ​ to its representative point within its region r 1 r_1 r 1 ​ we get: v ‾ 1 = 1 d ( 2 r 1 − 1 ) = 1 2 [ 1 , − 1 ] = [ 1 2 , − 1 2 ] \\begin{align*} \\overline{v}_1 &= \\frac{1}{\\sqrt{d}} (2 r_1 - 1) \\newline &= \\frac{1}{\\sqrt{2}} [1, -1] \\newline &= [\\frac{1}{\\sqrt{2}}, -\\frac{1}{\\sqrt{2}}] \\end{align*} v 1 ​ ​ = d ​ 1 ​ ( 2 r 1 ​ − 1 ) = 2 ​ 1 ​ [ 1 , − 1 ] = [ 2 ​ 1 ​ , − 2 ​ 1 ​ ] ​ And here are the quantized forms for the other data vectors, v 2 v_2 v 2 ​ and v 3 v_3 v 3 ​ : v ‾ 2 = [ 1 2 , − 1 2 ] v ‾ 3 = [ − 1 2 , 1 2 ] \\overline{v}_2 = [\\frac{1}{\\sqrt{2}}, -\\frac{1}{\\sqrt{2}}] \\newline \\overline{v}_3 = [-\\frac{1}{\\sqrt{2}}, \\frac{1}{\\sqrt{2}}] v 2 ​ = [ 2 ​ 1 ​ , − 2 ​ 1 ​ ] v 3 ​ = [ − 2 ​ 1 ​ , 2 ​ 1 ​ ] Picking these representative points has some nice mathematical properties as outlined in the RaBitQ paper . And are not unlike the codebooks seen in product quantization (PQ) . Figure 3: Binary quantized vectors within a region snapped to representative points. At this point we now have a 1 bit approximation; albeit a somewhat fuzzy one that we can use to do distance comparisons. Clearly, v ‾ 1 \\overline{v}_1 v 1 ​ and v ‾ 2 \\overline{v}_2 v 2 ​ are now identical in this quantized state, which is not ideal and is a similar problem experienced when we discussed encoding float vectors as Bit Vectors In Elasticsearch. The elegance of this is that at query time we can use something akin to a dot product to compare each data vector and each query vector rapidly for an approximation of distance. We'll see that in more detail when we discuss handling the query. The Catch As we saw above, a lot of information is lost when converting to bit vectors. We'll need some additional information to help compensate for the loss and correct our distance estimations. In order to recover fidelity we'll store the distance from each vector to the centroid and the projection (dot product) of the vector (e.g v c 1 v_{c1} v c 1 ​ ) with its quantized form (e.g. v 1 ‾ \\overline{v_1} v 1 ​ ​ ) as two f l o a t 32 float32 f l o a t 32 values. The Euclidean distance to the centroid is straight-forward and we already computed it when quantizing each vector: ∥ v 1 − c ∥ = 1.13 ∥ v 2 − c ∥ = 1.79 ∥ v 3 − c ∥ = 2.92 \\|v_1 - c\\| = 1.13 \\newline \\|v_2 - c\\| = 1.79 \\newline \\|v_3 - c\\| = 2.92 ∥ v 1 ​ − c ∥ = 1.13 ∥ v 2 ​ − c ∥ = 1.79 ∥ v 3 ​ − c ∥ = 2.92 Precomputing distances from each data vector to the centroid restores the transformation of centering the vectors. Similarly, we'll compute the distance from the query to the centroid. Intuitively the centroid acts as a go-between instead of directly computing the distance between the query and data vector. 
The dot product of the vector and the quantized vector is then: v c 1 ⋅ v ‾ 1 = v c 1 ⋅ 1 2 ( 2 r 1 − 1 ) = [ 0.94 , - 0.35 ] ⋅ [ 1 2 , − 1 2 ] = 0.90 \\begin{align*} v_{c1} \\cdot \\overline{v}_1 &= v_{c1} \\cdot \\frac{1}{\\sqrt{2}} (2 r_1 - 1) \\\\ &= [0.94, \\text{-}0.35] \\cdot \\left[\\frac{1}{\\sqrt{2}}, -\\frac{1}{\\sqrt{2}}\\right] \\\\ &= 0.90 \\end{align*} v c 1 ​ ⋅ v 1 ​ ​ = v c 1 ​ ⋅ 2 ​ 1 ​ ( 2 r 1 ​ − 1 ) = [ 0.94 , - 0.35 ] ⋅ [ 2 ​ 1 ​ , − 2 ​ 1 ​ ] = 0.90 ​ And for our other two vectors: v c 2 ⋅ v ‾ 2 = 0.95 v c 3 ⋅ v ‾ 3 = 0.89 v_{c2} \\cdot \\overline{v}_2 = 0.95 \\newline v_{c3} \\cdot \\overline{v}_3 = 0.89 v c 2 ​ ⋅ v 2 ​ = 0.95 v c 3 ​ ⋅ v 3 ​ = 0.89 The dot product between the quantized vector and the original vector, being the second corrective factor, captures for how far away the quantized vector is from its original position. In section 3.2 the RaBitQ paper shows there is a bias correcting the dot product between the quantized data and query vectors in a naive fashion. This factor exactly compensates it. Keep in mind we are doing this transformation to reduce the total size of the data vectors and reduce the cost of vector comparisons. These corrective factors while seemingly large in our 2d example become insignificant as vector dimensionality increases. For example, a 1024 dimensional vector if stored as f l o a t 32 float32 f l o a t 32 requires 4096 bytes. If stored with this bit compression and corrective factors, only 136 bytes are required. To better understand why we use these factors refer to the RaBitQ paper . It gives an in-depth treatment of the math involved. The Query q = [ 0.68 , - 1.72 ] q = [0.68, \\text{-}1.72] q = [ 0.68 , - 1.72 ] To be able to compare our quantized data vectors to a query vector we must first get the query vector into a quantized form and shift it relative to the unit circle. We'll refer to the query vector as q q q , the transformed vector as q c q_c q c ​ , and the scalar quantized vector as q ‾ \\overline{q} q ​ . q − c = ( 0.68 − - 0.49 ) , ( - 1.72 − 1.22 ) = [ 1.17 , − 2.95 ] q c = ( q − c ) / ∥ q − c ∥ = [ 1.17 , − 2.95 ] / 3.17 = [ 0.37 , − 0.92 ] \\begin{align*} q - c &= (0.68 - \\text{-}0.49), (\\text{-}1.72 - 1.22) \\newline &= [1.17, −2.95] \\end{align*} \\newline ~\\\\ \\begin{align*} q_c &= (q - c) / \\|q - c\\| \\newline &= [1.17, −2.95] / 3.17 \\newline &= [0.37, −0.92] \\end{align*} q − c ​ = ( 0.68 − - 0.49 ) , ( - 1.72 − 1.22 ) = [ 1.17 , − 2.95 ] ​ q c ​ ​ = ( q − c ) /∥ q − c ∥ = [ 1.17 , − 2.95 ] /3.17 = [ 0.37 , − 0.92 ] ​ Next we perform Scalar Quantization on the query vector down to 4 bits; we'll call this vector q ‾ \\overline{q} q ​ . It's worth noting that we do not quantize down to a bit representation but instead maintain an i n t 4 int4 in t 4 scalar quantization, q ‾ \\overline{q} q ​ as an int4 byte array, for estimating the distance. We can take advantage of this asymmetric quantization to retain more information without additional storage. 
l o w e r = - 0.92 u p p e r = 0.37 w i d t h = ( u p p e r − l o w e r ) / ( 2 4 − 1 ) ; = ( 0.37 − - 0.92 ) / 15 ; = 0.08 q ‾ = ⌊ ( q c − l o w e r ) / w i d t h ⌉ = ⌊ ( [ 0.37 , − 0.92 ] − [ - 0.92 , - 0.92 ] ) / 0.08 ⌉ = [ 15 , 0 ] lower = \\text{-}0.92 \\newline upper = 0.37 \\newline ~\\\\ \\begin{align*} width &= (upper - lower) / (2^4 - 1); \\newline &= (0.37 - \\text{-}0.92) / 15; \\newline &= 0.08 \\end{align*} \\newline ~\\\\ \\newline \\begin{align*} \\overline{q} &= \\lfloor{(q_c - lower) / width}\\rceil \\newline &= \\lfloor{([0.37, −0.92] - [\\text{-}0.92, \\text{-}0.92]) / 0.08}\\rceil \\newline &= [15, 0] \\end{align*} l o w er = - 0.92 u pp er = 0.37 w i d t h ​ = ( u pp er − l o w er ) / ( 2 4 − 1 ) ; = ( 0.37 − - 0.92 ) /15 ; = 0.08 ​ q ​ ​ = ⌊ ( q c ​ − l o w er ) / w i d t h ⌉ = ⌊ ([ 0.37 , − 0.92 ] − [ - 0.92 , - 0.92 ]) /0.08 ⌉ = [ 15 , 0 ] ​ Figure 4: Query with centroid transformation applied. As you can see because we have only 2 dimensions our quantized query vector now consists of two values at the ceiling and floor of the i n t 4 int4 in t 4 range. With longer vectors you would see a variety of int4 values with one of them being the ceiling and one of them being the floor. Now we are ready to perform a distance calculation comparing each indexed data vector with this query vector. We do this by summing up each dimension in our quantized query that's shared with any given data vector. Basically, a plain old dot-product, but with bits and bytes. q ‾ ⋅ r 1 = [ 15 , 0 ] ⋅ [ 1 , 0 ] = 15 \\begin{align*} \\overline{q} \\cdot r_1 &= [15, 0] \\cdot [1, 0] \\newline &= 15 \\end{align*} q ​ ⋅ r 1 ​ ​ = [ 15 , 0 ] ⋅ [ 1 , 0 ] = 15 ​ We can now apply corrective factors to unroll the quantization and get a more accurate reflection of the estimated distance. To achieve this we'll collect the upper and lower bound from the quantized query, which we derived when doing the scalar quantization of the query. Additionally we need the distance from the query to the centroid. Since we computed the distance between a vector and a centroid previously we'll just include that distance here for reference: ∥ q − c ∥ = 3.17 \\|q - c\\| = 3.17 ∥ q − c ∥ = 3.17 Estimated Distance Alright! We have quantized our vectors and collected corrective factors. Now we are ready to compute the estimated distance between v 1 v_1 v 1 ​ and q q q . Let's transform our Euclidean distance into an equation that has much more computationally friendly terms: d i s t ( v 1 , q ) = ∥ v 1 − q ∥ = ∥ ( v 1 − c ) − ( q − c ) ∥ 2 = ∥ v 1 − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 − c ∥ × ∥ q − c ∥ × ( q c ⋅ v c 1 ) \\begin{align*} dist(v_1, q) &= \\|v_1 - q\\| \\newline &= \\sqrt{\\|(v_1 - c) - (q - c)\\|^2} \\newline &= \\sqrt{\\|v_1 - c\\|^2 + \\|q - c\\|^2 - 2 \\times \\|v_1 - c\\| \\times \\|q - c\\| \\times (q_c \\cdot v_{c1})} \\end{align*} d i s t ( v 1 ​ , q ) ​ = ∥ v 1 ​ − q ∥ = ∥ ( v 1 ​ − c ) − ( q − c ) ∥ 2 ​ = ∥ v 1 ​ − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 ​ − c ∥ × ∥ q − c ∥ × ( q c ​ ⋅ v c 1 ​ ) ​ ​ In this form, most of these factors we derived previously, such as ∥ v 1 − c ∥ \\|v_1-c\\| ∥ v 1 ​ − c ∥ , and notably can be pre-computed prior to query or are not direct comparisons between the query vector and any given data vector such as v 1 v_1 v 1 ​ . We however still need to compute q c ⋅ v c 1 q_c \\cdot v_{c1} q c ​ ⋅ v c 1 ​ . We can utilize our corrective factors and our quantized binary distance metric q ‾ ⋅ r 1 \\overline{q} \\cdot r_1 q ​ ⋅ r 1 ​ to estimate this value reasonably and quickly. 
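Here is the same query-side preparation as a small NumPy sketch: shift the query by the centroid, normalize, scalar-quantize to int4, and take a plain integer dot product against a data vector's bits. Again, this is only an illustration of the steps above, not the production implementation.

```python
import numpy as np

q = np.array([0.68, -1.72], dtype=np.float32)
centroid = np.array([-0.49, 1.22], dtype=np.float32)

# Center the query on the centroid and normalize it.
q_c = (q - centroid) / np.linalg.norm(q - centroid)          # matches q_c ≈ [0.37, -0.92] above

# Scalar-quantize the query down to 4 bits (16 levels) rather than a single bit.
lower, upper = float(q_c.min()), float(q_c.max())
width = (upper - lower) / (2**4 - 1)
q_int4 = np.round((q_c - lower) / width).astype(np.int32)    # [15, 0]

# The data vector's bits from earlier; the raw distance proxy is just an integer dot product.
r1 = np.array([1, 0], dtype=np.int32)
bit_dot = int(q_int4 @ r1)                                   # 15, later corrected into a distance estimate
```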
Let's walk through that. q c ⋅ v c 1 ≈ ( q c ⋅ v ‾ 1 ) / ( v c 1 ⋅ v ‾ 1 ) q_c \\cdot v_{c1} \\approx (q_c \\cdot \\overline{v}_1) / (v_{c1} \\cdot \\overline{v}_1) q c ​ ⋅ v c 1 ​ ≈ ( q c ​ ⋅ v 1 ​ ) / ( v c 1 ​ ⋅ v 1 ​ ) Let's start by estimating q c ⋅ v ‾ 1 q_c \\cdot \\overline{v}_1 q c ​ ⋅ v 1 ​ which requires this equation which essentially unrolls our transformations using the representative points we defined earlier: q c ⋅ v ‾ 1 ≈ ( l o w e r + w i d t h ⋅ q ‾ ) ⋅ ( 1 d ( 2 r 1 − 1 ) ) q_c \\cdot \\overline{v}_1 \\approx (lower + width \\cdot \\overline{q}) \\cdot (\\frac{1}{\\sqrt{d}}(2r_1 - 1)) q c ​ ⋅ v 1 ​ ≈ ( l o w er + w i d t h ⋅ q ​ ) ⋅ ( d ​ 1 ​ ( 2 r 1 ​ − 1 )) Specifically, 1 d ( 2 r 1 − 1 ) \\frac{1}{\\sqrt{d}}(2r_1-1) d ​ 1 ​ ( 2 r 1 ​ − 1 ) maps the binary values back to our representative point and l o w e r + w i d t h ⋅ q ‾ lower + width \\cdot \\overline{q} l o w er + w i d t h ⋅ q ​ undoes the shift and scale used to compute the scalar quantized query components. We can rewrite this to make it more computationally friendly like this. First, though, let's define a couple of helper variables: total number of 1's bits in r 1 r_1 r 1 ​ as v ‾ b 1 \\overline{v}_{b1} v b 1 ​ , 1 1 1 in this case total number of all quantized values in q ‾ b \\overline{q}_b q ​ b ​ as q ‾ b \\overline{q}_b q ​ b ​ , 15 15 15 in this case q c ⋅ v ‾ 1 ≈ 2 × w i d t h d × ( q ‾ ⋅ r 1 ) + 2 × l o w e r d × v ‾ b 1 − w i d t h d × q ‾ b − d × l o w e r ≈ 2 × 0.08 2 × 15 + 2 × - 0.92 2 × 1 − 0.08 2 × 15 − 2 × - 0.92 ≈ 0.92 \\begin{align*} q_c \\cdot \\overline{v}_1 &\\approx \\frac{2 \\times width}{\\sqrt{d}} \\times (\\overline{q} \\cdot r_1) + \\frac{2 \\times lower}{\\sqrt{d}} \\times \\overline{v}_{b1} - \\frac{width}{\\sqrt{d}} \\times \\overline{q}_b - \\sqrt{d} \\times lower \\newline &\\approx \\frac{2 \\times 0.08}{\\sqrt{2}} \\times 15 + \\frac{2 \\times \\text{-}0.92}{\\sqrt{2}} \\times 1 - \\frac{0.08}{\\sqrt{2}} \\times 15 - \\sqrt{2} \\times \\text{-}0.92 \\newline &\\approx 0.92 \\end{align*} \\newline q c ​ ⋅ v 1 ​ ​ ≈ d ​ 2 × w i d t h ​ × ( q ​ ⋅ r 1 ​ ) + d ​ 2 × l o w er ​ × v b 1 ​ − d ​ w i d t h ​ × q ​ b ​ − d ​ × l o w er ≈ 2 ​ 2 × 0.08 ​ × 15 + 2 ​ 2 × - 0.92 ​ × 1 − 2 ​ 0.08 ​ × 15 − 2 ​ × - 0.92 ≈ 0.92 ​ With this value and v c 1 ⋅ v ‾ 1 v_{c1} \\cdot \\overline{v}_1 v c 1 ​ ⋅ v 1 ​ , which we precomputed when indexing our data vector, we can then plug those values in to compute an approximation of q c ⋅ v c 1 q_c \\cdot v_{c1} q c ​ ⋅ v c 1 ​ : q c ⋅ v c 1 ≈ ( q c ⋅ v ‾ 1 ) / ( v c 1 ⋅ v ‾ 1 ) ≈ 0.92 / 0.90 ≈ 1.01 \\begin{align*} q_c \\cdot v_{c1} &\\approx (q_c \\cdot \\overline{v}_1) / (v_{c1} \\cdot \\overline{v}_1) \\newline &\\approx 0.92 / 0.90 \\newline &\\approx 1.01 \\end{align*} q c ​ ⋅ v c 1 ​ ​ ≈ ( q c ​ ⋅ v 1 ​ ) / ( v c 1 ​ ⋅ v 1 ​ ) ≈ 0.92/0.90 ≈ 1.01 ​ Finally, let's plug this into our larger distance equation noting that we are using an estimate for q c ⋅ v c 1 q_c \\cdot v_{c1} q c ​ ⋅ v c 1 ​ : d i s t ( v 1 , q ) = ∥ v 1 − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 − c ∥ × ∥ q − c ∥ × ( q c ⋅ v c 1 ) e s t _ d i s t ( v 1 , q ) = 1.1 3 2 + 3.1 7 2 − 2 × 1.13 × 3.17 × 1.01 dist(v_1, q) = \\sqrt{\\|v_1-c\\|^2 + \\|q-c\\|^2 - 2 \\times \\|v_1-c\\| \\times \\|q-c\\| \\times (q_c \\cdot v_{c1})} \\newline est\\_dist(v_1, q) = \\sqrt{1.13^2 + 3.17^2 − 2 \\times 1.13 \\times 3.17 \\times 1.01} d i s t ( v 1 ​ , q ) = ∥ v 1 ​ − c ∥ 2 + ∥ q − c ∥ 2 − 2 × ∥ v 1 ​ − c ∥ × ∥ q − c ∥ × ( q c ​ ⋅ v c 1 ​ ) ​ es t _ d i s t ( v 1 ​ , q ) = 1.1 3 2 + 3.1 7 2 − 2 × 1.13 × 
3.17 × 1.01 ​ With all of the corrections applied we are left with a reasonable estimate of the distance between two vectors. For instance in this case our estimated distances between each of our original data vectors, v 1 v_1 v 1 ​ , v 2 v_2 v 2 ​ , v 3 v_3 v 3 ​ , and q q q compared to the true distances are: e s t _ d i s t ( v 1 , q ) = 2.02 e s t _ d i s t ( v 2 , q ) = 1.15 e s t _ d i s t ( v 3 , q ) = 6.15 est\\_dist(v_1, q) = 2.02 \\newline est\\_dist(v_2, q) = 1.15 \\newline est\\_dist(v_3, q) = 6.15 \\newline ~\\\\ es t _ d i s t ( v 1 ​ , q ) = 2.02 es t _ d i s t ( v 2 ​ , q ) = 1.15 es t _ d i s t ( v 3 ​ , q ) = 6.15 e u c l _ d i s t ( v 1 , q ) = 2.55 e u c l _ d i s t ( v 2 , q ) = 2.50 e u c l _ d i s t ( v 3 , q ) = 5.52 eucl\\_dist(v_1, q) = 2.55 \\newline eucl\\_dist(v_2, q) = 2.50 \\newline eucl\\_dist(v_3, q) = 5.52 e u c l _ d i s t ( v 1 ​ , q ) = 2.55 e u c l _ d i s t ( v 2 ​ , q ) = 2.50 e u c l _ d i s t ( v 3 ​ , q ) = 5.52 For details on how the linear algebra is derived or simplified when applied refer to the RaBitQ paper . Re-ranking As you can see from the results in the prior section, these estimated distances are indeed estimates. Binary quantization produces vectors whose distance calculations are, even with the extra corrective factors, only an approximation of the distance between vectors. In our experiments we were able to achieve high recall by involving a multi-stage process. This confirms the findings in the RaBitQ paper . Therefore to achieve high quality results, a reasonable sample of vectors returned from binary quantization then must be re-ranked with a more exact distance computation. In practice this subset of candidates can be small achieving typically high > 95% recall with 100 or less candidates for large datasets (>1m). With RaBitQ results are re-ranked continually as part of the search operation. In our experiments to achieve a more scalable binary quantization we decoupled the re-ranking step. While RaBitQ is able to maintain a better list of top N N N candidates by re-ranking while searching, it is at the cost of constantly paging in full f l o a t 32 float32 f l o a t 32 vectors, which is untenable for some larger production-like datasets. Conclusion Whew! You made it! This blog is indeed a big one. We are extremely excited about this new algorithm as it can alleviate many of the pain points of Product Quantization (e.g. code-book building cost, distance estimation slowness, etc.) and provide excellent recall and speed. Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. 
AL By: Andre Luiz Vector Database Lucene April 7, 2025 Speeding up merging of HNSW graphs Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs. TV MS By: Thomas Veasey and Mayya Sharipova Jump to Introduction Building the Bit Vectors Find a Representative Centroid 1 Bit, 1 Bit Only Please The Catch Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"RaBitQ binary quantization 101 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/rabitq-explainer-101","meta_description":"Understand the most critical components of RaBitQ binary quantization, how it works and its benefits, including the math behind quantization and examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Getting Started with the Elastic Chatbot RAG app using Vertex AI running on Google Kubernetes Engine Learn how to configure the Elastic Chatbot RAG app using Vertex AI and run it on Google Kubernetes Engine (GKE). Integrations How To JS By: Jonathan Simon On April 4, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. I’ve been having fun playing around with the Elastic Chatbot RAG example app . It’s open source and it’s a great way to get started with understanding Retrieval Augmented Generation (RAG) and trying your hand at running a RAG app. The app supports integration with a variety of GenAI Large Language Models (LLMs) like OpenAI, AWS Bedrock, Azure OpenAI, Google Vertex AI, Mistral AI, and Cohere. You can run the app on your local computer using Python or using Docker . You can also run the app in Kubernetes. I recently deployed the app to Google Kubernetes Engine (GKE) configured to use Google Vertex AI as the backing LLM. I was able to do it all just using a browser, Google Cloud, and Elastic Cloud. This blog post will walk you through the step-by-step process that I followed to configure the Chatbot RAG app to use Vertex AI and how to run it on GKE. Enable Vertex API Since this blog post is focused on running the Elastic Chatbot RAG app with Vertex AI as the backing LLM, the very first step is to go to Google Cloud and enable the Vertex AI API . If this is your first time using Vertex AI, you'll see an Enable all recommended APIs button. Click that button to enable the necessary Google Cloud APIs to use Vertex AI. Once you've done that, you should see that the Vertex AI API is now enabled. 
Use Google Cloud Shell Editor to clone the Chatbot RAG app Now that you’ve got the Vertex AI API enabled, the next step is to clone the code for the Chatbot RAG app. Google Cloud has the perfect tool for doing this right in your browser: Cloud Shell Editor. 1. Open Google Cloud Shell Editor . 2. Open your terminal in Cloud Shell Editor. Click the Terminal menu and select New Terminal. 3. Clone the Chatbot RAG app by running the following command in the terminal. 4. Change directory to the Chatbot RAG app’s directory using the following command. Use Google Cloud Shell Editor to create an app configuration file The app needs to access Elastic Cloud and Vertex AI and it does so using configuration values that are stored in a configuration file. The configuration file for the app should have the filename .env and you will create it now. The example app includes an example configuration file named env.example that you can copy to create a new file. 1. Create a .env file that will contain the app’s configuration values using the following command: 2. Click the View menu and select Toggle Hidden Files . Files like .env are hidden in Cloud Shell Editor by default. 3. Open the .env file for editing. Find the line that sets the ELASTICSEARCH_URL value. That’s where you’ll make your first edit. Elastic Cloud - Create Deployment The Chatbot RAG app needs an Elasticsearch backend that will power the retrieval augmentation part of the RAG app. So the next step is to create an Elastic Cloud deployment with Elasticsearch and ML enabled. Once the deployment is ready, copy the Elasticsearch Endpoint URL to add it to the app’s .env configuration file. Create an Elastic Cloud deployment. Copy the Elasticsearch Endpoint URL. Use Google Cloud Shell Editor to update the .env configuration file with Elasticsearch URL Add Elasticsearch Endpoint URL to the .env file. Comment out unused configuration lines. Uncomment the line where the ELASTICSEARCH_API_KEY is set. Elastic Cloud - Create API Key and add its value to .env configuration file Jumping back into the Elastic Cloud deployment, click the Create API Key button to create a new API Key that will be used by the app to access Elasticsearch running in your deployment. Paste the copied API Key into your .env configuration file using Google Cloud Shell Editor. Create an Elastic Cloud API Key. Copy the Key’s encoded value and add it to the app’s .env configuration file in Google Cloud Shell Editor. Use Google Cloud Shell Editor to update the .env configuration file to use Vertex AI Moving down in the .env configuration file, find the lines to configure a connection to Vertex AI and uncomment them. The first custom values that you'll need to set are GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_REGION . Set GOOGLE_CLOUD_PROJECT to your Google Cloud Project ID, which you can find right on the welcome page for your Google Cloud project . Set GOOGLE_CLOUD_REGION to one of the available regions supported by Vertex AI that you’d like to use. For this blog post, we used us-central1 . Uncomment the Vertex AI lines in the .env configuration file. Set GOOGLE_CLOUD_PROJECT to your Google Cloud Project ID in the .env configuration file. Set GOOGLE_CLOUD_REGION in the .env configuration file to one of the available regions supported by Vertex AI . Save the changes to the .env configuration file. Google Cloud IAM - Create Service Account and download its Key file Now it’s time to set up the app’s access to Vertex AI and GKE. 
You can do this by creating a Google IAM Service Account and assigning it the Roles to grant the necessary permissions. 1. Create a Service Account with the IAM Roles necessary to access Vertex AI and GKE. Add the following IAM Roles: Vertex AI Custom Code Service Agent Kubernetes Engine Default Node Service Account 2. Create a Service Account Key and download it to your local computer. Google Kubernetes Engine - Create cluster Google Kubernetes Engine (GKE) is where you’re going to deploy and run the Chatbot RAG app. GKE is the gold standard of managed Kubernetes, providing a super scalable infrastructure for running your applications. While creating a new GKE cluster, in the “create cluster” dialog, edit the Advanced settings > Security setting to use the Service Account you created in the previous step. Create a new GKE cluster. Use the Service Account created previously, within Advanced settings > Security , when creating the cluster. Google Cloud Shell Editor - Upload Google Service Account Key file Back in Google Cloud Shell Editor you can now complete the configuration of the app's settings by adding the Google Service Account key you previously downloaded to your local computer. Click the Cloud Shell Editor’s More button to upload and add the Google Cloud Service Account key file to the app. Upload the Google Cloud Service Account key file using Cloud Shell Editor. Save the file to the top level directory of the Chatbot RAG app. Google Cloud Shell Editor - Deploy app to Google Kubernetes Engine Everything for the app’s configuration is in place, so you can now deploy the app to GKE. Connect the Cloud Shell terminal to your GKE cluster using the gcloud command line tool. Once you’re connected to the cluster, you can use the kubectl command line tool to add the configuration values from your .env configuration file to your cluster. Next, use kubectl to add the Google Cloud Service Account key file to your cluster. Then, use kubectl to deploy the app to GKE. 1. Connect the Cloud Shell terminal to your GKE cluster using gcloud. Replace example-project in the command with your Google Cloud Project ID found on the welcome page for your Google Cloud project . 2. Add your .env configuration file values to your cluster using kubectl. 3. Add the Google Cloud Service Account key file to your cluster using kubectl . 4. Deploy the app to your cluster using kubectl . This command will create a new Elasticsearch index in Elastic Cloud with sample data, initialize the frontend and backend of the app with the values that you provided in the .env file and then deploy the app to the GKE cluster. It will take a few minutes for the app to be deployed. You can use the GKE cluster’s details page to watch its status. Google Kubernetes Engine - Expose deployed app The final required step is to expose the app in GKE so it's viewable on the Internet and in your browser. You can do this in Google Cloud’s GKE Workloads , which is where your deployed app will appear as chatbot-rag-app in the list of running GKE workloads. Select your workload by clicking on its workload Name link. In the details page of your app’s workload, use the Actions menu to select the Expose action. In the Expose dialog, set the Target port 1 to 4000 which is the port that the Chatbot RAG app is configured to run on in the k8s-manifest.yml file that was used for its deployment to GKE. Select the chatbot-rag-app in GKE Workloads. Use the Expose action from the Actions menu to expose the app. 
In the Expose dialog, set the Target port 1 to 4000 . Try out the app After clicking the Expose button for the workload, you’ll be taken to the workload’s Service Details page in GKE. Once the exposed app is ready, you'll see External Endpoints displayed along with a linked IP address. Click the IP address to try out the Chatbot RAG app. Elastic Cloud is your starting point for GenAI RAG apps Thanks for reading. Check out a guided tour of all the steps included in this blog post. Get started with building GenAI RAG apps today and give Elastic Cloud a try. Read to explore more? Try a hands-on tutorial where you can build a RAG app in a sandbox environment. To learn more about using RAG for real world applications, see our recent blog post series GenAI for customer support . Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Enable Vertex API Use Google Cloud Shell Editor to clone the Chatbot RAG app Use Google Cloud Shell Editor to create an app configuration file Elastic Cloud - Create Deployment Use Google Cloud Shell Editor to update the Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Getting Started with the Elastic Chatbot RAG app using Vertex AI running on Google Kubernetes Engine - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-rag-chatbot-vertex-ai-gke","meta_description":"Learn how to configure the Elastic Chatbot RAG app using Vertex AI and run it on Google Kubernetes Engine (GKE)."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Building Elastic Cloud Serverless Explore the architecture of Elastic Cloud Serverless and key design and scalability decisions we made along the way of building it. Elastic Cloud Serverless JT By: Jason Tedor On May 15, 2024 Learn more about Elastic Cloud Serverless , and start a 14-day free trial to test it out yourself. This blog explores the architectural decisions we made along the journey of building Elastic Cloud Serverless, including key design and scalability decisions. Architecture of Elastic Cloud Serverless In October 2022 we introduced the Stateless architecture of Elasticsearch . Our primary goal with that initiative was to evolve Elasticsearch to take advantage of operational, performance, and cost efficiencies offered by cloud-native services. That initiative became part of a larger endeavor that we recently announced called the Search AI Lake , and serves as the foundation for our new Elastic Cloud Serverless offering. In this endeavor, we aimed not only to make Elastic Stack products such as Elasticsearch and Kibana more cloud-native, but also their orchestration too. We designed and built a new backend platform powered by Kubernetes for orchestrating Elastic Cloud Serverless projects, and evolved the Elastic Stack products to be easier for us to orchestrate in Kubernetes. In this article, we'd like to detail a few of the architectural decisions we made along the way. In future articles, we will dive deeper into some of these aspects. One of the main reasons that we settled on Kubernetes to power the backend is due to the wealth of resources in Kubernetes for solving container lifecycle management, scaling, resiliency, and resource management issues. We established an early principle of \"doing things the native Kubernetes way\", even if that meant non-trivial evolutions of Elastic Stack products. We have built a variety of Kubernetes-native services for managing, observing, and orchestrating physical instances of Elastic Stack products such as Elasticsearch and Kibana. This includes custom controllers/operators for provisioning, managing software updates, and autoscaling; and services for authentication, managing operational backups, and metering. For a little bit of background, our backend architecture has two high-level components. Control Plane: This is the user-facing management layer. We provide UIs and APIs for users to manage their Elastic Cloud Serverless projects. This is where users can create new projects, control who has access to their projects, and get an overview of their projects. Data Plane: This is the infrastructure layer that powers the Elastic Cloud Serverless projects, and the layer that users interact with when they want to use their projects. The Control Plane is a global component, and the Data Plane consists of multiple \"regional components\". These are individual Kubernetes clusters in individual Cloud Service Provider (CSP) regions. Key design decisions in building Elastic Cloud Serverless Scale Kubernetes horizontally Our Data Plane will be deployed across AWS, Azure, and Google Cloud. 
Within each major CSP, we will operate in several CSP regions. Rather than vertically scaling up massive Kubernetes clusters, we have designed for horizontally scaling independent Kubernetes clusters using a cell-based architecture . Within each CSP region, we will be running many Kubernetes clusters. This design choice enables us to avoid Kubernetes scaling limits, and also serves as smaller fault domains in case a Kubernetes cluster fails. Push vs. pull One interesting debate we had was \"push vs. pull\". In particular, how should the global Control Plane communicate with individual Kubernetes clusters in the Data Plane? For example, when a new Elastic Cloud Serverless project is created and needs to be scheduled in a Kubernetes cluster in the Data Plane, should the global Control Plane push the configuration of that project down to a selected Kubernetes cluster, or should a Kubernetes cluster in the Data Plane watch for and pull that configuration from the global Control Plane? As always, there are tradeoffs in both approaches. We settled on the push model because: The scheduling logic is simpler as the global Control Plane solely chooses an appropriate Kubernetes cluster Dataflow will be uni-directional vs. dataflow must be bi-directional in the pull model Kubernetes clusters in the Data Plane can operate independently from the global Control Plane services Simplified operations and handling of failure scenarios there is no need to manage two Data Plane clusters in the same region competing for scheduling or rescheduling an application if the global Control Plane fails, we can manually interact with the Kubernetes API server in the target cluster to simulate the blessed path in the push model; however, simulating the blessed path in the pull model is not easily achievable when the global Control Plane being watched is unavailable Managed Kubernetes infrastructure Within each major CSP, we have elected to use their managed Kubernetes offerings (AWS EKS, Azure AKS, and Google GKE). This was an early decision for us to reduce the management burden of the clusters themselves. While the managed Kubernetes offerings meet that goal, they are otherwise barebones. We wanted to be more opinionated about the Kubernetes clusters that our engineers build on, reducing the burden on our engineering teams, and providing certain things for free. The (non-exhaustive) types of things that we provide out-of-the-box to our engineering teams are: guarantees around the configuration of and the services available on the clusters a managed network substrate a secure baseline, and compliance guarantees around internal policies and security standards managed observability—the logs and metrics of every component are automatically collected and shipped to centralized observability infrastructure, which is also based on the Elastic Stack capacity management Internally we call this wrapped infrastructure \"Managed Kubernetes Infrastructure\". It is a foundational building block for us, and enables our engineering teams to focus on building and operating the services that they create. Clusters are disposable An important architectural principle we took here is that our Kubernetes clusters are considered disposable. They are not the source of truth for any important data so we will never experience data loss on a Kubernetes disaster and they can be recreated at any time. This level of resiliency is important to safeguard our customer's data. 
This architectural principle will simplify the operability of our platform at our scale. Key scalability decisions in building Elastic Cloud Serverless Object store API calls As the saying goes, that's how they get you. We previously outlined the stateless architecture of Elasticsearch where we are using object stores (AWS S3, Azure Blob Storage, Google Cloud Storage) as a primary data store. At a high-level, the two primary cost dimensions when using a major CSP object store are storage, and API calls. The storage dimension is fairly obvious and easy to estimate. But left unchecked, the cost of object store API calls can quickly explode. With the object store serving as the primary data store, and the per-shard data structures such as the translog, this meant that every write to Elasticsearch would go to the object store, and therefore every write to a shard would incur at least one object store API call. For an Elasticsearch node holding many shards frequently receiving writes, the costs would add up very quickly. To address this, we evolved the translog writes to be performed per-node, where we coalesce the writes across per-shard translogs on a node, and flush them to the object store every 200ms. A related aspect is refreshes . In Elasticsearch, refreshes translate to writes to its backing data store, and in the stateless architecture, this means writes to the object store and therefore object store API calls. As some use cases expect a high refresh rate, for example, every second, these object store API calls would amplify quickly when an Elasticsearch node is receiving writes across many shards. This means we have to trade off between suboptimal UX and high costs. What is more, these refresh object store API calls are independent of the amount of data ingested in that one second period which means they're difficult to tie to perceived user value. We considered several ways to address this: an intermediate data store that doesn't have per-operation costs, that would sit between Elasticsearch and the object store decoupling refreshes from writes to the object store compounding into a single object refreshes across all shards on node We ultimately settled on decoupling refreshes from writes to the object store. Instead of a refresh triggering a write to the object store that the search nodes would read so they had access to the recently performed operations, the primary shard will push the refreshed data (segments) directly to the search nodes, and defer writing to the object store until a later time. There's no risk of data loss with this deferment because we still persist operations to the translog in the object store. While this deferment does increase recovery times, it comes with a two order of magnitude reduction in the number of refresh-triggered object store API calls. Autoscaling One major UX goal we had with Elastic Cloud Serverless was to remove the need for users to manage the size/capacity of their projects. While this level of control is a powerful knob for some users, we envisioned a simpler experience where Elastic Cloud Serverless would automatically respond to the demand of increased ingestion rates or querying over larger amounts of data. With the separation of storage and compute in the stateless Elasticsearch architecture, this is a much easier problem to solve than before as we can now manage the indexing and search resources independently. 
One early problem that we encountered was the need to have an autoscaler that can support both vertical and horizontal autoscaling, so that as more demand is placed on a project, we can both scale up to larger nodes, and scale out to more nodes. Additionally, we ran into scalability issues with the Kubernetes Horizontal Pod Autoscaler. To address this, we have built custom autoscaling controllers . These custom controllers obtain application-level metrics (specific to the workload being scaled, e.g., indexing vs. search), make autoscaling decisions, and push these decisions to the resource definitions in Kubernetes. These decisions are then acted upon to actually scale the application to the desired resource level. With this framework in place, we can independently add more tailored metrics (e.g., search query load metrics) and therefore intelligence to the autoscaling decisions. This will enable Elastic Cloud Serverless projects to iteratively respond more dynamically to user workloads over time. Conclusion These are only a few of the interesting architectural decisions we made along the journey of building Elastic Cloud Serverless. We believe this new platform gives us a foundation to rapidly deliver more functionality to our users over time, while being easier to operate, performant, scalable, and cost efficient. Stay tuned to several future articles where we will dive deeper into some of the above concepts. Report an issue Related content Elastic Cloud Serverless Agent March 4, 2025 The AI Agent to manage Elasticsearch Serverless projects A natural language-powered AI Agent that effortlessly manages Elasticsearch Serverless projects—enabling project creation, deletion, and status checks. FS By: Fram Souza Elastic Cloud Serverless December 10, 2024 Autosharding of data streams in Elasticsearch Serverless In Elastic Cloud Serverless we spare our users from the need to fiddle with sharding by automatically configuring the optimal number of shards for data streams based on the indexing load. AD By: Andrei Dan Elastic Cloud Serverless December 2, 2024 Elasticsearch Serverless is now generally available Elasticsearch Serverless, built on a new stateless architecture, is generally available. It’s fully managed so you can get projects started quickly without operations or upgrades, and you can access the latest vector search and generative AI capabilities. YL By: Yaru Lin Elastic Cloud Serverless December 2, 2024 Elastic Cloud Serverless: A deep dive into autoscaling and performance stress testing at scale Dive into how Elasticsearch Cloud Serverless dynamically scales to handle massive data volumes and complex queries. We explore its performance under real-world conditions and the results from extensive stress testing. DB JB GE +1 By: David Brimley , Jason Bryan , Gareth Ellis and 1more Vector Database Generative AI +3 October 4, 2024 Using Eland on Elasticsearch Serverless Learn how to use Eland on Elasticsearch Serverless. QP By: Quentin Pradet Jump to Architecture of Elastic Cloud Serverless Key design decisions in building Elastic Cloud Serverless Scale Kubernetes horizontally Push vs. pull Managed Kubernetes infrastructure Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. 
Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Building Elastic Cloud Serverless - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/building-elastic-cloud-serverless","meta_description":"Explore the architecture of Elastic Cloud Serverless and key design and scalability decisions we made along the way of building it."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elastic Jira connector tutorial part II: Optimization tips After connecting Jira to Elasticsearch, we'll now review best practices to escalate this deployment. Integrations Ingestion How To GL By: Gustavo Llermaly On January 16, 2025 Part of Series Jira connector tutorials Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. In part I of this series , we configured the Elastic Jira connector and indexed objects into Elasticsearch. In this second part, we'll review some best practices and advanced configurations to escalate the connector. These practices complement the current documentation and are to be used during the indexing phase. Having a connector running was just the first step. When you want to index large amounts of data, every detail counts and there are many optimization points you can use when you index documents from Jira. Jira connector optimization points Index only the documents you'll need by applying advanced sync filters Index only the fields you'll use Refine mappings based on your needs Automate Document Level security Offload attachment extraction Monitor the connector's logs 1. Index only the documents you'll need by applying advanced sync filters By default, Jira sends all projects, issues, and attachments. If you're only interested in some of these or, for example, just issues \"In Progress\", we recommend not to index everything. There are three instances to filter documents before we put them in Elasticsearch: Remote : We can use a native Jira filter to get only what we need. This is the best option and you should try to use this option any time you can since with this, the documents don't even come out of the source before getting into Elasticsearch. We'll use advanced sync rules for this. Integration: If the source does not have a native filter to provide what we need, we can still filter at an integration level before ingesting into Elasticearch by using basic sync rules . Ingest Pipelines: The last option to handle data before indexing it is using Elasticsearch ingest pipelines . By using Painless scripts, we get great flexibility to filter or manipulate documents. The downside to this is that the data has already left the source and been through the connector, thus potentially putting a heavy load on the system and creating security issues. 
Let's do a quick review of the Jira issues: Note: We use \"exists\" query to only return the documents with the field we are filtering. You can see there are many issues in \"To Do\" that we don't need: To only get the issues \"In Progress\", we'll create an advanced sync rule using a JQL query ( Jira query language ): Go to the connector and click on the sync rules tab and then on Draft Rules . Once inside, go to Advanced Sync Rules and add this: Once the rule has been applied, run a Full Content Sync . This rule will exclude all issues that are not \"In Progress\". You can check by running the query again: Here's the new response: 2. Index only the fields you'll use Now that we have only the documents we want, you can see that we're still getting a lot of fields that we don't need. We can hide them when we run the query by using _source , but the best option is to simply not index them. To do so, we'll use the ingest pipelines . We can create a pipeline that drops all the fields we won't use. Let's say we only want this info from an issue: Assignee Title Status We can create a new ingest pipeline that only gets those fields by using the ingest pipelines' Content UI : Click on Copy and customize and then modify the pipeline called index-name@custom that should have just been created and empty. We can do it using Kibana DevTools console , running this command: Let's remove the fields that we don't need and also move the ones that we need to the root of the document. The remove processor with the keep parameter, will delete all the fields but the ones within the keep array from the document. We can check this is working by running a simulation. Add the content of one of the documents from the index: The response will be: This looks much better! Now, let's run a full content sync to apply the changes. 3. Refine mappings based on your needs The document is clean. However, we can optimize things more. We can go into “it depends” territory. Some mappings can work for your use case while others will not. The best way to find out is by experimenting. Let's say we tested and got to this mappings design: assignee : full text search and filters summary : full text search status : filters and sorting By default, the connector will create mappings using dynamic_templates that will configure all text fields for full-text search, filtering and sorting, which is a solid baseline but it can be optimized if we know what we want to do with our fields. This is the rule: Let's create different subfields for different purposes for all text fields. You can find additional information about the analyzers in the documentation . To use these mappings you must: Create the index before you create the connector When you create the connector, select that index instead of creating a new one Create the ingest pipeline to get the fields you want Run a Full Content Sync* *A Full Content Sync will send all documents to Elasticsearch. Incremental Sync will only send to Elasticsearch documents that changed after the last Incremental, or Full Content Sync. Both methods will fetch all the data from the data source. Our optimized mappings are below: For assignee, we kept the mappings as they are because we want this field to be optimized for both search and filters. For summary, we removed the “enum” keyword field because we don’t plan to filter on summaries. We mapped status as a keyword because we only plan to filter on that field. Note: If you're not sure how you will use your fields, the baselines analyzers should be fine. 
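The ingest-pipeline step above (the index-name@custom pipeline using a remove processor with its keep parameter, then a simulation) has its console commands stripped from the scraped text. A minimal sketch of the same idea with the Python client follows; the pipeline id and field names below are illustrative stand-ins, not the connector's exact schema.

```python
# Hedged sketch of the "keep only the fields you'll use" pipeline described above.
# The @custom pipeline naming and the remove processor's keep parameter come from
# the blog; the id and field names here are illustrative.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

es.ingest.put_pipeline(
    id="search-jira@custom",  # replace with <your-connector-index-name>@custom
    description="Keep only assignee, summary and status from Jira issues",
    processors=[
        {
            "remove": {
                # Every field except these is dropped before indexing.
                "keep": ["assignee", "summary", "status"]
            }
        }
    ],
)

# Simulate against a sample document before running a Full Content Sync.
resp = es.ingest.simulate(
    id="search-jira@custom",
    docs=[
        {
            "_source": {
                "assignee": "gustavo",
                "summary": "Fix login",
                "status": "In Progress",
                "noise": "drop me",
            }
        }
    ],
)
print(resp["docs"][0]["doc"]["_source"])  # -> only assignee, summary, status remain
```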
4. Automate Document Level security In the first section, we learned to manually create API keys for a user and limit access based on it using Document Level Security (DLS) . However, if you want to automatically create an API Key with permissions every time a user visits our site, you need to create a script that takes the request, generates an API Key using the user ID and then uses it to search in Elasticsearch. Here's a reference file in Python: You can call this create_api_key function on each API request to generate an API Key the user can use to query Elasticsearch in the subsequent requests. You can set expiration, and also arbitrary metadata in case you want to register some info about the user or the API that generated the key. 5. Offload attachment extraction For content extraction, like extracting text from PDF and Powerpoint files, Elastic provides an out of the box service that works fine but has a size limitation. By default, the extraction service of the native connectors supports 10MB max per attachment. If you have bigger attachments like a PDF with big images inside or you want to host the extraction service, Elastic offers a tool that lets you deploy your own extraction service. This option is only compatible with Connector Clients, so if you're using a Native connector you will need to convert it to a connector client and host it in your own infrastructure. Follow these steps to do it: a. Configure custom extraction service and run it with Docker EXTRACTION_SERVICE_VERSION you should use 0.3.x for Elasticsearch 8.15 b. Configure yaml con extraction service custom and run Go to the connector client and add the following to the config.yml file to use the extraction service: c. Follow steps to run connector client After configuring you can run the connector client with the connector you want to use. You can refer to the full process in the docs . 6. Monitor Connector's logs It's important to have visibility of the connector's logs in case there's an issue and Elastic offers this out of the box. The first step is to activate logging in the cluster. The recommendation is to send logs to an additional cluster (Monitoring deployment), but in a development environment, you can send the logs to the same cluster where you're indexing documents too. By default, the connector will send the logs to the elastic-cloud-logs-8 index. If you're using Cloud, you can check the logs in the new Logs Explorer : Conclusion In this article, we learned different strategies to consider when we take the next step in using a connector in a production environment. Optimizing resources, automating security, and cluster monitoring are key mechanisms to properly run a large-scale system. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. 
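The Document Level Security section above mentions a Python reference file with a create_api_key function, which is not reproduced in the scraped text. The sketch below is an illustrative stand-in, not that file: it assumes documents indexed by the connector carry the _allow_access_control field used for DLS, and the search-jira index name is a placeholder.

```python
# Illustrative stand-in for the create_api_key reference file mentioned above.
# Assumption: connector documents carry an _allow_access_control array of user
# identities; index name and key naming are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<admin-api-key>")

def create_api_key(user_id: str, index: str = "search-jira") -> str:
    """Create a short-lived, per-user API key limited to documents the user may read."""
    resp = es.security.create_api_key(
        name=f"jira-search-{user_id}",
        expiration="1h",
        role_descriptors={
            "jira-dls": {
                "indices": [
                    {
                        "names": [index],
                        "privileges": ["read"],
                        # Document-level restriction: only docs whose ACL lists this user.
                        "query": {"terms": {"_allow_access_control": [user_id]}},
                    }
                ]
            }
        },
        metadata={"user_id": user_id},
    )
    return resp["encoded"]

# Called once per incoming request; the encoded key is used for that user's searches.
print(create_api_key("gustavo@example.com"))
```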
TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Jira connector optimization points 1. Index only the documents you'll need by applying advanced sync filters 2. Index only the fields you'll use 3. Refine mappings based on your needs 4. Automate Document Level security Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elastic Jira connector tutorial part II: Optimization tips - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elastic-jira-connector-optimization","meta_description":"Discover best practices and advanced configuration tips to escalate the Elastic Jira connector."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Simplifying data lifecycle management for data streams Elasticsearch data streams can now be managed by a data stream property called lifecycle. Learn to set up and update a data stream lifecycle here. How To AD By: Andrei Dan On June 13, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Today, we’ll explore Elasticsearch’s new data management system for data streams: data stream lifecycle , available from version 8.14. With its straightforward and robust execution model, the data stream lifecycle lets you concentrate on the business-related aspects of your data's lifecycle, such as downsampling and retention. Behind the scenes, it automatically ensures that the Elasticsearch structures storing your data are efficiently managed. This blog explains the evolution of data lifecycle management in Elasticsearch, how to configure a data stream lifecycle, update the configured lifecycle, and migrate from ILM to the data stream lifecycle. Data lifecycle management evolution in Elasticsearch Since the 6.x Elasticsearch series, Index Lifecycle Management (ILM) has empowered users to maintain healthy indices and save costs by automatically migrating data between tiers . ILM takes care of indices based on their unique performance, resilience, and retention needs, whilst offering significant control over cost and defining an index's lifecycle in great detail. 
ILM is a very general solution that caters to a broad range of use cases, from time series indices and data streams to indices that store text content. The lifecycle definitions will be very different for all these use cases, and it gets even more divergent when we factor each individual deployment’s available hardware and data tiering resources. For this reason, ILM allows fully customisable lifecycle definitions, at the cost of complexity (precise rollover definitions; when to force merge, shrink, and (partially) mount indices). As we started working on our Serverless solution we got a chance to look at the lifecycle management through a new lens where our users could (and will) be shielded from Elasticsearch internal concepts like shards, allocations, or cluster topology. Even more, in Serverless we want to be able to change the internal Elasticsearch configuration as much as needed to maintain the best experience for our users. In this new context, we looked at the existing ILM solution which offers the users the internal Elasticsearch concepts as building blocks and decided we need a new solution to manage the lifecycle of data. We took the lessons learned from building and maintaining ILM at scale and created a simpler lifecycle management system for the future. This system is more specific and only applies to data streams . It's configured as a property directly on the data stream (similar to how an index setting belongs to an index), and we call it data stream lifecycle . It’s a built-in mechanism (continuing with the index settings analogy) that is always on and always reactive to the lifecycle needs of a data stream. By scoping the applicability to only data streams (i.e. data with a timestamp that’s rarely updated in place) we were able to eschew customizations in favor of ease-of-use and automatic defaults. Data stream lifecycles will automatically execute the data structure maintenance operations like rollover and force merge, and allow you to only deal with the business-related lifecycle functionality you should care about, like downsampling and data retention . A data stream lifecycle is not as feature-rich as ILM; most notably it doesn’t currently support data tiering , shrinking, or searchable snapshots . However, the use cases that do not need these particular features will be better served by data stream lifecycles. Though data stream lifecycles were originally designed for the needs of the Serverless environment, they are also available in regular on-premise and ESS Elasticsearch deployments. Configuring data stream lifecycle Let’s create an Elasticsearch Serverless project and get started with creating a data stream managed by data stream lifecycle. Once the project is created, go to Index Management and create an index template for the my-data-* index pattern and configure a retention of 30 days: Let’s navigate through the steps and finalize this index template (I’ve configured one text field in the mapping section, but that’s optional): We’ll now ingest some data that’ll target the my-data-stream namespace. I’ll use the Dev Tools section on the left hand side, but you can your preferred way of ingesting data : my-data-stream has now been created and it contains 2 documents. Let’s go to Index Management/Data Streams and check it out: And that’s it! 🎉 Our data stream is managed by data steam lifecycle, and retention for the data is configured to 30 days. 
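The walkthrough above performs these steps in the Kibana UI. As a hedged API-level sketch of the same setup with the Python client: an index template for the my-data-* pattern carrying a 30-day data stream lifecycle retention, then one document indexed into my-data-stream to create the managed data stream (the template name and mapping are illustrative).

```python
# Hedged sketch of the UI steps above: template with 30d retention, then a
# first document that creates the data stream. Template name and mapping are
# illustrative; the pattern and stream name come from the blog text.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

es.indices.put_index_template(
    name="my-data-template",
    index_patterns=["my-data-*"],
    data_stream={},  # matching names are created as data streams
    template={
        "mappings": {"properties": {"message": {"type": "text"}}},
        "lifecycle": {"data_retention": "30d"},
    },
)

# Indexing a time-stamped document creates my-data-stream with the lifecycle attached.
es.index(index="my-data-stream", document={"@timestamp": "2024-05-09T10:00:00Z", "message": "hello"})
print(es.indices.get_data_lifecycle(name="my-data-stream"))
```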
All new data streams that match the my-data-* pattern will be managed by data stream and receive a 30 days data retention. Updating the configured lifecycle The data stream lifecycle property belongs to the data stream. So updating the lifecycle for existing data streams is something we configured by navigating to the data stream directly. Let’s go to Index Management/Data Streams and edit the retention for my-data-stream to 7 days: We now see our data stream has a data retention of 7 days: Now that the existing data stream in the system has the desired 7 days retention period configured, let’s also update the index template retention so that new data streams that get created also receive the 7 days retention period: Implementation details The master node periodically (every 5 minutes by default, according to the data_streams.lifecycle.poll_interval setting) iterates over the data streams in the system that are configured to be managed by the lifecycle. On every iteration, each backing index state in the system is evaluated and one operation is executed towards achieving the target state described by the configured lifecycle. For each managed data stream we first attempt to rollover the data stream according to the cluster.lifecycle.default.rollover conditions. This is the only operation attempted for the write index of a data stream. After rolling over, the former write index becomes eligible for merging. As we wanted the merging of the shards maintenance task to be something we execute automatically we implemented a lighter merging operation, an alternative to force merging to 1 segment, that only merges the long tail of small segments instead of the entire shard. The main benefit of this approach is that it can be applied automatically and early after rollover. Once a backing index has been merged, on the next lifecycle execution run, the index will be downsampled. After completing all the scheduled downsample rounds, each time the lifecycle runs, the backing index will be examined for eligibility for data retention. When the specified data retention period lapses (since rollover time), the backing index will be deleted. Both downsampling and data retention are time based operations (e.g. data_retention: 7d ) and are calculated since the index was rolled over. The time since an index has been rolled over is visible in the explain lifecycle API and we call it generation_time and represents the time since a backing index became a generational index (as opposed to being the write index of a data stream). I’ve run the explain lifecycle API for my-data-stream (which has 2 backing indices as it was rolled over) to get some insights into We can see the lifecycle definition for both indices includes the updated data retention of 7 days. The older index, .ds-my-data-stream-2024.05.09-000001, is not the write index of the data stream anymore and we can see the explain API reports the generation_time as 49 minutes. Once the generation time reaches 7 days, the .ds-my-data-stream-2024.05.09-000001 backing index will be deleted to conform with the configured data retention. Index .ds-my-data-stream-2024.05.09-000002 is the write index of the data stream and is waiting to be rolled over once it meets the rollover criteria . The time_since_index_creation field is meant to help calculating when to rollover the data stream according to an automatic max_age criteria when the data stream is not receiving a lot of data anymore. 
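As a companion to the retention update and the explain lifecycle check described above, here is a hedged sketch of the same operations via the Python client rather than the UI; generation_time is the field the entry discusses, and it is only reported for backing indices that have already rolled over.

```python
# Hedged sketch: shrink retention to 7 days, then inspect each backing index
# with the explain lifecycle API discussed above.
from elasticsearch import Elasticsearch

es = Elasticsearch("https://localhost:9200", api_key="<api-key>")

# Update retention on the existing data stream from 30 to 7 days.
es.indices.put_data_lifecycle(name="my-data-stream", data_retention="7d")

# generation_time starts counting once an index is rolled over and is no longer
# the write index; the current write index reports no generation_time.
backing = es.indices.get_data_stream(name="my-data-stream")["data_streams"][0]["indices"]
for idx in backing:
    explain = es.indices.explain_data_lifecycle(index=idx["index_name"])
    info = explain["indices"][idx["index_name"]]
    print(idx["index_name"], info.get("generation_time", "write index"))
```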
Migrating from ILM to data stream lifecycle Facilitating a smooth transition to data stream lifecycle for testing, experimenting, and eventually production migration of data streams was always a goal for this feature. For this reason, we decided to allow ILM and data stream lifecycle to co-exist on a data stream in cloud environments and on premise deployments. The ILM configuration continues to exist directly on the backing indices whilst the data stream lifecycle is configured on the data stream itself. A backing index is managed by only one management system at a time. If both ILM and data stream lifecycle are applicable for a backing index, ILM takes precedence (by default, but the precedence can be changed to data stream lifecycle using the index.lifecycle.prefer_ilm index setting). The migration path for a data stream will allow the existing ILM-managed backing indices to age out and eventually get deleted by ILM, whilst the new backing indices will start being managed by data stream lifecycle. We’ve enhanced the GET _data_stream API to include rollover information for each backing index (a managed_by field with Index Lifecycle Management , Data stream lifecycle , or Unmanaged as possible values, and the value of the prefer_ilm setting) and at the data stream level a next_generation_managed_by field to indicate the system that’ll manage the next generation backing index. To configure the future backing indices (created after data stream rollover) to be managed by data stream lifecycle two steps need to be executed: Update the index template that’s backing the data stream to set prefer_ilm to false (note that prefer_ilm is an index setting so configuring it in the index template means it’ll only be configured on the new backing indices) and configure the desired data stream lifecycle (this will make sure the new data streams will start being managed by data stream lifecycle). Configure the data stream lifecycle for the existing data streams using the lifecycle API . For a complete tutorial on migrating to data stream lifecycle check out our documentation . Conclusion We’ve built a lifecycle functionality for data streams that handles the underlying data structures maintenance automatically and lets you focus on the business lifecycle needs like downsampling and data retention. Try out our new Serverless offering and learn more about the possibilities of data stream lifecycle. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. 
KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Data lifecycle management evolution in Elasticsearch Configuring data stream lifecycle Updating the configured lifecycle Implementation details Migrating from ILM to data stream lifecycle Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Simplifying data lifecycle management for data streams - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/data-lifecycle-simplified-for-data-streams","meta_description":"Elasticsearch data streams can now be managed by a data stream property called lifecycle. Learn to set up and update a data stream lifecycle here."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Indexing OneLake data into Elasticsearch - Part 1 Learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. Integrations Ingestion How To GL By: Gustavo Llermaly On January 23, 2025 Part of Series Indexing OneLake data into Elasticsearch Check out the different ways to ingest data into Elasticsearch and dive into practical examples to try something new. Elasticsearch is packed with new features to help you build the best search solutions for your use case. Start a free trial now. OneLake is a tool that allows you to connect to different Microsoft data sources like Power BI , Data Activator, and Data factory, among others. It enables centralization of data in DataLakes, large-volume repositories that support comprehensive data storage, analysis, and processing. In this article, we’ll learn to configure OneLake, consume data using Python and index documents in Elasticsearch to then run semantic searches. Sometimes you would want to run searches across unstructured data, and structured from different sources and software providers, and create visualizations with Kibana. For this kind of task indexing the documents in Elasticsearch as a central repository becomes extremely helpful. For this example, we’ll use a fake company called Shoestic, an online shoe store. We have the list of products in a structured file (CSV) while some of the products’ datasheets are in an unstructured format (DOCX). The files are stored in OneLake. You can find a Notebook with the complete example (including test documents) here . Steps OneLake initial configuration Connect to OneLake using Python Indexing documents Queries OneLake initial configuration OneLake architecture can be summarized like this: To use OneLake and Microsoft Fabric, we’ll need an Office 365 account. 
If you don’t have one, you can create a trial account here . Log into Microsoft Fabric using your account. Then, create a workspace called \"ShoesticWorkspace\". Once inside the newly created workspace, create a Lakehouse and name it \"ShoesticDatalake\". The last step will be creating a new folder inside “Files”. Click on “new subfolder” and name it \"ProductsData\". Done! We're ready to begin ingesting our data. Connect to OneLake using Python With our OneLake configured, we can now prepare the Python scripts. Azure has libraries to handle credentials and communicate with OneLake. Installing dependencies Run the following in the terminal to install dependencies The \"azure-identity azure-storage-file-datalake\" library lets us interact with OneLake while \"azure-cli\" access credentials and grant permissions. To read the files’ content to later index it to Elasticsearch, we use python-docx. Saving Microsoft Credentials in our local environment We’ll use \"az login\" to enter our Microsoft account and run: The flag \" --allow-no-subscriptions\" allows us to authenticate to Microsoft Azure without an active subscription. This command will open a browser window in which you’ll have to access your account and then select your account’s subscription number. We’re now ready to start writing the code! Create a file called onelake.py and add the following: _onelake.py_ Uploading files to OneLake In this example, we’ll use a CSV file and some .docx files with info about our shoe store products. Though you can upload them using the UI, we’ll do it with Python. Download the files here . We’ll place the files in a folder /data next to a new python script called upload_files.py : Run the upload script The result should be: Now that we have the files ready, let’s start analyzing and searching our data with Elasticsearch! Indexing documents We’ll be using ELSER as the embedding provider for our vector database so we can run semantic queries. We choose ELSER because it is optimized for Elasticsearch, outperforming most of the competition in out-of-domain retrieval , which means using the model as it is, without fine tuning it for your own data. Configuring ELSER Start by creating the inference endpoint: While loading the model in the background, you can get a 502 Bad Gateway error if you haven’t used ELSER before. In Kibana, you can check the model status at Machine Learning > Trained Models . Wait until the model is deployed before proceeding to the next steps. Index data Now, since we have both structured and unstructured data, we’ll use two different indices with different mappings as well in the Kibana DevTools Console . For our structured sales let’s create the following index: And to index our unstructured data (product datasheets) we'll use: Note: It’s important to use a field with copy_to to also allow running full-text and not just semantic searches on the body field. Reading OneLake files Before we begin, we need to initialize our Elasticsearch client using these commands (with your own Cloud ID and API-key ). Create a python script called indexing.py and add the following lines: Now, run the script: Queries Once the documents have been indexed in Elasticsearch, we can test the semantic queries. In this case, we’ll search for a unique term in some of the products (tag). We’ll run a keyword search against the structured data, and a semantic one against the unstructured data. 1. Keyword search Result: 2. Semantic search: *We excluded embeddings and chunks just for readability. 
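The keyword and semantic queries compared here are elided from the scraped text. Below is a hedged Python sketch of their shape, assuming the datasheet body is mapped as a semantic_text field backed by ELSER as the setup above suggests; the index and field names (shoes-products, shoes-datasheets, tags, semantic_body) and the cloud credentials are illustrative stand-ins, not the blog's exact mappings.

```python
# Hedged sketch of the two searches being compared: exact keyword match on the
# structured product index vs. semantic search on the unstructured datasheets.
# Index names, field names and credentials are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch(cloud_id="<cloud-id>", api_key="<api-key>")

# 1. Keyword search: exact match on a tag in the structured CSV data.
keyword_hits = es.search(
    index="shoes-products",
    query={"term": {"tags": "waterproof"}},
)

# 2. Semantic search: ELSER-backed semantic_text field on the datasheets,
#    matching on meaning rather than exact terms.
semantic_hits = es.search(
    index="shoes-datasheets",
    query={"semantic": {"field": "semantic_body", "query": "shoes that keep water out"}},
)

for hit in keyword_hits["hits"]["hits"] + semantic_hits["hits"]["hits"]:
    print(hit["_index"], hit["_score"])
```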
Result: As you can see, when using the keyword search, we got an exact match to one of the tags and in contrast, when we used semantic search, we got a result that matches the meaning in the description, without needing an exact match. Conclusion OneLake makes it easier to consume data from different Microsoft sources and then indexing these documents Elasticsearch allows us to use advanced search tools. In this first part, we learnt how to connect to OneLake and index documents in Elasticsearch. In part two, we’ll make a more robust solution using the Elastic connector framework. Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Steps OneLake initial configuration Connect to OneLake using Python Installing dependencies Saving Microsoft Credentials in our local environment Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Indexing OneLake data into Elasticsearch - Part 1 - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/onelake-ingesting-data-part-1","meta_description":"Learn to configure OneLake, consume data using Python and index documents into Elasticsearch to then run semantic searches."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Building AI Agents with AI SDK and Elastic Do you keep hearing about AI agents, and aren't quite sure what they are or how to build one in TypeScript (or JavaScript)? 
Join me as I dive into what AI agents are, the possible use cases they can be used for, along with an example Travel Planner Agent built using AI SDK and Elasticsearch. How To CR By: Carly Richmond On March 25, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Do you keep hearing about AI agents, and aren't quite sure what they are or how they connect to Elastic? Here I dive into AI Agents, specifically covering: What is an AI agent? What problems can be solved using AI Agents? An example agent for travel planning, available here in GitHub, using AI SDK , Typescript and Elasticsearch. What is an AI Agent? An AI agent is a software that is able to perform tasks autonomously and take actions on behalf of a human leveraging Artificial Intelligence. It achieves this by combining one or more LLMs with tools (or functions) that you define to perform particular actions. Example actions in these tools could be: Extracting information from databases, sensors, APIs or search engines such as Elasticsearch. Performing complex calculations whose results can be summarized by the LLM. Making key decisions based on various data inputs quickly. Raising necessary alerts and feedback based on the response. What can be done with them? AI Agents could be leveraged for many different use cases in numerous domains based on the type of agent you build. Possible examples include: A utility-based agent to evaluate actions and make recommendations to maximize the gain, such as to suggest films and series to watch based on a person's prior watching history. Model-based agents that make real-time decisions based on input from sensors, such as self-driving cars or automated vacuum cleaners. Learning agents that combine data and machine learning to identify patterns and exceptions in cases such as fraud detection. Utility agents that recommend investment decisions based on a person's risk market and existing portfolio to maximize their return. With my former finance hat on this could expedite such decisions if accuracy, reputational risk and regulatory factors are carefully weighted. Simple chatbots, as seen today, that can access our account information and answer basic questions using language. Example: Travel Planner To better understand what these agents can do, and how to build one using familiar web technologies, let's walk through a simple example of a travel planner written using AI SDK , Typescript and Elasticsearch. Architecture Our example comprises of 5 distinct elements: A tool, named weatherTool that pulls weather data for the location specified by the questioner from Weather API . A fcdoTool tool that provides the current travel status of the destination from the GOV.UK Content API . The flight information is pulled from Elasticsearch using a simple query in tool flightTool . All of the above information is then passed to LLM GPT-4 Turbo . Model choice When building your first AI agent, it can be difficult to figure out which is the right model to use. Resources such as the Hugging Face Open LLM Leaderboard are a good start. But for tool usage guidance you can also check out the Berkeley Function-Calling Leaderboard . 
In our case, AI SDK specifically recommends using models with strong tool calling capabilities such as gpt-4 or gpt-4-turbo in their Prompt Engineering documentation . Selecting the wrong model, as I found at the start of this project, can lead to the LLM not calling multiple tools in the way you expect, or even compatibility errors as you see below: Prerequisites To run this example, please ensure the prerequisites in the repository README are actioned. Basic Chat Assistant The simplest AI agent that you can create with AI SDK will generate a response from the LLM without any additional grounding context. AI SDK supports numerous JavaScript frameworks as outlined in their documentation. However the AI SDK UI library documentation lists varied support for React, Svelte, Vue.js and SolidJS, with many of the tutorials targeting Next.js. For this reason, our example is written with Next.js. The basic anatomy of any AI SDK chatbot uses the useChat hook to handle requests from the backend route, by default /api/chat/ : The page.tsx file contains our client-side implementation in the Chat component, including the submission, loading and error handling capabilities exposed by the useChat hook . The loading and error handling functionality are optional, but recommended to provide an indication of the state of the request. Agents can take considerable time to respond when compared to simple REST calls, meaning that it's important to keep a user updated on state and prevent key mashing and repeated calls. Because of the client interactivity of this component, I use the use client directive to make sure the component is considered part of the client bundle: The Chat component will maintain the user input via the input property exposed by the hook, and will send the response to the appropriate route on submission. I have used the default handleSubmit method, which will invoke the /ai/chat/ POST route. The handler for this route, located in /ai/chat/route.ts , initializes the connection to the gpt-4-turbo LLM using the OpenAI provider : Note that the above implementation will pull the API key from the environment variable OPENAI_API_KEY by default. If you need to customize the configuration of the openai provider, use the createOpenAI method to override the provider settings. With the above routes, a little help from Showdown to format the GPT Markdown output as HTML, and a bit of CSS magic in the globals.css file, we end up with a simple responsive UI that will generate an itinerary based on the user prompt: Adding tools Adding tools to AI agents is basically creating custom functions that the LLM can use to enhance the response it generates. At this stage I shall add 3 new tools that the LLM can choose to use in generation of an itinerary, as shown in the below diagram: Weather tool While the generated itinerary is a great start, we may want to add additional information that the LLM was not trained on, such as weather. This leads us to write our first tool that can be used not only as input to the LLM, but additional data that allows us to adapt the UI. The created weather tool, for which the full code is shown below, takes a single parameter location that the LLM will pull from the user input. The schema attribute accepts the parameter object using the TypeScript schema validation library Zod and ensures that the correct parameter types are passed. The description attribute allows you to define what the tool does to help the LLM decide if it wants to invoke the tool. 
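Before looking at the execute logic, here is a minimal sketch of the basic chat wiring described earlier in this section: the client-side Chat component built on the useChat hook, and the /api/chat route handler that streams gpt-4-turbo's response back to it. Import paths, helper names and the markup are illustrative and vary between AI SDK versions; this is not the linked repository's actual code.

```tsx
// app/page.tsx: a minimal client component wired to the default /api/chat route
'use client';

import { useChat } from '@ai-sdk/react'; // exported from 'ai/react' in earlier AI SDK versions

export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, error } = useChat();

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </div>
      ))}
      {isLoading && <p>Planning your trip...</p>}
      {error && <p>Something went wrong: {error.message}</p>}
      <form onSubmit={handleSubmit}>
        <input value={input} onChange={handleInputChange} placeholder="Where would you like to go?" />
      </form>
    </div>
  );
}
```

```ts
// app/api/chat/route.ts: server route handler that streams the LLM response
import { openai } from '@ai-sdk/openai';
import { streamText } from 'ai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = streamText({
    model: openai('gpt-4-turbo'), // reads OPENAI_API_KEY from the environment by default
    system: 'You are a travel planner. Produce a concise itinerary for the requested destination.',
    messages,
  });

  // The response helper is named differently across AI SDK versions
  // (for example toAIStreamResponse() in older releases).
  return result.toDataStreamResponse();
}
```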
You may have guessed that the execute attribute is where we define an asynchronous function with our desired tool logic. Specifically, the location to send to the weather API is passed to our tool function. The response is then transformed into a single JSON object that can be shown on the UI, and also used to generate the itinerary. Given we are only running a single tool at this stage, we don't need to consider sequential or parallel flows. It's simply the case of adding the tools property to the streamText method that handles the LLM output in the original api/chat route: The tool output is provided alongside the messages, which allows us to provide a more complete experience for the user. Each message contains a parts attribute that contains type and state properties. Where these properties are of value tool-invocation and result respectively, we can pull the returned results from the toolInvocation attribute and show them as we wish. The page.tsx source is changed to show the weather summary alongside the generated itinerary: The above will provide the following output to the user: FCO tool The power of AI agents is that the LLM can choose to trigger multiple tools to source relevant information when generating the response. Let's say we want to check the travel guidance for the destination country. A new tool fcdoGuidance , as per the below code, can trigger an API call to the GOV.UK Content API : You will notice that the format is very similar to the weather tool discussed previously. Indeed, to include the tool into the LLM output it's just a case of adding to the tools property and amending the prompt in the /api/chat route: Once the components showing the output for the tool are added to the page, the output for a country where travel is not advised should look something like this: LLMs that support tool calling have the choice not to call a tool unless it feels the need. With gpt-4-turbo both of our tools are being called in parallel. However, prior attempts using llama3.1 would result in a single model being called depending on the input. Flight information tool RAG, or Retrieval Augmented Generation, refers to software architectures where documents from a search engine or database is passed as the context to the LLM to ground the response to the provided set of documents. This architecture allows the LLM to generate a more accurate response based on data it has not been trained on previously. While Agentic RAG processes the documents using a defined set of tools, or alongside vector or hybrid search, it's also possible to utilize RAG as part of a complex flow with traditional lexical search as we do here. To pass the flight information alongside the other tools to the LLM, a final tool flightTool pulls outbound and inbound flights using the provided source and destination from Elasticsearch using the Elasticsearch JavaScript client : This example makes use of the Multi search API to pull the outbound and inbound flights in separate searches, before pulling out the documents using the extractFlights utility method. To use the tool output, we need to amend our prompt and tool collection once more in /ai/chat/route.ts : With the final prompt, all 3 tools will be called to generate an itinerary including flight options: Summary If you weren't 100% confident about what AI agents are, now you do! We've covered that using a simple travel planner example using AI SDK, Typescript and Elasticsearch. 
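For readers who have not opened the repository, here is a rough sketch of how a weather-style tool and an Elasticsearch-backed flight lookup might be defined and registered with streamText. The weather URL, index name, field names and option keys are placeholders or version-dependent assumptions, not the actual code from the linked project.

```ts
import { tool, streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';
import { Client } from '@elastic/elasticsearch';

const es = new Client({
  node: process.env.ELASTICSEARCH_URL!,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY! },
});

// Tool 1: fetch current weather for the location the LLM extracts from the prompt.
const weatherTool = tool({
  description: 'Get the current weather for a destination city',
  // Zod schema for the tool's input (the option name differs across AI SDK versions).
  parameters: z.object({ location: z.string().describe('City to look up') }),
  execute: async ({ location }) => {
    // Placeholder endpoint; substitute the real Weather API call here.
    const res = await fetch(`https://example-weather.invalid/current?q=${encodeURIComponent(location)}`);
    return res.json();
  },
});

// Tool 2: pull outbound and inbound flights from Elasticsearch in a single msearch call.
const flightTool = tool({
  description: 'Find outbound and inbound flights between two cities',
  parameters: z.object({ origin: z.string(), destination: z.string() }),
  execute: async ({ origin, destination }) => {
    const { responses } = await es.msearch({
      searches: [
        { index: 'flights' }, // hypothetical index and fields
        { size: 3, query: { bool: { filter: [{ term: { origin } }, { term: { destination } }] } } },
        { index: 'flights' },
        { size: 3, query: { bool: { filter: [{ term: { origin: destination } }, { term: { destination: origin } }] } } },
      ],
    });
    return responses;
  },
});

// Inside the /api/chat route handler: register both tools so the LLM can call them as needed.
export async function planTrip(messages: any[]) {
  return streamText({
    model: openai('gpt-4-turbo'),
    messages,
    tools: { weatherTool, flightTool },
  });
}
```

Each tool's output also comes back in the message parts, which is how the UI described above can render the weather summary and travel guidance alongside the generated itinerary.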
It would be possible to extend our planner to add other sources, allow the user to book the trip along with tours, or even generate image banners based on the location (for which support in AI SDK is currently experimental ). If you haven't dived into the code yet, check it out here ! Resources AI SDK Core Documentation AI SDK Core > Tool Calling Elasticsearch JavaScript Client Travel Planner AI Agent | GitHub Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to What is an AI Agent? What can be done with them? Example: Travel Planner Architecture Model choice Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Building AI Agents with AI SDK and Elastic - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ai-agents-ai-sdk-elasticsearch","meta_description":"Do you keep hearing about AI agents, and aren't quite sure what they are or how to build one in TypeScript (or JavaScript)? Join me as I dive into what AI agents are, the possible use cases they can be used for, along with an example Travel Planner Agent built using AI SDK and Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Elasticsearch open Inference API adds support for Cohere’s Rerank 3 model “Learn about Cohere reranking, how to use Cohere's Rerank 3 model with the Elasticsearch open inference API and Elastic's roadmap for semantic reranking.” Integrations Vector Database Generative AI How To SC MH By: Serena Chou and Max Hniebergall On April 11, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. 
Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Cohere's Rerank 3 model rerank-english-v3.0 is now available in their Rerank endpoint . As the only vector database included in Cohere’s Rerank 3 launch, Elasticsearch has integrated seamless support for this new model into our open Inference API. So briefly, what is reranking? Rerankers take the ‘top n’ search results from existing vector search and keyword search systems, and provide a semantic boost to those results. With good reranking in place, you have better ‘top n’ results without requiring you to change your model or your data indexes – ultimately providing better search results you can send to large language models (LLMs) as context. Recently, we collaborated with the Cohere team to make it easy for Elasticsearch developers to use Cohere’s embeddings (available in Elasticsearch 8.13 and Serverless!). It is a natural evolution to include Cohere’s incredible reranking capabilities to unlock all of the tools necessary for true refinement of results past the first-stage of retrieval. Cohere’s Rerank 3 model can be added to any existing Elasticsearch retrieval flow without requiring any significant code changes. Given Elastic’s vector database and hybrid search capabilities, users can also bring embeddings from any 3rd party model to Elastic, to use with Rerank 3. Elastic’s approach to hybrid search When looking to implement RAG (Retrieval Augmented Generation), the strategy for retrieval and reranking is a key optimization for customers to ground LLMs and achieve accurate results. Customers have trusted Elastic for years with their private data, and are able to leverage several first-stage retrieval algorithms (e.g. for BM25/keyword, dense, and sparse vector retrieval). More importantly, most real-world search use cases benefit from hybrid search which we have supported since Elasticsearch 8.9 . For mid-stage reranking, we also offer native support for Learning To Rank and query rescore . In this walkthrough, we will focus on Cohere’s last stage reranking capabilities, and will cover Elastic’s mid stage reranking capabilities in a subsequent blog post! Cohere’s approach to reranking Cohere has seen phenomenal results with their new Rerank model. In the testing, Cohere is reporting that reranking models in particular benefit from long context. Chunking for model token limits is a necessary constraint when preparing your document for dense vector retrieval. But with Cohere’s approach for reranking, a considerable benefit to reranking can be seen based on context contained in the full document, rather than a specific chunk within the document. Rerank has a 4k token limit to enable the input of more context to unlock the full relevance benefits of incorporating this model into your Elasticsearch based search system. (i) General retrieval based on BEIR benchmark; accuracy measured as nDCG@10 (ii) Code retrieval based on 6 common code benchmarks; accuracy measured as nDCG@10 (iii) Long context retrieval based on 7 common benchmarks; accuracy measured as nDCG@10 (iv) Semi-structured (JSON) retrieval based on 4 common benchmarks; accuracy measured as nDCG@10 If you’re interested in how to chunk with LangChain and LlamaIndex , we provide chat application reference code, integrations and more in Search Labs and our open source repository . 
Alternatively, you can leverage Elastic’s passage retrieval capabilities and chunk with ingest pipelines . Building a RAG implementation with Elasticsearch and Cohere Now that you have a general understanding of how these capabilities can be leveraged, let’s jump into an example on building a RAG implementation with Elasticsearch and Cohere. You'll need a Cohere account and some working knowledge of the Cohere Rerank endpoint . If you’re intending to use Cohere’s newest generative model Command R+ familiarize yourself with the Chat endpoint . In Kibana , you'll have access to a console for you to input these next steps in Elasticsearch even without an IDE set up. If you prefer to use a language client - you can revisit these steps in the provided guide . Elasticsearch vector database In an earlier announcement, we had some steps to get you started with the Elasticsearch vector database. You can review the steps to cover ingesting a sample books catalog, and generate embeddings using Cohere’s Embed capabilities by reading the announcement . Alternatively, if you prefer we also provide a tutorial and Jupyter notebook to get you started on this process. Cohere reranking The following section assumes that you’ve ingested data and have issued your first search. This will give you a baseline as to how the search results are ranked with your first dense vector retrieval. The previous announcement concluded with a query issued against the sample books catalog, and, and generated the following results in response to the query string “Snow”. These results are returned in descending order of relevance. You’ll next want to configure an inference endpoint for Cohere Rerank by specifying the Rerank 3 model and API key. Once this inference endpoint is specified, you’ll now be able to rerank your results by passing in the original query used for retrieval, “Snow” along with the documents we just retrieved with the kNN search. Remember, you can repeat this with any hybrid search query as well! To demonstrate this while still using the dev console, we’ll do a little cleanup on the JSON response above. Take the hits from the JSON response and form the following JSON for the input , and then POST to the cohere_rerank endpoint we just configured. And there you have it, your results have been reranked using Cohere's Rerank 3 model. The books corpus that we used to illustrate these capabilities does not contain large passages, and is a relatively simple example. When instrumenting this for your own search experience, we recommend that you follow Cohere’s approach to populate your input with the context from the full documents returned from the first retrieved result set, not just a retrieved chunk within the documents. Elasticsearch’s accelerated roadmap to semantic reranking and retrievers In upcoming versions of Elasticsearch we will continue to build seamless support for mid and final stage rerankers. Our end goal is to enable developers to have the ability to use semantic reranking to improve the results from any search whether it is BM25, dense or sparse vector retrieval, or a combination with hybrid retrieval. To provide this experience, we are building a concept called retrievers into the query DSL. Retrievers will provide an intuitive way to execute semantic reranking, and will also enable direct execution of what you’ve configured in the open inference API in the Elasticsearch stack without relying on you to execute this in your application logic. 
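For contrast with that future retriever-based flow, here is roughly what the two-step approach described above looks like today: configure the cohere_rerank inference endpoint, then post the original query and the first-stage hits to it. The post itself uses the Kibana Dev Tools console; this sketch uses the JavaScript client's low-level request method, and the request bodies reflect my reading of the open inference API, so check them against the current documentation.

```ts
import { Client } from '@elastic/elasticsearch';

const es = new Client({
  node: process.env.ELASTICSEARCH_URL!,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY! },
});

// 1) Configure an inference endpoint backed by Cohere's Rerank 3 model.
await es.transport.request({
  method: 'PUT',
  path: '/_inference/rerank/cohere_rerank',
  body: {
    service: 'cohere',
    service_settings: {
      api_key: process.env.COHERE_API_KEY,
      model_id: 'rerank-english-v3.0',
    },
  },
});

// 2) Rerank: pass the original query ("Snow") plus the text of the hits
//    returned by the first-stage kNN (or hybrid) search.
const reranked = await es.transport.request({
  method: 'POST',
  path: '/_inference/rerank/cohere_rerank',
  body: {
    query: 'Snow',
    input: [
      'Snow Crash: a sword-wielding hacker navigates a near-future metaverse...', // illustrative hit text
      'The Snow Queen: a fairy tale quest across a frozen land...',
    ],
  },
});

console.log(reranked);
```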
When incorporating the use of retrievers in the earlier dense vector example, this is how different the reranking experience can be: (i) Elastic’s roadmap: The indexing step is simplified with the addition of Elastic’s future capabilities to automatically chunk indexed data (ii) Elastic’s roadmap: The kNN retriever specifies the model (in this case Cohere’s Rerank 3) that was configured as an inference endpoint (iii) Cohere’s roadmap: The step between sending the resulting data to Cohere’s Command R+ will benefit from a planned feature named extractive snippets which will enable the user to return a relevant chunk of the reranked document to the Command R+ model This was our original kNN dense vector search executed on the books corpus to return the first set of results for “Snow”. As explained in this blog, there are a few steps to retrieve the documents and pass on the correct response to the inference endpoint. At the time of this publication, this logic should be handled in your application code. In the future, retrievers can be configured to use the Cohere rerank inference endpoint directly within a single API call. In this case, the kNN query is exactly the same as my original, but the cleansing of the response before input to the rerank endpoint will no longer be a necessary step. A retriever will know that a kNN query has been executed and seamlessly rerank using the Cohere rerank inference endpoint specified in the configuration. This same principle can be applied to any search, BM25, dense, sparse and hybrid. Retrievers as an enabler of great semantic reranking is on our active and near term roadmap. Cohere’s generative model capabilities Now you’re ready with a semantically reranked set of documents that can be used to ground the responses for the large language model of your choice! We recommend Cohere’s newest generative model Command R+ . When building the full RAG pipeline, in your application code you can easily issue a command to Cohere’s Chat API with the user query and the reranked documents. An example of how this might be achieved in your Python application code can be seen below: This integration with Cohere is offered in Serverless and soon will be available to try in a versioned Elasticsearch release either on Elastic Cloud or on your laptop or self-managed environment. We recommend you use our Elastic Python client v0.2.0 against your Serverless project to get started! Happy reranking! Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Elastic’s approach to hybrid search Cohere’s approach to reranking Building a RAG implementation with Elasticsearch and Cohere Elasticsearch vector database Cohere reranking Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Elasticsearch open Inference API adds support for Cohere’s Rerank 3 model - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-cohere-rerank","meta_description":"“Learn about Cohere reranking, how to use Cohere's Rerank 3 model with the Elasticsearch open inference API and Elastic's roadmap for semantic reranking.”"} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Faster integration tests with real Elasticsearch Learn how to make your automated integration tests for Elasticsearch faster using various techniques for data initialization and performance improvement. How To PP By: Piotr Przybyl On November 13, 2024 Part of Series Integration tests using Elasticsearch Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In part 1 of this series , we explored how writing integration tests that allow us to test our software against real Elasticsearch isn't rocket science. This post will demonstrate various techniques for data initialization and performance improvements. Integration tests: Different purposes, different characteristics Once the testing infrastructure is set up, and the project is already using an integration test framework for at least one test (like we use Testcontainers in our demo project ), adding more tests becomes easy because it doesn't require mocking. For example, if you need to verify that the number of books fetched for the year 1776 is correct, all you have to do is add a test like this: This is all that's needed, provided the dataset used to initialize Elasticsearch already contains the relevant data. The cost of creating such tests is low, and maintaining them is nearly effortless (as it mostly involves updating the Docker image version). No software exists in a vacuum Today, every piece of software we write is connected to other systems. 
While tests using mocks are excellent for verifying the behavior of the system we're building, integration tests give us confidence that the entire solution works as expected and will continue to do so. This can make it tempting to add more and more integration tests. Integration tests have their costs However, integration tests aren't free. Due to their nature — going beyond in-memory-only setups — they tend to be slower, costing us execution time. It's crucial to balance the benefits (confidence from integration tests) with the costs (test execution time and billing, often translating directly to an invoice from cloud vendors). Instead of limiting the number of tests because they are slow, we can make them run faster. This way, we can maintain the same execution time while adding more tests. The rest of this post will focus on how to achieve that. Let's revisit the example we've been using so far, as it is painfully slow and needs optimization. For this and subsequent experiments, I assume the Elasticsearch Docker image is already pulled, so it won't impact the time. Also, note that this is not a proper benchmark but more of a general guideline. Tests with Elasticsearch can also benefit from performance improvements Elasticsearch is often chosen as a search solution because it performs well. Developers usually take great care to ensure production code is optimized, especially in critical paths. However, tests are often seen as less critical, leading to slow tests that few people want to run. But it doesn't have to be this way. With some simple technical tweaks and a shift in approach, integration tests can run much faster. Let's start with the current integration test suite. It functions as intended, but with only three tests, it takes five and a half minutes — roughly 90 seconds per test — when you run time ./mvnw test '-Dtest=*IntTest*' . Please note your results may vary depending on your hardware, internet speed, etc. Batch me if you can In integration test suites, many performance issues stem from inefficient data initialization. While certain processes may be natural or acceptable in a production flow (such as data arriving as users enter it), these processes may not be optimal for tests, where we need to import data quickly and in bulk. The dataset in our example is about 50 MiB and contains nearly 81,000 valid records. If we process and index each record individually, we end up making 81,000 requests just to prepare data for each test. Instead of a naive loop that indexes documents one by one (as in the main branch): We should use a batch approach , such as with BulkIngester . This allows concurrent indexing requests, with each request sending multiple documents, significantly reducing the number of requests: This simple change reduced the overall testing time to around 3 minutes and 40 seconds, or roughly 73 seconds per test. While this is a nice improvement, we can push further. Stay local We reduced test duration in the previous step by limiting network round-trips. Could there be more network calls we can eliminate without altering the tests themselves? Let's review the current situation: Before each test, we fetch test data from a remote location repeatedly. While the data is fetched, we send it to the Elasticsearch container in bulk. We can improve performance by keeping the data as close to the Elasticsearch container as possible. And what's closer than inside the container itself? 
One way to import data into Elasticsearch en masse is the _bulk REST API, which we can call using curl . This method allows us to send a large payload (e.g., from a file) written in newline-delimited JSON format. The format looks like this: Make sure the last line is empty. In our case, the file might look like this: Ideally, we can store this test data in a file and include it in the repository, for example, in src/test/resources/ . If that's not feasible, we can generate the file from the original data using a simple script or program. For an example, take a look at CSV2JSONConverter.java in the demo repository. Once we have such a file locally (so we have eliminated network calls to obtain the data), we can tackle the other point, which is: copying the file from the machine where the tests are running into the container running Elasticsearch. It's easy, we can do that using a single method call, withCopyToContainer , when defining the container. So after the change it looks like this: The final step is making a request from within the container to send the data to Elasticsearch. We can do this with curl and the _bulk endpoint by running curl inside the container. While this could be done in the CLI with docker exec , in our @BeforeEach , it becomes elasticsearch.execInContainer , as shown here: Reading from the top, we're making this way a POST request to the _bulk endpoint (and wait for the refresh to complete), authenticated as the user elastic with the default password, accepting the auto-generated and self-signed certificate (which means we don't have to disable SSL/TLL), and the payload is the content of the /tmp/books.ndjson file, which was copied to the container when it was starting. This way, we reduce the need for frequent network calls. Assuming the books.ndjson file is already present on the machine running the tests, the overall duration is reduced to 58 seconds. Less (often) is more In the previous step, we reduced network-related delays in our tests. Now, let's address CPU usage. There's nothing wrong with relying on @Testcontainers and @Container annotations. The key, though, is to understand how they work: when you annotate an instance field with @Container , Testcontainers will start a fresh container for each test. Since container startup isn't free (it takes time and resources), we pay this cost for every test. Starting a fresh container for each test is necessary in some scenarios (e.g., when testing system start behavior), but not in our case. Instead of starting a new container for each test, we can keep the same container and Elasticsearch instance for all tests, as long as we properly reset the container's state before each test. First, make the container a static field. Next, before creating the books index (by defining the mapping) and populating it with documents, delete the existing index if it exists from a previous test. For this reason the setupDataInContainer() should start with something like: As you can see, we can use curl to execute almost any command from within the container. This approach offers two significant advantages: Speed : If the payload (like books.ndjson ) is already inside the container, we eliminate the need to repeatedly copy the same data, drastically improving execution time. Language Independence : Since curl commands aren't tied to the programming language of the tests, they are easier to understand and maintain, even for those who may be more familiar with other tech stacks. 
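To picture what that setup actually sends, here is an illustrative fragment of a books.ndjson payload (an action line followed by a source line per document, with the trailing empty line the _bulk API requires) and the kind of curl call executed inside the container. The document fields are made up for illustration, and the credentials shown are the container image defaults (typically elastic / changeme); adjust both to your setup.

```ndjson
{ "index": { "_index": "books", "_id": "1" } }
{ "title": "Common Sense", "author": "Thomas Paine", "year": 1776 }
{ "index": { "_index": "books", "_id": "2" } }
{ "title": "Pride and Prejudice", "author": "Jane Austen", "year": 1813 }

```

```bash
# Run inside the container, e.g. via elasticsearch.execInContainer(...) in @BeforeEach
curl -s -k -u elastic:changeme \
  -H "Content-Type: application/x-ndjson" \
  -X POST "https://localhost:9200/_bulk?refresh=wait_for" \
  --data-binary "@/tmp/books.ndjson"
```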
While using raw curl calls isn't ideal for production code, it's an effective solution for test setup. Especially when combined with a single container startup, this method reduced my tests execution time to around 30 seconds. It's also worth noting that in the demo project (branch data-init ), which currently has only three integration tests, roughly half of the total duration is spent on container startup. After the initial warm-up, individual tests take about 3 seconds each. Consequently, adding three more tests won't double the overall time to another 30 seconds, but will only increase it by roughly 9-10 seconds. The test execution times, including data initialization, can be observed in the IDE: Summary In this post, I demonstrated several improvements for integration tests using Elasticsearch: Integration tests can run faster without changing the tests themselves — just by rethinking data initialization and container lifecycle management. Elasticsearch should be started only once, rather than for every test. Data initialization is most efficient when the data is as close to Elasticsearch as possible and transmitted efficiently. Although reducing the test dataset size is an obvious optimization (and not covered here), it's sometimes impractical. Therefore, we focused on demonstrating technical methods instead. Overall, we significantly reduced the test suite's duration — from 5.5 minutes to around 30 seconds — lowering costs and speeding up the feedback loop. In the next post , we'll explore more advanced techniques to further reduce execution time in Elasticsearch integration tests. Let us know if your case is using one of the techniques described above or if you have questions on our Discuss forums and the community Slack channel . Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Integration tests: Different purposes, different characteristics No software exists in a vacuum Integration tests have their costs Tests with Elasticsearch can also benefit from performance improvements Batch me if you can Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Faster integration tests with real Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-integration-tests-faster","meta_description":"Learn how to make your automated Elasticsearch integration tests faster using various techniques for data initialization and performance improvement."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard This blog compares ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard. ML Research AP SC By: Aris Papadopoulos and Serena Chou On October 7, 2024 Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard ELSER (Elastic Learned Sparse EncodeR) is Elastic’s transformer language model for semantic search and is a popular model for anyone interested in utilizing machine learning to elevate the relevance of a traditional search experience or to power a newly designed Retrieval Augmented Generation (RAG) application. ELSER v2 remains in the top-10 models on MTEB for Retrieval when grouping together the multiple flavors of the same competitor family. It is also one of the very few models in the top-10 that was released in 2023, with the majority of the competition having been released in 2024. ELSER timeline First introduced in June of 2023 , and with a second version made generally available in November 2023 , ELSER from day one has been designed to minimize the barrier to semantic search, while significantly improving search relevance, by capturing the context, semantic relationships and user intent in natural language. Among other use cases, this is an incredibly intuitive and valuable addition to RAG applications, as surfacing the most relevant results is critical for generative applications to produce accurate responses based on your own private data and to minimize the probability of hallucinations. ELSER can be used in tandem with the highly scalable, distributed Elasticsearch vector database, the open Inference API, native model management and the full power of the Search AI platform. ELSER is the component that provides the added value of state-of-the-art semantic search for a wide range of use cases and organizations. Because it is a sparse vector model (this will be explained further later in the blog), it is optimized for the Elasticsearch platform and it achieves superior relevance out of domain. When ELSER was first released, it outperformed the competition in out-of-domain retrieval, i.e. without you having to retrain/fine-tune a model on own data, as measured by the industry standard BEIR benchmark. This was a testament to Elastic’s commitment to democratize AI search. 
ELSER v2 was released in October 2023 and introduced significant performance gains on your preferred price point of operation by adding optimizations for Intel CPUs and by introducing token pruning . Because we know that the other equally important part of democratizing AI search is reducing its cost. As a result we provide two model artifacts: one optimized for Intel CPUs (leveraged by Elastic Cloud) and a cross-platform one. NDCG@10 for BEIR data sets for BM25 and ELSER V2 ELSER customer reception Customers worldwide leverage ELSER today in production search environments, as a testament to the ease of use and the immediate relevance boost that is achievable in a few clicks. Examples of ELSER customer success stories include Consensus , Georgia State University and more. When these customers test ELSER in pilots or initial prototypes, a common question is how does ELSER compare with relevance that can be achieved with traditional keyword (i.e.BM25) retrieval or with the use of a number of other models, including for example OpenAI’s text-embedding-ada-002. To provide the relevant comparison insights, we published a holistic evaluation of ELSER (the generally available version) on MTEB (v1.5.3). MTEB is a collection of tasks and datasets that have been carefully chosen to give a solid comparison framework between NLP models. It was introduced with the following motivation: “Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding Benchmark (MTEB).” ( source paper ). MTEB Leaderboard comparison - What you need to know For a meaningful comparison on MTEB, a number of considerations come into play. First, the number of parameters. The more parameters a model has, the greater its potential, but it also becomes far more resource intensive and costly. Models of similar size (number of parameters) are best for comparison, as models with vastly different numbers of parameters typically serve different purposes in a search architecture. Second, one of the aims of MTEB is to compare models and their variations on a number of different tasks. ELSER was designed specifically to lower the barrier for AI search by offering you state-of-the-art out-of-domain retrieval, so we will focus on the outcomes for the Retrieval task. Retrieval is measured with the ndcg@10 metric. Finally some models appear in multiple flavors, incorporating different numbers of parameters and other differentiations, forming a family. It makes more sense to group them together and compare against the top performer of the family for the task. ELSER on MTEB Leaderboard According to the above, filtering for the classes of up to 250 million parameters (ELSER has 110 million parameters), at the time of writing of this blog and as we are working on ELSER v3, ELSER v2 remains in the top-10 models for Retrieval when grouping together the multiple flavors of the same competitor family. It is also one of the very few models in the top-10 that was released in 2023, with the majority of the competition having been released in 2024. 
The top of the MTEB list for Retrieval (nDCG@10) for models with <250 million parameters. At the time of writing, ELSER ranks top-10 for the retrieval task. It is one of the very few models in the group that was released in 2023, with the vast majority released in 2024. The list, when filtered as mentioned inline, includes more then 80 models (not grouped) at the time of writing. Elastic’s continued investment in ELSER As mentioned previously, ELSER uses a contextualized sparse vector representation, a design choice that gives it the nice properties mentioned before and all the space for gains and feature extensions in future releases that are already in development. This sets it apart on MTEB, as the vast majority of models on the leaderboard are embeddings, i.e. dense vectors. This is why you will notice a much larger number of dimensions in the corresponding MTEB column for ELSER compared with the other models. ELSER extends BERT’s architecture and expands the output embeddings by retaining the masked language model (MLM) head and adapting it to create and aggregate per-token activation distributions for each input sequence. As a result, the number of dimensions is equal to BERT’s vocabulary, only a fraction of which get activated for a given input sequence. The upcoming ELSER v3 model is currently in development, being trained with the additional use of LLM-generated data, new advanced training recipes and other state-of-the-art and novel strategies as well as support for GPU inference. Conclusion The innovation in this space is outpacing many customers' ability to adopt, test and ensure enterprise quality incorporation of new models into their search applications. Many customers lack holistic insight into the metrics and methodology behind the training of the model artifacts, leading to additional delays in adoption. From the very first introduction of our ELSER model, we have provided transparency into our relevance goals, our evaluation approach for improved relevance and the investments into efficient performance of this model on local, self-managed deployments (even those hosted on laptops!) with capabilities to enable scale for large production grade search experiences. Our full results are now published on the MTEB Leaderboard to provide an additional baseline in comparison to new emerging models. In upcoming versions of ELSER we expect to apply new state of the art retrieval techniques, evaluate new use cases for the model itself, and provide additional infrastructure support for fast GPU powered ELSER inference workloads. Stay tuned! Links https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-1 https://www.elastic.co/search-labs/blog/introducing-elser-v2-part-2 https://www.elastic.co/search-labs/blog/may-2023-launch-information-retrieval-elasticsearch-ai-model Report an issue Related content How To July 26, 2024 Serverless semantic search with ELSER in Python: Exploring Summer Olympic games history This blog shows how to fetch information from an Elasticsearch index, in a natural language expression, using semantic search. We will load previous olympic games data set and then use the ELSER model to perform semantic searches. EK By: Essodjolo Kahanam ML Research June 21, 2023 Introducing Elastic Learned Sparse Encoder: Elastic’s AI model for semantic search Learn about the Elastic Learned Sparse Encoder (ELSER), an AI model for high relevance semantic search across domains. 
AP GG By: Aris Papadopoulos and Gilad Gal ML Research October 17, 2023 Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2 Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2. TV QH VK By: Thomas Veasey , Quentin Herreros and Valeriy Khakhutskyy Search Relevance ML Research April 3, 2025 Generating filters and facets using ML Exploring the pros and cons of automating the creation of filters and facets in a search experience using ML models vs the classical hard-coded approach. AL By: Andre Luiz ML Research Python February 5, 2025 Implementing clustering workflows in Elastic to enhance search relevance We demonstrate how to integrate custom clustering models into the Elastic Stack by leveraging OpenAI text-ada-002 vectors, streamlining the workflow within Elastic’s ecosystem. GC KS By: Gus Carlock and Kirti Sodhi Jump to Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard ELSER timeline ELSER customer reception MTEB Leaderboard comparison - What you need to know ELSER on MTEB Leaderboard Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Comparing ELSER for retrieval relevance on the Hugging Face MTEB Leaderboard - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-elser-relevance-mteb-comparison","meta_description":"Discover how Elastic ELSER ranks on the Hugging Face MTEB Leaderboard for retrieval relevance, with insights into the parameters shaping its performance."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to choose the best k and num_candidates for kNN search Learn strategies for selecting the optimal values for `k` and `num_candidates` parameters in kNN search, illustrated with practical examples. How To MK By: Madhusudhan Konda On May 24, 2024 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. How to choose the best k and num_candidates for kNN search? Vector search has emerged as a game-changer in the current generative AI/ML world. It allows us to find similar items based on their semantic meaning rather just exact keyword matches. Elasticsearch's k-Nearest Neighbors (kNN) algorithm is a foundational ML technique for classification and regression tasks. It found a significant place within Elasticsearch's ecosystem with the introduction of vector search capabilities. 
Introduced in Elasticsearch 8.5, kNN based vector search allows users to perform high-speed similarity searches on dense vector fields. Users can find documents in the index \"closest\" to a given vector by leveraging the kNN algorithm using an underlying specified distance metric such as Euclidean or Cosine similarity. This feature marked a pivotal advancement as it is particularly useful in applications requiring semantic search, recommendations and other use cases such as anomaly detection. The introduction of dense vector fields and k-nearest neighbor (kNN) search functionality in Elasticsearch has opened new horizons for implementing sophisticated search capabilities that go beyond traditional text search. This article delves into strategies for selecting the optimal values for k and num_candidates parameters, illustrated with practical examples using Kibana. kNN search query Elasticsearch provides a kNN search option for nearest-neighbors - something like the following: As the snippet shows, the knn query fetches the relevant results for the query in question (having a movie title as \"Good Ugly\") using vector search. The search is conducted in a multi-dimensional space, producing the closest vectors to the given query vector. From the above query, notice two attributes: num_candidates which is the initial pool of candidates to consider and k , the number of nearest neighbors. kNN critical parameters - k and num_candidates To leverage the kNN feature effectively, one requires a nuanced understanding of the two critical parameters: k - the number of global nearest neighbors to retrieve, and num_candidates - the number of candidate neighbors considered for each shard during the search. Choosing the optimal values for the k and num_candidates involves balancing precision, recall, and performance. These parameters play a crucial role to efficiently handle high-dimensional vector spaces commonly found in machine learning applications. The optimal value for k largely depends on the specific use case. For example, if you're building a recommendation system, a smaller k (e.g., 10-20) might be sufficient to provide relevant recommendations. In contrast, for a use case where you'd want clustering or outlier detection capabilities, you might need a larger k . Note that the higher k value can significantly increase both computation and memory usage, especially with large datasets. It's important to test different values of k to find a balance between result relevance and system resource usage. K: Unveiling the closest neighbors We have an option of choosing the k value as per our requirements. Sometimes, setting up a lower k value receives more or less exactly what you want with the exception that a few results might not make it to the final output. However, setting up a higher k value might broaden your search results in numbers, with a caveat that you may receive diversified results at times. Imagine you're searching for a new book in the vast library of recommendations. k , also known as the number of nearest neighbors, determines how many books you'll be presented with. Think of it as the inner circle of your search results. Let's see how setting the lower and higher k values affects the number of books that the query returns. Setting lower K The lower K setting prioritizes extreme precision - meaning we will receive a handful of books that are the most similar to our query vector. This ensures a high degree of relevance to our specific interests. 
This might be ideal if you're searching for a book with a very specific theme or writing style. Setting higher K With a larger K value, we will be fetching a broader exploration result set. Note that the results might not be as tightly focused on your exact query. However, you'll encounter a wider range of potentially interesting books. This approach can be valuable for diversifying your reading list and discovering unexpected gems, perhaps. Whenever we say higer or lower values of k , we mean the actual values depends on multiple factors, such as size of the data sets, available computing power and other factors. In some cases, the k=10 might be a large but in others it might be small too. So, do keep a note of the environmnet that this parateter is expected to operate. The num_candidates attribute: Behind the curtain While k determines the final number of books you see, num_candidates plays a crucial role under the hood. It essentially defines the search space per shard – the initial pool of books in a shard from which the most relevant K neighbors are identified. When we issue the query, we are expected to hint Elasticsearch to run the query amongst top \"x\" number of candidates on each shard. For example, say our books index contains 5000 books evenly distributed amongst five primary shards (i.e., ~1000 books per shard). When we are performing a search, obviously choosing all 1000 documents for each shard is neither a viable nor a correct option. Instead, we will be pick up to say 25 documents (which is our num_candidates ) from the 1000 documents. That amounts to 125 documents as our total search space (5 shards times 25 documents each). We will let the kNN query know to choose the 25 documents from each shard and this number is the num_candidates parameter. When the kNN search is executed, the \"coordinator\" node sends the request query to all of the involved shards. The num_candidates documents from each shard will constitute the search space and the top k documents will be fetched from that space. Say, if k is 3, the top 3 documents out of the 25 candidate documents will be selected in each shard and returned to the coordinator node. That is, the coordinator node will receive 15 documents in total from all the involved nodes. These top 15 documents are then ranked to fetch the global top 3 ( k ==3) documents. The process is depicted in the following figure: Here's what num_candidates means for your search: Setting the lower num_candidates This approach might restrict the search space, potentially missing some relevant books that fall outside the initial exploration set. Think of it as surveying a smaller portion of the library's shelves. Setting the higher num_candidates A higher num_candidates value increases the likelihood of finding the true nearest neighbors within our chosen K. It expands the search space - that is - more number of candidates are considered - and hence leads to a slight increase in search time. So, a higher value generally increases accuracy (as the chance of missing relevant vectors decreases) but at the cost of performance. Balancing precision & performance for kNN parameters The optimal values for k and num_candidates depend on a few factors and specific needs. If we prioritize extreme precision with a smaller set of highly relevant results, a lower k with a moderate num_candidates might be ideal. Conversely, if exploration and discovering unexpected books are your goals, a higher K with a larger num_candidates could be more suitable. 
While there is no hard-and-fast rule to define the \"lower\" or \"higher\" number for the num_candidates , you need to decide this number based on your dataset, computing power and the expected precision. Experimentation to optimize kNN parameters By experimenting with different K and num_candidates combinations and monitoring search results and performance, you can fine-tune your searches to achieve the perfect balance between precision, exploration, and speed. Remember, there's no one-size-fits-all solution – the best approach depends on your unique goals and data characteristics. Practical example: Using kNN for movie recommendations Let's consider an example of movies to create a manual \"simple\" framework for understanding the effect of k and num_candidates attributes while searching for movies. Manual framework Let's understand how we can develop a home grown framework for tweaking the k and num_of_candidates attributes for a kNN search. The mechanics of the framework is as follows: Create a movies index with a couple of dense_vector fields in the mapping to hold our vectorised data. Create an embedding pipeline so each and every movie's title and synopsis fields will be embedded with a multilingual-e5-small model to store vectors. Perform the indexing operation,which goes through the above embedding pipeline. The respective fields will be vectorised Create a search query using kNN feature Tweak the k and num_candidates options as you'd want Let's dig in. Creating an inference pipeline We will need to index data via Kibana - far from ideal - but it will do for this manual framework understanding. However, every movie that gets indexed must have the title and synopsis field vectorised to enable semantic search on our data. We can do this by elegantly creating a inference pipeline processor and attaching it to our batch indexing operation. Let's create an inference pipeline: The inference pipeline movie_embedding_pipeline , as shown above, creates vector fields text embedding for title and synopsis fields. It uses the inbuilt multilingual-e5-small model to create the text embeddings. Creating index mappings We will need to create a mapping with couple of properties as dense_vector fields. The following code snippet does the job: Once the above command gets executed, we have a new movies index with the appropriate dense vector fields, including title_vector.predicted_value and synopsis_vector.predicted_value fields that hold respective vectors. The index mapping parameter was set to false by default up to release 8.10. This has been changed in release 8.11, where the parameter is set to true by default, which makes it unnecessary to specify it. Next step is to ingest the data. Indexing movies We can use _bulk operation to index a set of movies - I'm reusing a dataset that I had created for my Elasticsearch in Action 2nd edition book - which is available here : For completeness, a snippet of the ingestion using the _bulk operation is provided here: Make sure you replace the script with the full dataset. Note that the _bulk operation is suffixed with the pipeline ( ?pipeline=movie_embedding_pipeline ) so the every movie gets passed through this pipeline, thus producing the vectors. As we primed our movies indexed with vector embeddings, it's time to start our experiments on fine tuning k and num_candidates attributes. kNN search As we have vector data in our movies index, we will be using approximate k-nearest neighbor (kNN) search. 
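Since the console snippets from the original walkthrough aren't reproduced here, the setup steps above (the inference pipeline, the mapping, and the bulk ingest through the pipeline) look roughly like this when expressed with the Elasticsearch JavaScript client. The post itself uses Kibana Dev Tools; the inference processor options and sample document shown here are assumptions and may vary by Elasticsearch version.

```ts
import { Client } from '@elastic/elasticsearch';

const es = new Client({
  node: process.env.ELASTICSEARCH_URL!,
  auth: { apiKey: process.env.ELASTICSEARCH_API_KEY! },
});

// Ingest pipeline: one inference processor per field, each writing the embedding
// under <target_field>.predicted_value.
await es.ingest.putPipeline({
  id: 'movie_embedding_pipeline',
  processors: [
    {
      inference: {
        model_id: '.multilingual-e5-small',
        target_field: 'title_vector',
        field_map: { title: 'text_field' },
      },
    },
    {
      inference: {
        model_id: '.multilingual-e5-small',
        target_field: 'synopsis_vector',
        field_map: { synopsis: 'text_field' },
      },
    },
  ],
});

// Mapping with dense_vector fields; multilingual-e5-small produces 384-dimensional vectors.
// On 8.10 and earlier you would also set index: true and a similarity, as noted above.
await es.indices.create({
  index: 'movies',
  mappings: {
    properties: {
      title: { type: 'text' },
      synopsis: { type: 'text' },
      'title_vector.predicted_value': { type: 'dense_vector', dims: 384 },
      'synopsis_vector.predicted_value': { type: 'dense_vector', dims: 384 },
    },
  },
});

// Bulk index the movies through the pipeline so embeddings are generated at ingest time.
await es.bulk({
  pipeline: 'movie_embedding_pipeline',
  operations: [
    { index: { _index: 'movies', _id: '1' } },
    { title: 'The Godfather', synopsis: 'The aging patriarch of a crime family hands control to his reluctant son.' },
    // ...remaining movies from the dataset...
  ],
});
```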
For example, to recommend movies that have a father-son sentiment (\"Father and son\" as the search query), we'll use a kNN search to find the nearest neighbors: In the given example, the query leverages the top-level kNN search option that directly focuses on finding documents closest to a given query vector. Note that on-the-fly query vector generation is achieved by using query_vector_builder rather than query_vector (where you pass in a vector computed outside of Elasticsearch); both the top-level knn search option and the knn search query provide this capability. The script fetches the relevant results based on our search query (which is built using the query_vector_builder block). We are using arbitrary k and num_candidates values, set to 5 and 10 respectively. kNN query attributes The above query has a set of attributes that make up the kNN query. The following information about these attributes will help you understand the query better: The field attribute specifies the field in the index that contains the vector representations of our documents. In this case, title_vector.predicted_value is the field storing the document vectors. The query_vector_builder attribute is where the example significantly diverges from simpler kNN queries. Instead of providing a static query vector, this configuration dynamically generates a query vector using a text embedding model. The model transforms a piece of text (\"Father and son\" in the example) into a vector that represents its semantic meaning. The text_embedding indicates that a text embedding model will be used to generate the query vector. The model_id is the identifier for the pre-trained machine learning model to use; it is the .multilingual-e5-small model in this example. The model_text attribute is the text input that will be converted into a vector by the specified model. Here, it's the words \"Father and son\", which the model will interpret semantically to find similar movie titles. The k is the number of nearest neighbors to retrieve - that is, it determines how many of the most similar documents to return based on the query vector. The num_candidates attribute defines the broader set of candidate documents considered per shard as potential matches, to ensure the final results are as accurate as possible. kNN results Executing the basic kNN search script should get us the top 5 results - for brevity, I'm providing just the list of the movies. As you might expect, The Godfather (both parts) fits the father-and-son bonding theme, while Pulp Fiction shouldn't have been part of the results (though the query is asking about \"bonding\", Pulp Fiction is all about the bonding between a few people). Now that we have a basic framework set up, we can tweak the parameters appropriately and deduce the approximate settings. Before we tweak the settings, let's understand the optimal setting of the k attribute. Choosing the optimal K value Choosing the optimal value of k in k-Nearest Neighbors (kNN) algorithms is crucial for attaining the best possible performance on our dataset with minimal errors. However, there isn't a one-size-fits-all answer, as the best k value can depend on a few factors such as the specifics of our data and what we are trying to predict. 
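As a rough sketch of the query described above (expressed with the Python Elasticsearch client rather than the Kibana console, and not the post's exact snippet), the search could look something like this:

```python
from elasticsearch import Elasticsearch

client = Elasticsearch("http://localhost:9200")  # assumed local cluster; adjust host/auth as needed

# Approximate kNN search: the query vector is built on the fly from the text
# "Father and son" using the .multilingual-e5-small model, with k=5 and num_candidates=10.
response = client.search(
    index="movies",
    knn={
        "field": "title_vector.predicted_value",
        "k": 5,
        "num_candidates": 10,
        "query_vector_builder": {
            "text_embedding": {
                "model_id": ".multilingual-e5-small",
                "model_text": "Father and son",
            }
        },
    },
    source=["title"],
)

for hit in response["hits"]["hits"]:
    print(hit["_score"], hit["_source"]["title"])
```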
To choose an optimal k value, one must create a custom framework with several strategies and considerations. k = 1: Try running the search query with k=1 as a first step. Make sure you change the input query for each run. The query should give you unreliable results, as changing the input query will return incorrect results over time. This mirrors an ML pattern called \"overfitting\", where the model becomes overly reliant on the specific data points in the immediate neighborhood and thus struggles to generalize to unseen examples. k = 5: Run the search query with k=5 and check the predictions. The stability of the search results should ideally improve and you should be getting adequately reliable predictions. You can incrementally increase the value of k - perhaps in steps of 5 or so - until you find that sweet spot where the results for the input queries are pretty much spot on, with fewer errors. You can go to extreme values of k too, for example, pick a higher value of k=50 , as discussed below: k = 50: Increase the k value to 50 and check the search results. The erroneous results will most likely outweigh the actual/expected predictions. This is when you know that you are hitting the hard boundary of the k value. Larger k values lead to an ML pattern called \"underfitting\" - underfitting in kNN happens when the model is too simplistic and fails to capture the underlying patterns in the data. Choosing the optimal num_candidates value The num_candidates parameter plays a crucial role in finding the optimal balance between search accuracy and performance. Unlike k, which directly influences the number of search results returned, num_candidates determines the size of the initial candidate set from which the final k nearest neighbors are selected. As discussed earlier, the num_candidates parameter defines how many nearest neighbors will be selected on each shard. Adjusting this parameter is essential for ensuring that the search process is both efficient and yields high-quality results. num_candidates = Small Value (e.g., 10): Start with a low value (\"low-value-exploration\") for num_candidates as a preliminary step. The aim is to establish a baseline for performance at this stage. As the candidate pool is just a handful of candidates, the search will be fast but might miss relevant results - which leads to poor accuracy. This scenario helps us to understand the minimum threshold where the search quality is noticeably compromised. num_candidates = Moderate Value (e.g., 25): Increase num_candidates to a moderate value (\"moderate-value-exploration\") and observe the changes in search quality and execution time. A moderate number of candidates is likely to improve the accuracy of the results by considering a wider pool of potential neighbors. As the number of candidates increases, so does the resource cost, so keep monitoring the performance metrics closely. However, if the search accuracy increases, the additional computational cost may be justifiable. num_candidates = Step Increase: Continue to incrementally increase num_candidates (incremental-increase-exploration), possibly in steps of 20 or 50 (depending on the size of your dataset). Evaluate whether the additional candidates contribute to a meaningful improvement in search accuracy with each of the increments. There will be a point of diminishing returns where increasing num_candidates further yields little to no improvement in result quality. 
At the same time, as you may have noticed, this will strain our resources and significantly impact performance. num_candidates = High Value (say, 1000, 5000): Experiment with a high value for num_candidates to understand the upper bounds of the impact of choosing higher settings. There's a possibility of your search accuracy stabilizing or degrading slightly due to the inclusion of less relevant candidates, which may dilute the precision of the final k results. Do note that, as discussed, high values of num_candidates will always increase the computational load - thus longer query times and potential resource constraints. Finding the optimal balance We now know how to adjust the k and num_candidates attributes and how our experiments with different settings change the outcome in terms of search accuracy. The goal is to find a sweet spot where the search results are consistently accurate and the performance overhead from processing a large candidate set is manageable. Of course, the optimal value will vary depending on the specifics of our data, the dimensionality of the vectors, and other performance requirements. Wrap up Finding the optimal K value comes down to experiments and trials. You want to use enough neighbors (k on the lower side) to capture the essential patterns but not so many ( k on the higher side) that the model becomes overly influenced by noise or irrelevant details. You also want to tweak the candidates so that the search results are accurate at a given k value. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to How to choose the best kNN search query kNN critical parameters - k and num_candidates K: Unveiling the closest neighbors Setting lower K Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. 
Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to choose the best k and num_candidates for kNN search - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-knn-and-num-candidates-strategies","meta_description":"Learn strategies for selecting the optimal values for `k` and `num_candidates` parameters in kNN search, illustrated with practical examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using Ollama with the Inference API The Ollama API is compatible with the OpenAI API so it's very easy to integrate Ollama with Elasticsearch. Generative AI How To JR By: Jeffrey Rengifo On February 14, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. In this article, we'll learn how to connect local models to the Elasticsearch inference model using Ollama and then ask your documents questions using Playground. Elasticsearch allows users to connect to LLMs using the Open Inference API , supporting providers such as Amazon Bedrock, Cohere, Google AI, Azure AI Studio, HuggingFace - as a service, among others. Ollama is a tool that allows you to download and execute LLM models using your own infrastructure (your local machine/server). Here you can find a list of the available models that are compatible with Ollama. Ollama is a great option if you want to host and test different open source models without having to worry about the different ways each of the models could have to be set up, or about how to create an API to access the model functions as Ollama takes care of everything. Since the Ollama API is compatible with the OpenAI API, we can easily integrate the inference model and create a RAG application using Playground. Prerequisites Elasticsearch 8.17 Kibana 8.17 Python Steps Setting up Ollama LLM server Creating mappings Indexing data Asking questions using Playground Setting up Ollama LLM server We're going to set up a LLM server to connect it to our Playground instance using Ollama. We'll need to: Download and run Ollama. Use ngrok to access your local web server that hosts Ollama over the internet Download and run Ollama To use Ollama, we first need to download it . Ollama offers support for Linux, Windows, and macOS so just download the Ollama version compatible with your OS here. Once Ollama is installed, we can choose a model from this list of supported LLMs. In this example, we'll use the model llama3.2 , a general multilanguage model. In the setup process, you will enable the command line tool for Ollama. Once that’s downloaded you can run the following line: Which will output: Once installed, you can test it with this command: Let's ask a question: With the model running, Ollama enables an API that would run by default on port \"11434\". Let's make a request to that API, following the official documentation : This is the response we got: Note that the specific response for this endpoint is a streaming. Expose endpoint to the internet using ngrok Since our endpoint works in a local environment, it cannot be accessed from another point–like our Elastic Cloud instance–via the internet. 
ngrok allows us to expose a port offering a public IP. Create an account in ngrok and follow the official setup guide . Once the ngrok agent has been installed and configured, we can expose the port Ollama is using: Note: The header --host-header=\"localhost:11434\" guarantees that the \"Host\" header in the requests matches \"localhost:11434\" Executing this command will return a public link that will work as long as the ngrok and the Ollama server run locally. In \"Forwarding\" we can see that ngrok generated a URL. Save it for later. Let's try making an HTTP request to the endpoint again, now using the ngrok-generated URL: The response should be similar to the previous one. Creating mappings ELSER endpoint For this example, we'll create an inference endpoint using the Elasticsearch inference API . Additionally, we'll use ELSER to generate the embeddings. For this example, let's imagine that you have a pharmacy that sells two types of drugs: Drugs that require a prescription. Drugs that DO NOT require a prescription. This information would be included in the description field of each drug. The LLM must interpret this field, so this is the data mappings we'll use: The field text_description will store the plain text of the descriptions while semantic_field , which is a semantic_text field type, will store the embeddings generated by ELSER. The property copy_to will copy the content from the fields name and text_description into the semantic field so that the embeddings for those fields are generated. Indexing data Now, let's index the data using the _bulk API . Response: Asking questions using Playground Playground is a Kibana tool that allows you to quickly create a RAG system using Elasticsearch indexes and a LLM provider. You can read this article to learn more about it. Connecting the local LLM to Playground We first need to create a connector that uses the public URL we've just created. In Kibana, go to Search>Playground and then click on \"Connect to an LLM\". This action will reveal a menu on the left side of the Kibana interface. There, click on \"OpenAI\". We can now start configuring the OpenAI connector. Go to \"Connector settings\" and for the OpenAI provider, select \"Other (OpenAI Compatible Service)\": Now, let's configure the other fields. For this example, we'll name our model \"medicines-llm\". In the URL field, use the one generated by ngrok ( /v1/chat/completions ). On the \"Default model\" field, select \"llama3.2\". We won't use an API Key so just put any random text to proceed: Click on \"Save\" and add the index medicines by clicking on \"Add data sources\": Great! We now have access to Playground using the LLM we're running locally as RAG engine. Before testing it, let's add more specific instructions to the agent and up the number of documents sent to the model to 10, so that the answer has the most possible documents available. The context field will be semantic_field , which includes the name and description of the drugs, thanks to the copy_to property. Now let's ask the question: Can I buy Clonazepam without a prescription? and see what happens: As expected, we got the correct answer. Next steps The next step is to create your own application! Playground provides a code script in Python that you can run on your machine and customize it to meet your needs. For example, by putting it behind a FastAPI server to create a QA medicines chatbot consumed by your UI. 
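The next section mentions adapting the generated code to point at Ollama; as a minimal, hedged sketch (the ngrok URL below is a hypothetical placeholder), calling the locally hosted llama3.2 model through the OpenAI-compatible endpoint could look like this:

```python
from openai import OpenAI

# Ollama exposes an OpenAI-compatible API, so the standard OpenAI client works
# once base_url points at the ngrok-forwarded Ollama server.
client = OpenAI(
    base_url="https://your-ngrok-subdomain.ngrok-free.app/v1",  # hypothetical placeholder URL
    api_key="ollama",  # Ollama ignores the key, but the client requires a non-empty value
)

response = client.chat.completions.create(
    model="llama3.2",
    messages=[{"role": "user", "content": "Can I buy Clonazepam without a prescription?"}],
)
print(response.choices[0].message.content)
```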
You can find this code by clicking the View code button in the top right section of Playground: And you use the Endpoints & API keys to generate the ES_API_KEY environment variable required in the code. For this particular example the code is the following: To make it work with Ollama, you have to change the OpenAI client to connect to the Ollama server instead of the OpenAI server. You can find the full list of OpenAI examples and compatible endpoints here . And also change the model to llama3.2 when calling the completion method: Let’s add our question: Can I buy Clonazepam without a prescription? To the Elasticsearch query: And also to the completion call with a couple of prints, so we can confirm we are sending the Elasticsearch results as part of the question context: Now let’s run the command pip install -qU elasticsearch openai python main.py You should see something like this: Conclusion In this article, we can see the power and versatility of tools like Ollama when we use them together with the Elasticsearch inference API and Playground. After some simple steps, we had a working RAG application with a chat that used a LLM running in our own infrastructure at zero cost. This also allows us to have more control over resources and sensitive information, besides giving us access to a variety of models for different tasks. Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett Jump to Prerequisites Steps Setting up Ollama LLM server Download and run Ollama Expose endpoint to the internet using ngrok Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"Using Ollama with the Inference API - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/ollama-with-inference-api","meta_description":"Learn how to integrate Ollama with Elasticsearch using the Ollama API, which is compatible with the OpenAI API, making the integration easier."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Multilingual vector search with the E5 embedding model Here's how multilingual vector search works and how to use Elasticsearch with the multilingual E5 embedding model, including examples. Vector Database Python JD By: Josh Devins On September 12, 2023 Try out vector search for yourself using this self-paced hands-on learning for Search AI. You can start a free cloud trial or try Elastic on your local machine now. Vector search has taken the search and information retrieval world by storm in recent years. It has the ability to match the semantics of queries with documents, to incorporate context and meaning of text, and provides users with natural language querying abilities like never before. Vector search is a great source of context for prompting large language models (LLMs), and it's powering more and more modern search experiences in the age of Generative AI. This blog goes over multilingual vector search, explains how it works, and how to use Elasticsearch with E5 embeddings models. It also showcases examples of multilingual search across languages. Why multilingual embeddings? When researchers first started working with and training embedding models for vector search, they used the most widely available datasets they could find. These datasets however tended to all be in the English language. Queries were in English, Wikipedia articles indexed were in English, and quickly the non-English speaking world took notice. Language-specific language models slowly started to pop up for languages like German, French, Chinese, Japanese and so on. However those models only worked within that language. With the power of embeddings, we also have the ability to train models which embed multiple languages into the same \"embedding space\", using a single model. You can think of an embedding space as a language agnostic, mathematical representation (dense vector) of the concepts that sentences (queries or passages) represent where embeddings close to each other in the embedding space have similar semantic meaning. Since we can embed text, images and audio into an embedding space, why not embed multiple languages into the same embedding space? This is the idea behind multilingual embedding models. With aligned training datasets — datasets containing similar sentences in different languages — it's possible to make the model learn not the translation of words between languages, but the relationships and meaning underlying each sentence irrespective of language. This is a true cross-lingual model, capable of working with pairs of text in any of the languages it was trained on. Now let's see how to use these aligned multilingual models. Let's consider a few examples For this exercise, we'll map sentences from English and German into the same part of the embedding space, when they have the same underlying meaning. Let's say I have the following sentences that I'd like to index and search over. For the non-German speakers out there, we've provided the direct English translation of the German sentences. 
😉 id=doc1, language=en, passage=\"I sat on the bank of the river today.\" id=doc2, language=de, passage=\"Ich bin heute zum Flussufer gegangen.\" (English: \"I walked to the riverside today.\") id=doc3, language=en, passage=\"I walked to the bank today to deposit money.\" id=doc4, language=de, passage=\"Ich saß heute bei der Bank und wartete auf mein Geld.\" (English: \"I sat at the bank today waiting for my money.\") In the example queries that follow, we show how multilingual embeddings can overcome some of the challenges that traditional lexical retrieval faces for multilingual search . Typically we talk about vector search overcoming the limitations of lexical search's semantic mismatch and vocabulary mismatch. Semantic mismatch is the case where the tokens (words) we use in the query have the same form as in the indexed documents, but different meanings. For example the \"bank\" of a river doesn't have the same meaning as a \"bank\" that holds money. With vocabulary mismatch, we're faced with the tokens being different, but the underlying concept or meaning is similar to meaning represented in the document. We may search for \"ATM\" which doesn't appear in any document, but is closely related to a \"bank that holds money\". In addition to these two improvements over lexical search, multilingual (cross-lingual) embeddings add language independence, allowing query and passage to be in different languages. For a deeper look into how vector search works and how it fits with traditional lexical search, have a look at this webinar: How vector databases power AI search . Let's try a few search examples now and see how this works. Example 1 Query : \"riverside\" (German: \"Flussufer\") Results : id=doc1, language=en, passage=\"I sat on the bank of the river today.\" id=doc2, language=de, passage=\"Ich bin heute zum Flussufer gegangen.\" (English: \"I walked to the riverside today.\") In this example, the translation of \"riverside\" is \"Flussufer\" in German. The semantic meaning however matches the English phrase \"bank of the river\", as well as the German keyword \"Flussufer\", so we match on both documents. Example 2 Query : \"Geldautomat\" (English: \"ATM\") Results : id=doc4, language=de, passage=\"Ich saß heute bei der Bank und wartete auf mein Geld.\" (English: \"I sat at the bank today waiting for my money.\") id=doc3, language=en, passage=\"I walked to the bank today to deposit money.\" In this example, the translation of \"Geldautomat\" is \"ATM\" in English. Neither \"Geldautomat\" nor \"ATM\" appear as keywords in any of the documents, however the semantic meaning is close to both the English phrase \"bank … money\", and the German phrase \"Bank … Geld\". In this case, the context matters and the query is referring to the kind of bank that holds money, and not the bank of a river, so we match only on the documents that refer to that kind of \"bank\", but we do so across languages based on the semantic meaning and not on keywords. Example 3a Query : \"movement\" Results : id=doc3, language=en, passage=\"I walked to the bank today to deposit money.\" id=doc2, language=de, passage=\"Ich bin heute zum Flussufer gegangen.\" (English: \"I walked to the riverside today.\") In this example, we're searching for the kind of motion represented in the text. We're interested in motion or walking and not sitting or being stationary in one place. As such, the closest documents are represented by the German word \"gegangen\" (English: \"have gone to\") and the English word \"walked\". 
Example 3b Query : \"stillness\" Results : id=doc4, language=de, passage=\"Ich saß heute bei der Bank und wartete auf mein Geld.\" (English: \"I sat at the bank today waiting for my money.\") id=doc1, language=en, passage=\"I sat on the bank of the river today.\" If we invert the query from Example 3a and look for \"stillness\" or lack of movement, we get the \"opposite\" results. Multilingual E5 embedding model In December 2022, Microsoft released a new general-purpose embedding model called E5, or E mb E ddings from bidir E ctional E ncoder r E presentations. (I know, naming things is hard.) This model was trained on a special, English-only curated dataset called CCPairs, and introduced a few new methods to their training process. The model quickly shot to the top of numerous benchmarks, and after the success of that model, they set their sights on non-English. In addition to embedding models for English, Microsoft later trained a variant of their E5 models on multilingual text, using a variety of multilingual datasets, but with the same overall process as their English counterparts. This showed that their training process was a large part of what helped produce such good English language embeddings, and this success was transferred to multilingual embeddings. In some English-only benchmarks, the multilingual embeddings are even better than other embeddings trained only on English datasets! For those interested, check out the MTEB retrieval benchmark for more details. As has become common practice for embedding models, the E5 family comes in three sizes, allowing users to make tradeoff decisions between effectiveness and efficiency for their particular use-case and budgets. Effectiveness of embeddings refers to how good they are at a task, as measured on a specific dataset. For semantic search this is a retrieval task and is measured using a search relevance metric like nDCG@10 or MRR@10. Efficiency of embeddings and embedding models is influenced by: How many dimensions the vectors are that the model produces, which impacts the storage needs (on disk and in memory) and how fast are they to search for. How large the embedding model is (number of parameters), which impacts the inference latency or the time it takes to create the embeddings at both ingest and search time. Below we can see the three multilingual E5 models and their characteristics, with effectiveness measured on a multilingual benchmark Mr. TyDi (see, naming is hard). For a baseline and as a comparison, we've included the BM25 (lexical search) effectiveness scores on Mr. TyDi, as reported by the E5 authors . Effectiveness: Avg. MRR@10 Efficiency: dimensions Efficiency: parameters BM25 33.3 n/a n/a multilingual-e5-small 64.4 384 118M multilingual-e5-base 65.9 768 278M multilingual-e5-large 70.5 1024 560M Elasticsearch for multilingual vector search with E5 Elasticsearch enables you to generate, store, and search vector embeddings. We've seen an introduction to multilingual embeddings in general, and we know a little bit about E5. Let's take a look at how to actually wire all this together into a search experience with Elasticsearch. This blog has an accompanying notebook which shows all the code in detail with the examples above, using Elasticsearch end-to-end. 
Here's a quick outline of what's required: Create an Elastic Cloud deployment with one ML node of size 8GB or larger (or use any Elasticsearch cluster with ML nodes) Setup the multilingual-e5-base embedding model in Elasticsearch to embed text at ingest via an inference processor Create an index and ingest documents into an ANN index for approximate kNN search Query the ANN index using a query_vector_builder Let's have a look now at a few code snippets from the notebook for each step. Setup With an Elastic Cloud cluster created or another Elasticsearch cluster ready, we can upload the embedding model using the eland library. Now that the model has been uploaded to the cluster and is ready for inference, we can create the ingest pipeline which contains an inference processor to perform the embedding of the text field of our choosing. When using Enterprise Search features such as the web crawler , you can manage ingest pipelines through the Kibana UI as well. Indexing For the simple examples above, we use just a very simple index mapping, but hopefully it gives you an idea of what your mapping might look like too. With an index created from the above mapping, we're ready to ingest documents. You can use whatever ingest method you'd like, as long as the ingest pipeline that we created at the beginning is referenced (or set as default for your index). Note that as with other embedding models, E5 does have a token limit (512 tokens or about 400 words) so longer text will need to be chunked into individual passages — for example with LangChain or another tool — before being ingested. Here's what our example documents look like. Search The documents have been indexed and embeddings created, so we're ready to search! And that's it! With the above steps, and the complete code from the notebook , you can build yourself a multilingual semantic search experience completely within Elasticsearch. Note of caution: E5 models were trained with instructions prefixed to text before embedding it. This means that when you want to embed text for semantic search, you must prefix the query with \"query: \" and indexed passages with \"passage: \". For further details and other use-cases requiring different prefixing, please refer to the FAQ in the multilingual-e5-base model card . Conclusion In this blog and the accompanying notebook , we've shown how multilingual vector search works, and how to use Elasticsearch with E5 embeddings models. We've motivated this by showing examples of multilingual search across languages, but in fact the same E5 embedding model can be used within a single language as well. For example if you have just a German corpus of text, you can freely use the same model and the same approach to search that corpus with just German queries. It's all the same model, and the same embedding space in the end! Try out the notebook , and be sure to spin up a Cloud cluster of your own to try multilingual semantic search with E5 on the language and dataset of your choice. If you have any questions or want to discuss multilingual semantic search, join us and the entire Elastic community in our discussion forum . Report an issue Related content Vector Database May 13, 2025 Mapping embeddings to Elasticsearch field types: semantic_text, dense_vector, sparse_vector Discussing how and when to use semantic_text, dense_vector, or sparse_vector, and how they relate to embedding generation. 
AL By: Andre Luiz Vector Database How To April 23, 2025 How to implement Better Binary Quantization (BBQ) into your use case and why you should Exploring why you would implement Better Binary Quantization (BBQ) in your use case and how to do it. SF JG By: Sachin Frayne and Jessica Garson Integrations Python +1 April 21, 2025 Using LlamaIndex Workflows with Elasticsearch Learn how to create an Elasticsearch-based step for your LlamaIndex workflow. JR By: Jeffrey Rengifo Integrations Python +1 April 24, 2025 Using AutoGen with Elasticsearch Learn to create an Elasticsearch tool for your agents with AutoGen. JR By: Jeffrey Rengifo Vector Database April 15, 2025 Elasticsearch BBQ vs. OpenSearch FAISS: Vector search performance comparison A performance comparison between Elasticsearch BBQ and OpenSearch FAISS. US By: Ugo Sangiorgi Jump to Why multilingual embeddings? Let's consider a few examples Example 1 Example 2 Example 3a Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Multilingual vector search with the E5 embedding model - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/multilingual-vector-search-e5-embedding-model","meta_description":"Here's how multilingual vector search works and how to use Elasticsearch with the multilingual E5 embedding model, including examples."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to increase primary shard count in Elasticsearch Exploring methods for increasing primary shard count in Elasticsearch. How To KB By: Kofi Bartlett On April 17, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. It is not possible to increase the primary shard number of an existing index, meaning an index must be recreated if you want to increase the primary shard count. There are 2 methods that are generally used in these situations: the _reindex API and the _split API. The _split API is often a faster method than the _reindex API. Indexing must be stopped before both operations, otherwise, the source_index and target_index document counts will differ. Method 1 – using the split API The split API is used to create a new index with the desired number of primary shards by copying the settings and mapping an existing index. The desired number of primary shards can be set during creation. The following settings should be checked before implementing the split API: The source index must be read-only. This means that the indexing process needs to be stopped. 
The number of primary shards in the target index must be a multiple of the number of primary shards in the source index. For example, if the source index has 5 primary shards, the target index primary shards can be set to 10,15,20, and so on. Note: If only the primary shard number needs to be changed, the split API is preferred as it is much faster than the Reindex API. Implementing the split API Create a test index: The source index must be read-only in order to be split: Settings and mappings will be copied automatically from the source index: You can check the progress with: Since settings and mappings are copied from the source indices, the target index is read-only. Let’s enable the write operation for the target index: Check the source and target index docs.count before deleting the original index: Index name and alias name can’t be the same. You need to delete the source index and add the source index name as an alias to the target index: After adding the test_split_source alias to the test_split_target index, you should test it with: Method 2 – using the reindex API By creating a new index with the Reindex API, any number of primary shard counts can be given. After creating a new index with the intended number of primary shards, all data in the source index can be re-indexed to this new index. In addition to the split API features, the data can be manipulated using the ingest_pipeline in the reindex AP. With the ingest pipeline, only the specified fields that fit the filter will be indexed into the target index using the query. The data content can be changed using a painless script, and multiple indices can be merged into a single index. Implementing the reindex API Create a test reindex: Copy the settings and mappings from the source index: Create a target index with settings, mappings, and the desired shard count: *Note: setting number_of_replicas: 0 and refresh_interval: -1 will increase reindexing speed. Start the reindex process. Setting requests_per_second=-1 and slices=auto will tune the reindex speed. You will see the task_id when you run the reindex API. Copy that and check with _tasks API: Update the settings after reindexing has finished: Check the source and target index docs.count before deleting the original index, it should be the same: The index name and alias name can’t be the same. Delete the source index and add the source index name as an alias to the target index: After adding the test_split_source alias to the test_split_target index, test it using: Summary If you want to increase the primary shard count of an existing index, you need to recreate the settings and mappings to a new index. There are 2 primary methods for doing so: the reindex API and the split API. Active indexing must be stopped before using either method. Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. 
JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to Method 1 – using the split API Implementing the split API Method 2 – using the reindex API Implementing the reindex API Summary Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to increase primary shard count in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-increase-primary-shard-count","meta_description":"Exploring methods for increasing primary shard count in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Using Amazon Nova models in Elasticsearch Learn how to use models from the Amazon Nova family in Elasticsearch. How To AL By: Andre Luiz On April 2, 2025 Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! Elasticsearch is packed with new features to help you build the best search solutions for your use case. Dive into our sample notebooks to learn more, start a free cloud trial , or try Elastic on your local machine now. In this article, we will discuss Amazon's AI model family, Amazon Nova, and learn how to use it alongside Elasticsearch. About Amazon Nova Amazon Nova is a family of Amazon artificial intelligence models, available on Amazon Bedrock and designed to offer high performance and cost efficiency. These models operate with text, image and video inputs, generate textual outputs, and are optimized for different accuracy, speed and cost needs. Amazon Nova main models Amazon Nova Micro: Focused exclusively on text, this is a fast and cost-effective model, ideal for translation, reasoning, code completion and solving mathematical problems. Its generation exceeds 200 tokens per second, making it ideal for applications that require instant responses. Amazon Nova Lite: A low-cost multimodal model capable of quickly processing images, videos and texts. It stands out for its speed and accuracy, being indicated for interactive and high-volume applications where cost is a relevant factor. Amazon Nova Pro: The most advanced option, combining high accuracy, speed and cost efficiency. 
Ideal for complex tasks such as video summarization, questions and answers, software development and AI agents. Expert reviews attest to its excellence in textual and visual comprehension, as well as its ability to follow instructions and execute automated workflows. Amazon Nova models are suitable for a variety of applications, from content creation and data analysis to software development and AI-powered process automation. Below, we’ll demonstrate how to use Amazon Nova models in conjunction with Elasticsearch for automated product review analysis. What we will do: Create an endpoint via Inference API, integrating Amazon Bedrock with Elasticsearch. Create a pipeline using the Inference Processor, which will make calls to the Inference API endpoint. Index product reviews and automatically generate an analysis of the reviews using the pipeline. Analyze the results of the integration. Creating an Endpoint in the Inference API First, we configure the Inference API to integrate Amazon Bedrock with Elasticsearch. We define Amazon Nova Lite, id amazon.nova-lite-v1:0 , as the model to use since it offers a balance between speed, accuracy, and cost. Note: You will need valid credentials to use Amazon Bedrock. You can see the documentation for obtaining access keys here : Creating the review analysis pipeline Now, we create a processing pipeline that will use the Inference Processor to execute a review analysis prompt. This prompt will send the review data to Amazon Nova Lite, which will perform: Sentiment classification (positive, negative, or neutral). Review summarization. Keywords generation. Authenticity measurement (authentic | suspicious | generic). Indexing reviews Now, we index product reviews using the Bulk API. The pipeline created earlier will be automatically applied, adding the analysis generated by the Nova model to the indexed documents. Querying and analyzing the results Finally, we run a query to see how the Amazon Nova Lite model analyzes and classifies the reviews. By running GET products/_search, we get the documents already enriched with the fields generated from the review content. The model identifies the predominant sentiment (positive, neutral, or negative), generates concise summaries, extracts relevant keywords, and estimates the authenticity of each review. These fields help understand the customer’s opinion without having to read the full text. To interpret the results, we look at: Sentiment, which indicates the consumer’s overall perception of the product. The summary, which highlights the main points mentioned. Keywords, which can be used to group similar reviews or identify feedback patterns. Authenticity, which signals whether the review seems trustworthy. This is useful for curation or moderation. Final Thoughts The integration between Amazon Nova Lite and Elasticsearch demonstrated how language models can transform raw reviews into structured and valuable information. By processing the reviews through a pipeline, we were able to extract sentiment, authenticity, summaries, and keywords automatically and consistently. The results show that the model can understand the context of the reviews, classify user opinions, and highlight the most relevant points of each experience. This creates a much richer dataset that can be leveraged to improve search capabilities. 
Report an issue Related content Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo How To May 9, 2025 Deleting a field from a document in Elasticsearch Exploring methods for deleting a field from a document in Elasticsearch. KB By: Kofi Bartlett How To May 16, 2025 How to optimize Elasticsearch disk space and usage Explaining how to prevent and handle cases when disk is too full (over utilization) and when the disk capacity is underutilized. KB By: Kofi Bartlett Jump to About Amazon Nova Amazon Nova main models Creating an Endpoint in the Inference API Creating the review analysis pipeline Indexing reviews Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Using Amazon Nova models in Elasticsearch - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/amazon-nova-models-elasticsearch","meta_description":"Learn how to use models from the Amazon Nova family in Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Configurable chunking settings for inference API endpoints Elasticsearch open inference API extends support for configurable chunking for document ingestion with semantic text fields. Integrations How To R By: Daniel Rubinstein On March 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. The Elasticsearch Inference API allows users to utilize machine learning models across a variety of providers to perform inference operations. One common use case of this API is to support semantic text fields used for semantic search within an index. As the size of a document’s data increases, creating an embedding on the whole of the data will yield less accurate results. Some inference models also have limitations on the size of inputs that can be processed. 
As such, the inference API utilizes a process called chunking to break down large documents being ingested into an index into smaller and more manageable subsections of the original data called chunks. The inference operations are then run against each of the individual chunks and the inference results for each chunk are stored within the index. In this blog, we’ll go over the chunking strategies, explain how Elasticsearch chunks text and how to configure chunking settings for an inference endpoint. What can I configure with chunking settings? From 8.16, users can now select from 2 strategies for generating chunks, each with their own configurable properties. Word based chunking strategy Configurable values provided by the user: (required) max_chunk_size : The maximum number of words in a chunk. (required) overlap : The number of overlapping words for chunks. Note: This can not be defined as more than half of the max_chunk_size . Word based chunking splits input data into chunks with word counts up to the provided max_chunk_size . This strategy will always fill a chunk to the maximum size before building the next chunk unless it reaches the end of the input data. Each chunk after the first will have a number of words overlapping from the previous chunk based on the provided overlap value. The purpose of this overlap is to increase inference accuracy by preventing useful context for inference results from being split across chunks. Sentence based chunking strategy Configurable values provided by the user: (required) max_chunk_size : The maximum number of words in a chunk. (required) sentence_overlap : The number of overlapping sentences for chunks. Note: This can only be defined as 0 or 1. Sentence based chunking will split input data into chunks containing full sentences. Chunks will contain only complete sentences, except when a sentence is longer than max_chunk_size , in which case it will be split across chunks. Each chunk after the first will have a number of sentences from the previous chunk overlapping based on the provided sentence_overlap value. Note: If no chunking settings are provided when creating an inference endpoint after 8.16, the default chunking settings will use a sentence strategy with max_chunk_size of 250 and a sentence_overlap of 1. For inference endpoints created before 8.16, the default chunking settings will use a word strategy with a max_chunk_size of 250 and an overlap of 1. How do I select a chunking strategy? There is no one-size-fits-all solution for the best chunking strategy. The best chunking strategy will vary based on the documents being ingested, the underlying model being used and any compute constraints you have. We recommend taking a subset of your corpus and some example queries and seeing how changing the strategy, chunk size and overlap affects your use case. For example, you might parameter sweep over different chunk overlaps and lengths and measure the time to ingest, the impact on search latency and the relevance of the top results for each query. The following are a few guidelines to help when starting out with configurable chunking: Picking a chunking strategy Generally, a sentence based chunking strategy works well to minimize context loss. However, it can often result in more chunks being generated as the process prioritizes keeping sentences intact over maximally filling each chunk. As such, an optimized word based chunking strategy may produce fewer chunks, which are more efficient to ingest and search. 
Picking a chunk size The chunk size should be selected to minimize useful contextual information from being split across chunks while retaining chunk topic coherence. Typically, chunks as close as possible to the maximum sequence length the model supports work better. However, long chunks are more likely to contain a mixture of topics that are less well represented. Picking a chunk overlap As the overlap between chunks increases, the number of chunks generated does as well. Similar to chunk size, you'll want to select an overlap that helps to minimize the chance of splitting important context across chunks subject to your compute constraints. Typically, more overlap, up to half the typical chunk length, results in better retrieval quality but comes at an increased cost. How does Elasticsearch chunk text? Elasticsearch uses the ICU4J library to detect word and sentence boundaries . Word boundaries are identified by following a series of rules, not just the presence of a whitespace character. For written languages that do not use whitespace, such as Chinese or Japanese, dictionary lookups are used to detect word boundaries. Sentence boundaries are similarly identified by following a series of rules, not just the presence of a period character. This ensures that sentence boundaries are accurately identified across languages in which sentence structures and sentence breaking characteristics may vary. Finally, we note that sometimes chunks benefit from long range context, which can't be retained by any simple chunking strategy. In these cases, if you are prepared to pay the cost, chunks can be enriched with additional generated context. For more details, see this discussion. How do I configure chunking settings for an inference endpoint? Pre-requisites Before configuring chunking settings, ensure that you have met the following requirements: You have a valid enterprise license. If you are configuring chunking settings for an inference endpoint connecting to any third-party integration, you have set up any necessary permissions to access these services (e.g. , created accounts, retrieved API keys, etc.). For the purposes of this guide, we will be configuring chunking settings for an inference endpoint using Elastic’s ELSER model , for which the only requirement is having a valid enterprise license. To find the information required to create an inference endpoint for a third-party integration, see the create inference endpoint API documentation . Step 1: Configure chunking settings during inference endpoint creation Step 2: Link the inference endpoint to a semantic text field in an index Step 3: Ingest a document into the index Ingest the document into the index created above by calling the index document API: The generated chunks and their corresponding inference results can be seen stored in the document in the index under the key chunks within the _inference_fields metafield. To see the stored chunks, you can search for all documents in the index with the search API : The chunks can be seen in the response. Before 8.18, the chunks were stored as full-chunk text values. From 8.18, the chunks are stored as a list of character offset values: Get started with configurable chunking today! For more information on utilizing this feature, view the documentation on configuring chunking . Try out this notebook to get started with configurable chunking settings: Configuring Chunking Settings For Inference Endpoints . 
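As a speculative sketch only (the endpoint name and ELSER service settings below are assumptions, not taken from this post), creating an inference endpoint with explicit chunking settings could look roughly like this using Python's requests library:

```python
import requests

ES_URL = "https://your-deployment.es.example.com:443"  # hypothetical cluster URL
API_KEY = "your-api-key"                               # hypothetical credentials

# Create an ELSER sparse_embedding endpoint with sentence-based chunking,
# using the strategy, max_chunk_size and sentence_overlap options described above.
resp = requests.put(
    f"{ES_URL}/_inference/sparse_embedding/my-elser-endpoint",
    headers={"Authorization": f"ApiKey {API_KEY}", "Content-Type": "application/json"},
    json={
        "service": "elser",
        "service_settings": {"num_allocations": 1, "num_threads": 1},
        "chunking_settings": {
            "strategy": "sentence",
            "max_chunk_size": 250,
            "sentence_overlap": 1,
        },
    },
)
print(resp.status_code, resp.json())
```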
Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to What can I configure with chunking settings? Word based chunking strategy Sentence based chunking strategy How do I select a chunking strategy? Picking a chunking strategy Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Configurable chunking settings for inference API endpoints - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/elasticsearch-chunking-inference-api-endpoints","meta_description":"Explore Elasticsearch chunking strategies, learn how Elasticsearch chunks text, and how to configure chunking settings for an inference endpoint."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog How to ingest data to Elasticsearch through Airbyte Using Airbyte to ingest data into Elasticsearch. Integrations How To AL By: Andre Luiz On March 14, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Airbyte is a data integration tool that allows you to move information from various sources to different destinations in an automated and scalable way. It enables you to extract data from APIs, databases, and other systems and load it into platforms such as Elasticsearch, which offers advanced search and efficient analysis. 
In this article, we will explain how to configure Airbyte to ingest data into Elasticsearch, covering key concepts, prerequisites, and step-by-step integration. Airbyte fundamental concepts Airbyte has several essential concepts for its use. Below, we highlight the main ones: Sources: Defines the origin of the data that will be extracted. Destinations: Defines where the data will be sent and stored. Connections: Configures the relationship between the source and the destination, including the synchronization frequency. Airbyte integration with Elasticsearch In this demonstration, we will perform an integration where data stored in an S3 bucket will be migrated to an Elasticsearch index. We will show how to configure the source (S3) and destination (Elasticsearch) in Airbyte. Prerequisites To follow this demonstration, the following prerequisites must be met: Create a bucket in AWS, where the JSON files containing the data will be stored. Install Airbyte locally using Docker. Create an Elasticsearch cluster in Elastic Cloud to store the ingested data. Below, we will detail each of these steps. Installing Airbyte Airbyte can be run locally using Docker or in the cloud, where there are costs associated with usage. For this demonstration, we will use the local version with Docker. The installation may take a few minutes. After following the installation instructions, Airbyte will be available at: http://localhost:8000. After logging in, we can start configuring the integration. Creating the bucket In this step, you’ll need an AWS account to create an S3 bucket. Additionally, it is essential to set the correct permissions by creating a policy and an IAM user to allow access to the bucket. In the bucket, we will upload JSON files containing different log records, which will later be migrated to Elasticsearch. The file logs have this content: Below are the files loaded into the bucket: Elastic Cloud configuration To make the demonstration easier, we will use Elastic Cloud. If you do not have an account yet, you can create a free trial account here: Elastic Cloud Registration . After configuring the deployment in Elastic Cloud, you will need to obtain: The URL of the Elasticsearch server. A user to access Elasticsearch. To obtain the URL, go to Deployments > My deployment, in application, find Elasticsearch and click on ‘Copy endpoint.‘ To create the user, follow the steps below: Access Kibana > Stack Management > Users. Create a new user with the superuser role. Fill in the fields to create the user. Now that we have everything set up, we can start configuring the connectors in Airbyte. Configuring source connector In this step, we will create the source connector for S3. To do this, we will access the Airbyte interface and select the Source option in the menu. Then, we will search for the S3 connector. Below, we detail the steps required to configure the connector: Access Airbyte and go to the Sources menu. Search for and select the S3 connector. Configure the following parameters: Source Name: Define a name for the data source. Delivery Method: Select Replicate Records (recommended for structured data). Data Format: Choose JSON Format. Stream Name: Define the name of the index in Elasticsearch. Bucket Name: Enter the name of the bucket in AWS. AWS Access Key and AWS Secret Key: Enter the access credentials. Click on Set up source and wait for validation. Configuration destination connector In this step, we will configure the destination connector, which will be Elasticsearch. 
To do this, we will access the menu and select the Destination option. Then, we will search for Elasticsearch and click on the returned result. Now, we will proceed with the configuration of this connection: Access Airbyte and go to the Destinations menu. Search and select the Elasticsearch connector. Configure the following parameters: Authentication Method: Choose Username/Password. Username and Password: Use the credentials created in Kibana. Server Endpoint: Paste the URL copied from Elastic Cloud. Click on Set up destination and wait for validation. Creating the Source and Destination connection Once the Source and Destination have been created, the connection between them will be created, thus completing the creation of the integration. Below are the instructions for creating the connection: 1. In the menu, go to Connections and click on Create First Connection. 2. On the next screen, you will be able to select an existing Source or create a new one. Since we already have a Source created, we will select Source S3. 3. The next step will be to select the destination. Since we have already created the Elasticsearch connector, it will be selected to finalize the configuration. In the next step, it will be necessary to define the Sync Mode and which schema will be used. Since only the log schema was created, it will be the only option available for selection. 4. We will move on to the Configure Connection step. Here, we can define the name of the connection and the frequency of the integration execution. The frequency can be configured in three ways: Cron : Runs the syncs based on the user-defined cron expression (e.g 0 0 15 * * ?, At 15:00 every day); Scheduled : Runs the syncs at the specified time interval (e.g. every 24 hours, every 2 hours); Manual : Run the syncs manually. For this demonstration, we will select the Manual option. Finally, by clicking on Set up Connection , the connection between the Source and the Destination will be established. Synchronizing Data from S3 to Elasticsearch When you return to the Connections screen, you can see the connection that was created. To execute the process, simply click on Sync. From that moment on, the migration of data from S3 to Elasticsearch will begin. If everything goes smoothly, you will get the synced status. Visualizing data in Kibana Now, we will go to Kibana to analyze the data and check if it was indexed correctly. In the Kibana Discovery section, we will create a Data View called logs. With this, we will be able to explore the data existing only in the logs index, which was created after the synchronization. Now, we can visualize the indexed data and perform analyses on it. This way, we validated the entire migration flow using Airbyte, where we loaded the data present in the bucket and indexed it in Elasticsearch. Conclusion Airbyte proved to be an efficient tool for data integration, allowing us to connect several sources and destinations in an automated way. In this tutorial, we demonstrated how to ingest data from an S3 bucket to an Elasticsearch index, highlighting the main steps of the process. This approach facilitates the ingestion of large volumes of data and allows analyses within Elasticsearch, such as complex searches, aggregations, and data visualizations. 
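As a quick sanity check after a sync, you can also query the logs index directly instead of going through Kibana. The snippet below is a sketch in Python with the requests library; the endpoint, credentials and the level field belong to a hypothetical log schema, since the article's sample log files are not reproduced here.

```python
import requests

ES_URL = "https://<your-deployment>.es.<region>.cloud.es.io"  # placeholder Elastic Cloud endpoint
AUTH = ("airbyte_user", "<password>")                          # the user created in Kibana

# How many documents did the sync index into the `logs` stream/index?
count = requests.get(f"{ES_URL}/logs/_count", auth=AUTH).json()
print("indexed documents:", count["count"])

# Aggregate by log level (the `level` field is part of a hypothetical schema).
resp = requests.post(
    f"{ES_URL}/logs/_search",
    auth=AUTH,
    json={"size": 0, "aggs": {"by_level": {"terms": {"field": "level.keyword"}}}},
)
for bucket in resp.json()["aggregations"]["by_level"]["buckets"]:
    print(bucket["key"], bucket["doc_count"])
```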
References Quickstart Airbyte: https://docs.airbyte.com/using-airbyte/getting-started/oss-quickstart#part-1-install-abctl Core concepts: https://docs.airbyte.com/using-airbyte/core-concepts/ Report an issue Related content Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Developer Experience Javascript +1 May 15, 2025 Elasticsearch in JavaScript the proper way, part I Explaining how to create a production-ready Elasticsearch backend in JavaScript. JR By: Jeffrey Rengifo Jump to Airbyte fundamental concepts Airbyte integration with Elasticsearch Prerequisites Installing Airbyte Creating the bucket Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"How to ingest data to Elasticsearch through Airbyte - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/airbyte-elasticsearch-ingest-data","meta_description":"Learn how to use Airbyte to ingest data into Elasticsearch. This blog covers the main concepts, prerequisites, and step-by-step integration."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog Playground: Experiment with RAG using Bedrock Anthropic Models and Elasticsearch in minutes Explore Elastic's Playground and learn how to use it to experiment with RAG applications using Bedrock Anthropic Models & Elasticsearch. Vector Database Generative AI Developer Experience Integrations How To JM AT By: Joe McElroy and Aditya Tripathi On July 10, 2024 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. 
Playground is a low-code interface that enables developers to iterate and build production RAG applications by A/B testing LLMs, tuning prompts, and chunking data. With support for Amazon Bedrock, Playground brings you a wider selection of foundation models from Amazon, Anthropic, and other leading providers. Developers using Amazon Bedrock and Elasticsearch can refine retrieval to ground answers with private or proprietary data, indexed into one or more Elasticsearch indices. A/B test LLMs & retrieval with Playground inference via Amazon Bedrock The playground interface allows you to experiment and A/B test different LLMs from leading model providers such as Amazon and Anthropic. However, picking a model is only part of the problem. Developers must also consider how to retrieve relevant search results to closely match a model’s context window size (i.e. the number of tokens a model can process). Retrieving text passages longer than the context window can lead to truncation and therefore loss of information. Text that is smaller than the context window may not embed correctly, making the representation inaccurate. The next bit of complexity may arise from having to combine retrieval from different data sources. Playground brings together a number of Elasticsearch features into a simple, yet powerful interface for tuning RAG workflows: work with a growing list of model sources, including Amazon Bedrock, for choosing the best LLM for your needs use semantic_text for tuning chunking strategies to fit data and context window size use retrievers to add multi-stage retrieval pipelines (including re-ranking) After tuning the context sent to the models to desired production standards, you can export the code and finalize your application with your Python Elasticsearch language client or LangChain Python integration. Today’s announcement brings access to hosted models on Amazon Bedrock through the Open Inference API integration, and the ability to use the new semantic_text field type. We really hope you enjoy this experience! Playground takes all these composable elements and brings you a true developer toolset for rapid iteration and development to match the pace that developers need. Using Elastic's Playground Within Kibana (the Elasticsearch UI), navigate to “Playground” from the navigation page on the left-hand side. To start, you need to connect to a model provider to bring your LLM of choice. Playground supports chat completion models such as Anthropic through Amazon Bedrock. This blog provides detailed steps and instructions to connect and configure the playground experience. Once you have connected an LLM and chosen an Elasticsearch index, you can start asking questions about information in your index. The LLM will provide answers based on context from your data. Connect an LLM of choice and Elasticsearch indices with private proprietary information Instantly chat with your data and assess responses from models such as Anthropic Claude 3 Haiku in this example Review and customize text and retriever queries to indices that store vector embeddings Using retrievers and hybrid search for the best context Elastic’s hybrid search helps you build the best context windows. Effective context windows are built from various types of vectorized and plain text data that can be spread across multiple indices. Developers can now take advantage of new query retrievers to simplify query creation.
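As a rough illustration of what such a retriever-based hybrid query can look like (a sketch, not the exact query Playground generates), the example below combines a lexical match with a semantic query over a semantic_text field and fuses them with RRF. The index, field names and query text are placeholders.

```python
import requests

ES_URL = "https://localhost:9200"                      # placeholder cluster URL
HEADERS = {"Authorization": "ApiKey <your-api-key>"}   # placeholder credentials

# One unified hybrid query: RRF over a lexical retriever and a semantic retriever.
query = {
    "retriever": {
        "rrf": {
            "retrievers": [
                {"standard": {"query": {"match": {"body": "how should I size my shards?"}}}},
                {"standard": {"query": {"semantic": {"field": "semantic_body", "query": "how should I size my shards?"}}}},
            ]
        }
    },
    "size": 5,
}

resp = requests.post(f"{ES_URL}/my-docs/_search", headers=HEADERS, json=query)
for hit in resp.json()["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))
```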
Three new retrievers are available from version 8.14 and on Elastic Cloud Serverless , and implementing hybrid search normalized with RRF is one unified query away. You can store vectorized data and use a kNN retriever, or add metadata and context to create a hybrid search query. Soon, you can also add semantic reranking to further improve search results. Use Playground to ship conversational search—quickly Building conversational search experiences can involve many approaches, and the choices can be paralyzing, especially given the pace of innovation in new reranking and retrieval techniques, both of which apply to RAG applications. With our playground, those choices are simplified and intuitive, even with the vast array of capabilities available to the developer. Our approach is unique in enabling hybrid search as a predominant pillar of the construction immediately, with an intuitive understanding of the shape of the selected and chunked data and amplified access across multiple external providers of LLMs. Earlier this year, Elastic was awarded the AWS Generative AI Competency , a distinction given to very few AWS partners that provide differentiating generative AI tools. Elastic’s approach to adding Bedrock support for the playground experience is guided by the same principle – to bring new and innovative capabilities to Elastic Cloud on AWS developers. Build, test, fun with Playground Head over to Playground docs to get started today! Explore Search Labs on GitHub for new cookbooks and integrations for providers such as Cohere, Anthropic, Azure OpenAI, and more. Report an issue Related content Developer Experience Inside Elastic May 22, 2025 How we rebuilt autocomplete for ES|QL How we rearchitected an autocomplete engine for ES|QL to support language evolution instead of resisting it. DT By: Drew Tate Integrations May 21, 2025 First to hybrid search: with Elasticsearch and Semantic Kernel Hybrid search capabilities are now available in the .NET Elasticsearch Semantic Kernel connector. Learn how to get started in this blog post. EZ FB By: Enrico Zimuel and Florian Bernd Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Integrations How To May 21, 2025 Get set, build: Red Hat OpenShift AI applications powered by Elasticsearch vector database The Elasticsearch vector database is now supported by the ‘AI Generation with LLM and RAG’ Validated Pattern. This blog walks you through how to get started. TP By: Tom Potoma Developer Experience Javascript +1 May 19, 2025 Elasticsearch in JavaScript the proper way, part II Reviewing production best practices and explaining how to run the Elasticsearch Node.js client in Serverless environments. JR By: Jeffrey Rengifo Jump to A/B test LLMs & retrieval with Playground inference via Amazon Bedrock Using Elastic's Playground Using retrievers and hybrid search for the best context Use Playground to ship conversational search—quickly Build, test, fun with Playground Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. 
Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. All Rights Reserved.","title":"Playground: Experiment with RAG using Bedrock Anthropic Models and Elasticsearch in minutes - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/rag-playground-bedrock-anthropic-models","meta_description":"Explore Elastic's Playground and learn how to use it to experiment with RAG applications using Bedrock Anthropic Models & Elasticsearch."} +{"text":"Tutorials Examples Integrations Blogs Start free trial Blog ChatGPT and Elasticsearch revisited: Part 2 - The UI Abides This blog expands on Part 1 by introducing a fully functional web UI for our RAG-based search system. By the end, you'll have a working interface that ties the retrieval, search, and generation process together—while keeping things easy to tweak and explore. Generative AI JV By: Jeff Vestal On February 21, 2025 Elasticsearch has native integrations to industry leading Gen AI tools and providers. Check out our webinars on going Beyond RAG Basics , or building prod-ready apps Elastic Vector Database . To build the best search solutions for your use case, start a free cloud trial or try Elastic on your local machine now. Don't want to read the whole thing? No problem, go clone the app and get searching! In Part 1, we walked through setting up our search index, using the Open Crawler to crawl Elastic blog content, configured an Inference API to an LLM, and tested our RAG setup with Elastic’s Playground in Kibana. Now, in Part 2, I’ll make good on a promise from the end of that blog by returning with a functional web UI! This guide will walk through: How the app works. The key controls and customization options available to users. Enhancements that improve search, retrieval, and response generation. What the app does At a high level, this app takes a user’s search query or question and follows these steps: Retrieves relevant documents using hybrid search—combining text matching and semantic search. Displays matching document snippets along with links to their full content. Builds a prompt using the retrieved documents and predefined instructions. Generates a response from an LLM, providing grounding documents from Elasticsearch results. Provides controls to modify the generated prompt and response from the LLM. Exploring the UI controls The application provides several controls for refining search and response generation. Here’s a breakdown of the key features: 1. Search box Users enter a query just like a search engine. The query is processed using both lexical and vector search. 2. Generated response panel Displays the LLM-generated response based on retrieved documents. Sources used to generate the response are listed for reference. Includes an Expand/Collapse toggle to adjust panel size. 3. Elasticsearch results panel Shows the top-ranked retrieved documents from Elasticsearch. Includes document titles, highlights, and direct links to the original content. Helps users see which documents influenced the LLM’s response. 4. Source filtering controls Users can select which data sources to use for retrieval after the initial search. 
This allows users to focus on specific domains of content. 5. Answer source controls Users can select whether the LLM may use its training to generate a response outside of the grounding context. This opens up the possibility of expanded answers beyond what is passed to the LLM. 6. Number of sources selector Allows users to adjust how many top results are passed to the LLM. Increasing sources often improves response grounding, but too many can incur unnecessary token costs. 7. Chunk vs. document toggle Determines whether grounding is done with the full document or the relevant chunks. Chunking improves search granularity by breaking long texts into manageable sections. 8. LLM prompt panel Allows users to view the complete prompt passed to the LLM to generate the response. Helps users better understand how an answer was generated. App architecture The application is a Next.js web app that provides a user interface for interacting with a RAG-based search system. End-to-end data flow of a Next.js-powered UI application This architecture eliminates the need for a separate backend service, leveraging Next.js API routes for seamless search and LLM processing integration. Code snippets Let's look at a few sections of code that are the most relevant to this app and may be useful if you want to modify it to work with different datasets. ES query The Elasticsearch query is pretty straightforward. /app/api/search/route.ts By using a hybrid retriever, we can match both keyword-style searches and natural language questions, which are becoming the norm for people to use in their searches these days. You'll notice we are using the highlight functionality in this query. This allows us to easily provide a relevant summary of a matched document in the Elasticsearch Results section. It also allows us to use matching chunks for grounding when we are building the prompt for the LLM and chunk is selected as the grounding option. Extracting the ES results Next, we need to extract the results from Elasticsearch. /app/api/search/route.ts We extract the search results ( hits ) from the Elasticsearch response, ensuring they exist and are in an expected array format. If the results are missing or incorrectly formatted, we log an error and return a 500 status. Parse the hits We have our results, but we need to parse them into a format we can use to display results to the user in the UI and to build our prompt for the LLM. /app/api/search/route.ts There are a couple of key things happening in this code block. We use the highlight value from the top hit of `semantic_body` as a snippet for each ES doc displayed. Depending on the user's selection, we store the prompt context using either the `semantic_body` as the chunk or the full `body` as the body. We extract the `title`. We extract the URL to the blog and ensure it is formatted correctly so users can click on it to visit the blog. Lab sources for clicking The last processing we do is to parse out the aggregation values. /app/api/search/route.ts We do this to have a clickable list of the various \"Labs\" where the results came from. This way, users can select only the sources they want included, and when they hit search again, the Labs they have checked will be used as a filter. Managing state The `SearchInterface` component is the core component in the app. It uses React state hooks to manage all the data and configurations.
/components/SearchInterface.tsx The first three lines here are used to track the search results from Elasticsearch, the generated response from the LLM, and the generated prompt used to instruct the LLM. The last two are used to track user settings from the UI. Selecting the number of sources to include in grounding with the LLM and if the LLM should be grounded with just matching chunks or the full blog article. Handling search queries When the user hits the submit button, handleSearch takes over. /components/SearchInterface.tsx This function sends the queries to /api/search (shown in snippets above) including the user's source selection, grounding settings, and API Credentials. The response is parsed and stored in state, which triggers UI updates. Source Extraction After fetching the results, we create a sources object. /components/SearchInterface.tsx This will later be passed to the LLM as part of the prompt. The LLM is instructed to cite any sources it uses to generate its response. Constructing & sending the LLM prompt The prompt is dynamically created based on the user's settings and includes grounding documents from Elasticsearch. /components/SearchInterface.tsx By default, we instruct the LLM to only use the provided grounding documents in its answer. However, we do provide a setting to allow the LLM to use its own training \"knowledge\" to construct a wider response. When it is allowed to use its own training, it is further instructed to append a warning to the end of the response. We instruct the LLM to cite the provided documents as sources, but only ones that it actually uses. We give some instructions on how the response should be formatted for readability in the UI. Finally, we pass it to /api/llm for processing Streaming the AI response With the documents from Elasticsearch parsed and returned to the front end immediately, we call the LLM to generate a response to the user's question. /components/SearchInterface.tsx There are a lot of lines here but essentially this part of the code calls /api/llm (covered below) and handles the streaming response. We want the LLM response to stream back to the UI as it is generated, so we parse each event as it is returned allowing the UI to dynamically update. We have to decode the stream, do a little cleanup, and update resultText with the newly received text. Calling the LLM We are calling the LLM using Elasticsearch's Inference API. This allows us to centralize the management of our data in Elasticsearch. /app/api/llm/route.ts This bit of code is pretty straightforward. We send the request to the streaming inference API Endpoint we created as part of the setup (see below under Getting Things Running), and then we stream the response back. Handling the streams We need to read the stream as it comes in chunk-by-chunk. /app/api/llm/route.ts Here we decode the streamed LLM response chunk-by-chunk and forwards each decoded part to the frontend in real-time. Getting things running Now that we've reviewed some of the key parts of the code, let's get things actually installed and up and running. Completion Inference API If you don't have a completion Inference API configured in Elasticsearch , you'll need to do that so we can generate a response to the user's question. I used Azure OpenAI with the gpt-4o model, but you should be able to use other services. The key is that it must be a service that the Stream Inference API supports. The individual service_settings depends on which service and model you use. 
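The request body for creating that endpoint did not survive this scrape, so here is a hedged sketch of what a streaming-capable completion endpoint backed by Azure OpenAI might look like, written in Python with the requests library. The resource, deployment, API version and key are placeholders, and the exact service_settings depend on the provider you choose (check the Inference API docs the article points to next).

```python
import requests

ES_URL = "https://localhost:9200"                      # placeholder cluster URL
HEADERS = {"Authorization": "ApiKey <your-api-key>"}   # placeholder credentials

# Sketch: a completion inference endpoint backed by Azure OpenAI (gpt-4o).
# All Azure values below are placeholders for your own resource and deployment.
requests.put(
    f"{ES_URL}/_inference/completion/azure-gpt4o-completion",
    headers=HEADERS,
    json={
        "service": "azureopenai",
        "service_settings": {
            "api_key": "<azure-openai-api-key>",
            "resource_name": "<azure-resource-name>",
            "deployment_id": "<gpt-4o-deployment>",
            "api_version": "2024-06-01",
        },
    },
)
```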
Refer to the Inference API docs for more info. Clone If you have the GitHub CLI installed and configured, you can clone the UI repo into the directory of your choice. You can also download the zip file at then unzip it. Install dependencies Follow the step in the readme file in the repo to install the dependencies. Start the development server We are going to just run in development mode. This runs with live reloading and debugging. There is a production mode that runs an optimized production build for deployment. To start in dev mode run: This should start the server up, and if there are no errors, you should see something similar to: If you have something else running using port 3000 your app will start using the next available port. Just look at the output to see what port it uses. If you want it to run on a specific port, say 4000 you can do so by running: To the UI Once the app is running, you can try out different configurations to see what works best for you. Connection settings The first thing you need to do before using the app is set up your connection credentials. To do that click the gear icon ⚙️ in the upper right. When the box pops up, input your API Key and Elasticsearch URL. Defaults To get started simply ask a question or type in a search query into the search bar. Leave everything else as is. Learning about Hybrid Search The app will query Elasticsearch for the top relevant docs, in the above example about rrf, using rrf! The docs with a short snippet, the blog title, and a clickable URL will be returned to the user. Search All the Things! The top three chunks will be combined with a prompt and sent to the LLM. The generated response will be streamed back. This appears to be a thorough response! Bowl your own game Once the initial search results and generated response are displayed, the user can follow up and make a couple of changes to the settings. Lab sources All the blog sites that are part of the index searched will be listed under Lab Sources . If you were to add additional sites or sources to the index we created in part one with the Open Crawler, they would show up here. You can select only the sources you want to be considered for search results and click search again. The subsequent search will use the checked sources as a filter on the Elasticsearch query. Answer source One of the advantages we talk about with RAG is providing grounding documents to the LLM. This helps cut down on hallucinations (nothing's perfect). However, you may want to allow the LLM to use its training and other \"knowledge\" outside of the grounding docs to generate a response. Unchecking Context Only will allow the LLM this freedom. The LLM should provide a warning at the end of the response, letting you know if it journeyed outside the groundings. As with many things LLM, this isn't guaranteed. Either way, use caution with these responses. Number of sources We default to using three chunks as grounding information for the LLM. Increasing the number of context chunks sometimes gives the LLM more information to generate its response. Sometimes, simply providing the whole blog about a specialized topic works best. It depends on how spread out the topic is. Some topics are covered in many blogs in various ways. So, giving more sources can provide a richer answer. Something more esoteric may only be covered once so extra chunks may not help. Chunk or doc Regarding the number of sources, throwing everything at the LLM is not usually the best way to generate an answer. 
While most blogs are relatively short compared to many other document sources, say health insurance policy documents, throwing a long document at an LLM has several downsides. First, if the relevant information is only included in two paragraphs and you include twenty, you pay for eighteen paragraphs of useless tokens. Second, that useless information slows down the LLM's generated response. Generally, stick with chunks unless you have a good reason to send whole documents, blogs in this case. Here's the bridge-- Hopefully, this walkthrough has helped ensure you're not out of your element when it comes to setting up a UI that provides semantically retrieved documents from Elasticsearch and generated answers from an LLM. There certainly are a lot of features we could add and settings we could tweak to make the experience even better here. But it's a good start. The great thing about providing code to the community is that you're free to take it and customize and tweak it if you are happy. Be on the lookout for part 3, where we will instrument the app using Open Telemetry! Report an issue Related content Integrations Generative AI May 20, 2025 Spring AI and Elasticsearch as your vector database Building a complete AI application using Spring AI and Elasticsearch. JL PK LT By: Josh Long , Philipp Krenn and Laura Trotta Generative AI How To April 25, 2025 ​​Build a powerful RAG workflow using LangGraph and Elasticsearch In this blog, we will show you how to configure and customize the LangGraph Retrieval Agent Template with Elasticsearch to build a powerful RAG workflow for efficient data retrieval and AI-driven responses. NS By: Neha Saini Generative AI How To March 31, 2025 RAG vs. Fine Tuning, a practical approach Comparing RAG and fine-tuning tools with the practical example of an e-commerce chatbot. TM By: Tomás Murúa Generative AI How To March 26, 2025 Parse PDF text and table data with Azure AI Document Intelligence Learn how to parse PDF documents that contain text and table data with Azure AI Document Intelligence. JW By: James Williams Vector Database Search Relevance +1 March 12, 2025 Unifying Elastic vector database and LLM functions for intelligent query Leverage LLM functions for query parsing and Elasticsearch search templates to translate complex user requests into structured, schema-based searches for highly accurate results. SM By: Sunile Manjee Jump to What the app does Exploring the UI controls App architecture Code snippets ES query Show more Share Ready to build state of the art search experiences? Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as your are. Let’s connect and work together to build the magical search experience that will get you the results you want. Try it yourself Subscribe to newsletter Elasticsearch Labs is the one-stop destination for developers to learn how to easily utilize Elasticsearch to build advanced search experiences including generative AI, embedding models, reranking capabilities and more. Let's connect Menu Tutorials Examples Integrations Blogs Search Additional Resources Elasticsearch API Reference Elastic.co Change theme Change theme Sitemap RSS 2025. Elasticsearch B.V. 
All Rights Reserved.","title":"ChatGPT and Elasticsearch revisited: Part 2 - The UI Abides - Elasticsearch Labs","url":"https://www.elastic.co/search-labs/blog/chatgpt-elasticsearch-rag-app-ui","meta_description":"This blog expands on Part 1 by introducing a fully functional web UI for our RAG-based search system. \n\nBy the end, you'll have a working interface that ties the retrieval, search, and generation process together—while keeping things easy to tweak and explore."} +{"text":"Docs Release notes Troubleshoot Reference Deploy and manage Get started Solutions and use cases Manage data Explore and analyze Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Deploy Detailed deployment comparison Elastic Cloud Compare Cloud Hosted and Serverless Sign up Subscribe from a marketplace AWS Marketplace Azure Native ISV Service Google Cloud Platform Marketplace Heroku Install the add-on Remove the add-on Access the console Work with Elasticsearch Migrate between plans Hardware Regions Elastic Cloud Serverless Create a serverless project Regions Project settings Elastic Cloud Hosted Create an Elastic Cloud Hosted deployment Available stack versions Access Kibana Plan for production Manage deployments Configure Change hardware profiles Customize deployment components Edit Elastic Stack settings Add plugins and extensions Upload custom plugins and bundles Manage plugins and extensions through the API Custom endpoint aliases Manage your Integrations server Switch from APM to Integrations Server payload Find your Cloud ID vCPU boosting and credits Change hardware Manage deployments using the Elastic Cloud API Keep track of deployment activity Restrictions and known problems Tools and APIs Elastic Cloud Enterprise Service-oriented architecture Deploy an orchestrator Prepare your environment Hardware prerequisites Software prerequisites System configuration Networking prerequisites Users and permissions prerequisites High availability Separation of roles Load balancers JVM heap size Wildcard DNS record Manage your allocators capacity Install ECE Identify the deployment scenario Configure your operating system Ubuntu RHEL SUSE Installation procedures Deploy a small installation Deploy a medium installation Deploy a large installation Deploy using Podman Migrate ECE to Podman hosts Migrating to Podman 5 Post-installation steps Install ECE on additional hosts Manage roles tokens Ansible playbook Statistics collected by Elastic Cloud Enterprise Air-gapped install With your private Docker registry Without a private Docker registry Available Docker images Configure ECE Assign roles to hosts System deployments configuration Default system deployment versions Manage deployment templates Tag your allocators Edit instance configurations Create instance configurations Create templates Configure default templates Configure index management Data tiers and autoscaling support Integrations server support Default instance configurations Change the ECE API URL Change endpoint URLs Enable custom endpoint aliases Configure allocator affinity Change allocator disconnect timeout Manage Elastic Stack versions Include additional Kibana plugins Log into the Cloud UI Manage deployments Deployment templates Create a deployment Access Kibana Connect to Elasticsearch Configure Customize deployment components Edit stack settings Elasticsearch user settings Kibana user settings APM user settings Enterprise search user settings Resize deployment Configure plugins and extensions Add 
Supported SSL/TLS versions by JDK version Elasticsearch relies on your JDK’s implementation of SSL and TLS. Different JDK versions support different versions of SSL, and this may affect how Elasticsearch operates. Note This support applies when running on the default JSSE provider in the JDK. JVMs that are configured to use a FIPS 140-2 security provider might have a custom TLS implementation, which might support TLS protocol versions that differ from this list. Check your security provider’s release notes for information on TLS support. SSLv3 SSL v3 is supported on all Elasticsearch compatible JDKs but is disabled by default. See Enabling additional SSL/TLS versions on your JDK . TLSv1 TLS v1.0 is supported on all Elasticsearch compatible JDKs but is disabled by default. See Enabling additional SSL/TLS versions on your JDK .
TLSv1.1 TLS v1.1 is supported on all Elasticsearch compatible JDKs but is disabled by default. See Enabling additional SSL/TLS versions on your JDK . TLSv1.2 TLS v1.2 is supported on all Elasticsearch compatible JDKs . It is enabled by default on all JDKs that are supported by Elasticsearch, including the bundled JDK. TLSv1.3 TLS v1.3 is supported on all Elasticsearch compatible JDKs . It is enabled by default on all JDKs that are supported by Elasticsearch, including the bundled JDK. Enabling additional SSL/TLS versions on your JDK The set of supported SSL/TLS versions for a JDK is controlled by a java security properties file that is installed as part of your JDK. This configuration file lists the SSL/TLS algorithms that are disabled in that JDK. Complete these steps to remove a TLS version from that list and use it in your JDK. Locate the configuration file for your JDK. Copy the jdk.tls.disabledAlgorithms setting from that file, and add it to a custom configuration file within the Elasticsearch configuration directory. In the custom configuration file, remove the value for the TLS version you want to use from jdk.tls.disabledAlgorithms . Configure Elasticsearch to pass a custom system property to the JDK so that your custom configuration file is used. Locate the configuration file for your JDK For the Elasticsearch bundled JDK , the configuration file is in a sub directory of the Elasticsearch home directory ( $ES_HOME ): Linux: $ES_HOME/jdk/conf/security/java.security Windows: $ES_HOME/jdk/conf/security/java.security macOS: $ES_HOME/jdk.app/Contents/Home/conf/security/java.security For JDK11 or later , the configuration file is within the conf/security directory of the Java installation. If $JAVA_HOME points to the home directory of the JDK that you use to run Elasticsearch, then the configuration file will be in: $JAVA_HOME/conf/security/java.security Copy the disabledAlgorithms setting Within the JDK configuration file is a line that starts with jdk.tls.disabledAlgorithms= . This setting controls which protocols and algorithms are disabled in your JDK. The value of that setting will typically span multiple lines. For example, in OpenJDK 21 the setting is: jdk.tls.disabledAlgorithms=SSLv3, TLSv1, TLSv1.1, DTLSv1.0, RC4, DES, \\ MD5withRSA, DH keySize < 1024, EC keySize < 224, 3DES_EDE_CBC, anon, NULL, \\ ECDH Create a new file in your in your Elasticsearch configuration directory named es.java.security . Copy the jdk.tls.disabledAlgorithms setting from the JDK’s default configuration file into es.java.security . You do not need to copy any other settings. Enable required TLS versions Edit the es.java.security file in your Elasticsearch configuration directory, and modify the jdk.tls.disabledAlgorithms setting so that any SSL or TLS versions that you wish to use are no longer listed. For example, to enable TLSv1.1 on OpenJDK 21 (which uses the jdk.tls.disabledAlgorithms settings shown previously), the es.java.security file would contain the previously disabled TLS algorithms except TLSv1.1 : jdk.tls.disabledAlgorithms=SSLv3, TLSv1, DTLSv1.0, RC4, DES, \\ MD5withRSA, DH keySize < 1024, EC keySize < 224, 3DES_EDE_CBC, anon, NULL, \\ ECDH Enable your custom security configuration To enable your custom security policy, add a file in the jvm.options.d directory within your Elasticsearch configuration directory. 
To enable your custom security policy, create a file named java.security.options within the jvm.options.d directory of your Elasticsearch configuration directory, with this content: -Djava.security.properties=/path/to/your/es.java.security Enabling TLS versions in Elasticsearch SSL/TLS versions can be enabled and disabled within Elasticsearch via the ssl.supported_protocols settings . Elasticsearch will only support the TLS versions that are enabled by the underlying JDK. If you configure ssl.supported_protocols to include a TLS version that is not enabled in your JDK, then it will be silently ignored. Similarly, a TLS version that is enabled in your JDK will not be used unless it is configured as one of the ssl.supported_protocols in Elasticsearch. Previous Mutual authentication Next Enabling cipher suites for stronger encryption Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Enabling additional SSL/TLS versions on your JDK Locate the configuration file for your JDK Copy the disabledAlgorithms setting Enable required TLS versions Enable your custom security configuration Enabling TLS versions in Elasticsearch Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"Supported SSL/TLS versions by JDK version | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/security/supported-ssltls-versions-by-jdk-version","meta_description":"Elasticsearch relies on your JDK’s implementation of SSL and TLS. 
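As a minimal sketch of how the JDK-level and Elasticsearch-level settings interact (assuming a default installation with the bundled JDK and the xpack.security.http.ssl.* settings; this line is illustrative and not taken from the page above), restricting the HTTP interface to TLS 1.2 and 1.3 could be expressed in elasticsearch.yml as: xpack.security.http.ssl.supported_protocols: ["TLSv1.3", "TLSv1.2"] Any protocol listed there that still appears in the JDK's jdk.tls.disabledAlgorithms would be silently ignored, as described above. 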
Different JDK versions support different versions of SSL, and this may affect how Elasticsearch..."} +{"text":"Docs Release notes Troubleshoot Reference Explore and analyze Get started Solutions and use cases Manage data Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Querying and filtering Query languages Query DSL ES|QL Get started Interfaces ES|QL _query API Kibana Elastic Security Query multiple sources Query multiple indices Query across clusters Examples Task management SQL Overview Getting started Conventions Security SQL REST API Overview Response Data Formats Paginating through a large response Filtering using Elasticsearch Query DSL Columnar results Passing parameters to a query Use runtime fields Run an async SQL search SQL Translate API SQL CLI SQL JDBC API usage SQL ODBC Driver installation Configuration SQL Client Applications DBeaver DbVisualizer Microsoft Excel Microsoft Power BI Desktop Microsoft PowerShell MicroStrategy Desktop Qlik Sense Desktop SQuirreL SQL SQL Workbench/J Tableau Desktop Tableau Server SQL Language Lexical Structure SQL Commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data Types Index patterns Frozen Indices Functions and Operators Comparison Operators Logical Operators Math Operators Cast Operators LIKE and RLIKE Operators Aggregate Functions Grouping Functions Date/Time and Interval Functions and Operators Full-Text Search Functions Mathematical Functions String Functions Type Conversion Functions Geo Functions Conditional Functions And Expressions System Functions Reserved keywords SQL Limitations EQL Example: Detect threats with EQL KQL Lucene query syntax Query tools Saved queries Console Search profiler Grok debugger Playground Aggregations Basics Filtering in Kibana Geospatial analysis Transforming data Overview Setup When to use transforms Generating alerts for transforms Transforms at scale How checkpoints work API quick reference Tutorial: Transforming the eCommerce sample data Examples Painless examples Limitations Elastic Inference Inference integrations Machine learning Setup and security Anomaly detection Finding anomalies Plan your analysis Run a job View the results Forecast future behavior Tutorial Advanced concepts Anomaly detection algorithms Anomaly score explanation Job types Working with anomaly detection at scale Handling delayed data API quick reference How-tos Generating alerts for anomaly detection jobs Aggregating data for faster performance Altering data in your datafeed with runtime fields Customizing detectors with custom rules Detecting anomalous categories of data Performing population analysis Reverting to a model snapshot Detecting anomalous locations in geographic data Mapping anomalies by location Adding custom URLs to machine learning results Anomaly detection jobs from visualizations Exporting and importing machine learning jobs Resources Limitations Function reference Supplied configurations Troubleshooting and FAQ Data frame analytics Overview Finding outliers Predicting numerical values with regression Predicting classes with classification Advanced concepts How data frame analytics analytics jobs work Working with data frame analytics at scale Adding custom URLs to data frame analytics jobs Feature encoding Feature processors Feature importance Loss functions for regression analyses Hyperparameter optimization Trained models API quick reference Resources Limitations NLP Overview Extract information 
Classify text Search and compare text Deploy trained models Select a trained model Import the trained model and vocabulary Deploy the model in your cluster Try it out Add NLP inference to ingest pipelines API quick reference Built-in NLP models ELSER Elastic Rerank E5 Language identification Compatible third party models Examples End-to-end tutorial Named entity recognition Text embedding and semantic search Limitations ML in Kibana Anomaly detection Data frame analytics AIOps Labs Inference processing Scripting Painless scripting language How to write scripts Scripts, caching, and search speed Dissecting data Grokking grok Access fields in a document Common scripting use cases Field extraction Accessing document fields and special variables Scripting and security Lucene expressions language Advanced scripts using script engines Painless lab AI assistant Discover Explore fields and data with Discover Customize the Discover view Search for relevance Save a search for reuse View field statistics Run a pattern analysis on your log data Run a search session in the background Using ES|QL Dashboards Exploring dashboards Building dashboards Create a dashboard Edit a dashboard Add filter controls Add drilldowns Organize dashboard panels Duplicate a dashboard Import a dashboard Managing dashboards Sharing dashboards Tutorials Create a simple dashboard to monitor website logs Create a dashboard with time series charts Panels and visualizations Supported chart types Visualize Library Manage panels Lens ES|QL Field statistics Custom visualizations with Vega Text panels Image panels Link panels Canvas Edit workpads Present your workpad Tutorial: Create a workpad for monitoring sales Canvas function reference TinyMath functions Maps Build a map to compare metrics by country or region Track, visualize, and alert on assets in real time Map custom regions with reverse geocoding Heat map layer Tile layer Vector layer Vector styling Vector style properties Vector tooltips Plot big data Clusters Display the most relevant documents per entity Point to point Term join Search geographic data Create filters from a map Filter a single layer Search across multiple indices Configure map settings Connect to Elastic Maps Service Import geospatial data Clean your data Tutorial: Index GeoJSON data Troubleshoot Graph Configure Graph Troubleshooting and limitations Legacy editors Aggregation-based TSVB Timelion Find and organize content Data views Saved objects Files Reports Tags Find apps and objects Reporting and sharing Automatically generate reports Troubleshooting CSV PDF/PNG Alerts and cases Alerts Getting started with alerts Set up Create and manage rules View alerts Rule types Index threshold Elasticsearch query Tracking containment Rule action variables Notifications domain allowlist Troubleshooting and limitations Common Issues Event log index Test connectors Maintenance windows Watcher Getting started with Watcher How Watcher works Enable Watcher Watcher UI Encrypting sensitive data in Watcher Inputs Simple input Search input HTTP input Chain input Triggers Schedule trigger Throttling Schedule Types Conditions Always condition Never condition Compare condition Array compare condition Script condition Actions Running an action for each element in an array Adding conditions to actions Email action Webhook action Index action Logging action Slack action PagerDuty action Jira action Transforms Search payload transform Script payload transform Chain payload transform Managing watches Example watches Watching the 
status of an Elasticsearch cluster Limitations Cases Configure access to cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / SQL / Functions and Operators / String Functions Elastic Stack Serverless Functions for performing string manipulation. ASCII ASCII(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the ASCII code value of the leftmost character of string_exp as an integer. SELECT ASCII('Elastic'); ASCII('Elastic') ---------------- 69 BIT_LENGTH BIT_LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the length in bits of the string_exp input expression. SELECT BIT_LENGTH('Elastic'); BIT_LENGTH('Elastic') --------------------- 56 CHAR CHAR(code) Input : integer expression between 0 and 255 . If null , negative, or greater than 255 , the function returns null . Output : string Description : Returns the character that has the ASCII code value specified by the numeric input. SELECT CHAR(69); CHAR(69) --------------- E CHAR_LENGTH CHAR_LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the length in characters of the input, if the string expression is of a character data type; otherwise, returns the length in bytes of the string expression (the smallest integer not less than the number of bits divided by 8). SELECT CHAR_LENGTH('Elastic'); CHAR_LENGTH('Elastic') ---------------------- 7 CONCAT CONCAT( string_exp1, string_exp2) Input : string expression. Treats null as an empty string. string expression. Treats null as an empty string. Output : string Description : Returns a character string that is the result of concatenating string_exp1 to string_exp2 . The resulting string cannot exceed a byte length of 1 MB. SELECT CONCAT('Elasticsearch', ' SQL'); CONCAT('Elasticsearch', ' SQL') ------------------------------- Elasticsearch SQL INSERT INSERT( source, start, length, replacement) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . integer expression. If null , the function returns null . string expression. If null , the function returns null . Output : string Description : Returns a string where length characters have been deleted from source , beginning at start , and where replacement has been inserted into source , beginning at start . The resulting string cannot exceed a byte length of 1 MB. SELECT INSERT('Elastic ', 8, 1, 'search'); INSERT('Elastic ', 8, 1, 'search') ---------------------------------- Elasticsearch LCASE LCASE(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns a string equal to that in string_exp , with all uppercase characters converted to lowercase. SELECT LCASE('Elastic'); LCASE('Elastic') ---------------- elastic LEFT LEFT( string_exp, count) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . If 0 or negative, the function returns an empty string. Output : string Description : Returns the leftmost count characters of string_exp . SELECT LEFT('Elastic',3); LEFT('Elastic',3) ----------------- Ela LENGTH LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the number of characters in string_exp , excluding trailing blanks. 
SELECT LENGTH('Elastic '); LENGTH('Elastic ') -------------------- 7 LOCATE LOCATE( pattern, source [, start] ) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . integer expression; optional. If null , 0 , 1 , negative, or not specified, the search starts at the first character position. Output : integer Description : Returns the starting position of the first occurrence of pattern within source . The optional start specifies the character position to start the search with. If the pattern is not found within source , the function returns 0 . SELECT LOCATE('a', 'Elasticsearch'); LOCATE('a', 'Elasticsearch') ---------------------------- 3 SELECT LOCATE('a', 'Elasticsearch', 5); LOCATE('a', 'Elasticsearch', 5) ------------------------------- 10 LTRIM LTRIM(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns the characters of string_exp , with leading blanks removed. SELECT LTRIM(' Elastic'); LTRIM(' Elastic') ------------------- Elastic OCTET_LENGTH OCTET_LENGTH(string_exp) Input : string expression. If null , the function returns null . Output : integer Description : Returns the length in bytes of the string_exp input expression. SELECT OCTET_LENGTH('Elastic'); OCTET_LENGTH('Elastic') ----------------------- 7 POSITION POSITION( string_exp1, string_exp2) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . Output : integer Description : Returns the position of the string_exp1 in string_exp2 . The result is an exact numeric. SELECT POSITION('Elastic', 'Elasticsearch'); POSITION('Elastic', 'Elasticsearch') ------------------------------------ 1 REPEAT REPEAT( string_exp, count) Input : string expression. If null , the function returns null . integer expression. If 0 , negative, or null , the function returns null . Output : string Description : Returns a character string composed of string_exp repeated count times. The resulting string cannot exceed a byte length of 1 MB. SELECT REPEAT('La', 3); REPEAT('La', 3) ---------------- LaLaLa REPLACE REPLACE( source, pattern, replacement) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . string expression. If null , the function returns null . Output : string Description : Search source for occurrences of pattern , and replace with replacement . The resulting string cannot exceed a byte length of 1 MB. SELECT REPLACE('Elastic','El','Fant'); REPLACE('Elastic','El','Fant') ------------------------------ Fantastic RIGHT RIGHT( string_exp, count) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . If 0 or negative, the function returns an empty string. Output : string Description : Returns the rightmost count characters of string_exp . SELECT RIGHT('Elastic',3); RIGHT('Elastic',3) ------------------ tic RTRIM RTRIM(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns the characters of string_exp with trailing blanks removed. SELECT RTRIM('Elastic '); RTRIM('Elastic ') ------------------- Elastic SPACE SPACE(count) Input : integer expression. If null or negative, the function returns null . Output : string Description : Returns a character string consisting of count spaces. The resulting string cannot exceed a byte length of 1 MB. 
SELECT SPACE(3); SPACE(3) --------------- STARTS_WITH STARTS_WITH( source, pattern) Input : string expression. If null , the function returns null . string expression. If null , the function returns null . Output : boolean value Description : Returns true if the source expression starts with the specified pattern, false otherwise. The matching is case sensitive. SELECT STARTS_WITH('Elasticsearch', 'Elastic'); STARTS_WITH('Elasticsearch', 'Elastic') -------------------------------- true SELECT STARTS_WITH('Elasticsearch', 'ELASTIC'); STARTS_WITH('Elasticsearch', 'ELASTIC') -------------------------------- false SUBSTRING SUBSTRING( source, start, length) Input : string expression. If null , the function returns null . integer expression. If null , the function returns null . integer expression. If null , the function returns null . Output : string Description : Returns a character string that is derived from source , beginning at the character position specified by start for length characters. SELECT SUBSTRING('Elasticsearch', 0, 7); SUBSTRING('Elasticsearch', 0, 7) -------------------------------- Elastic TRIM TRIM(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns the characters of string_exp , with leading and trailing blanks removed. SELECT TRIM(' Elastic ') AS trimmed; trimmed -------------- Elastic UCASE UCASE(string_exp) Input : string expression. If null , the function returns null . Output : string Description : Returns a string equal to that of the input, with all lowercase characters converted to uppercase. SELECT UCASE('Elastic'); UCASE('Elastic') ---------------- ELASTIC Previous Mathematical Functions Next Type Conversion Functions Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page ASCII BIT_LENGTH CHAR CHAR_LENGTH CONCAT INSERT LCASE LEFT LENGTH LOCATE LTRIM OCTET_LENGTH POSITION REPEAT REPLACE RIGHT RTRIM SPACE STARTS_WITH SUBSTRING TRIM UCASE Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"String Functions | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-functions-string","meta_description":"Functions for performing string manipulation. 
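As an illustrative sketch (not taken from the page above), these string functions compose within a single Elasticsearch SQL query; assuming the same SQL interface used in the per-function examples, a query such as: SELECT CONCAT(UCASE(LEFT('elastic', 1)), SUBSTRING('elastic', 2, 6)) AS capitalized; should return 'Elastic', combining LEFT, UCASE, SUBSTRING and CONCAT exactly as each function is documented above. 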
Output: integer Description: Returns the ASCII code value of the leftmost character of string_exp as an..."} +{"text":"Docs Release notes Troubleshoot Reference Reference Get started Solutions and use cases Manage data Explore and analyze Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Security Fields and object schemas Elastic Security ECS field reference Timeline schema Alert schema Endpoint command reference Detection Rules Overview Observability Fields and object schemas Elasticsearch and index management Configuration Circuit breaker settings Auditing settings Enrich settings Cluster-level shard allocation and routing settings Miscellaneous cluster settings Cross-cluster replication settings Discovery and cluster formation settings Field data cache settings Health Diagnostic settings Index lifecycle management settings Data stream lifecycle settings Index management settings Index recovery settings Indexing buffer settings License settings Local gateway Machine learning settings Inference settings Monitoring settings Node settings Networking settings Node query cache settings Search settings Security settings Shard request cache Snapshot and restore settings Transforms settings Thread pool settings Watcher settings JVM settings Roles Elasticsearch privileges Index settings Data tier allocation General History retention Index block Index recovery prioritization Indexing pressure Mapping limit Merge Path Shard allocation Total shards per node Similarity Slow log Sorting Use index sorting to speed up conjunctions Store Preloading data into the file system cache Time series Translog Index lifecycle actions Allocate Delete Force merge Migrate Read only Rollover Downsample Searchable snapshot Set priority Shrink Unfollow Wait for snapshot REST APIs API conventions Common options Compatibility API examples The refresh parameter Optimistic concurrency control Sort search results Paginate search results Retrieve selected fields Search multiple data streams and indices Collapse search results Filter search results Highlighting Retrieve inner hits Search shard routing Searching with query rules Reciprocal rank fusion Retrievers Reindex data stream Create index from source The shard request cache Suggesters Profile search requests Ranking evaluation Mapping Document metadata fields _doc_count field _field_names field _ignored field _id field _index field _meta field _routing field _source field _tier field Field data types Aggregate metric Alias Arrays Binary Boolean Completion Date Date nanoseconds Dense vector Flattened Geopoint Geoshape Histogram IP Join Keyword Nested Numeric Object Pass-through object Percolator Point Range Rank feature Rank features Rank Vectors Search-as-you-type Semantic text Shape Sparse vector Text Token count Unsigned long Version Mapping parameters analyzer coerce copy_to doc_values dynamic eager_global_ordinals enabled format ignore_above index.mapping.ignore_above ignore_malformed index index_options index_phrases index_prefixes meta fields normalizer norms null_value position_increment_gap properties search_analyzer similarity store subobjects term_vector Elasticsearch audit events Command line tools elasticsearch-certgen elasticsearch-certutil elasticsearch-create-enrollment-token elasticsearch-croneval elasticsearch-keystore elasticsearch-node elasticsearch-reconfigure-node elasticsearch-reset-password elasticsearch-saml-metadata elasticsearch-service-tokens elasticsearch-setup-passwords 
elasticsearch-shard elasticsearch-syskeygen elasticsearch-users Curator Curator and index lifecycle management ILM Actions ILM or Curator? ILM and Curator! About Origin Features Command-Line Interface (CLI) Application Program Interface (API) License Site Corrections Contributing Installation pip Installation from source Docker Running Curator Command Line Interface Singleton Command Line Interface Exit Codes Configuration Environment Variables Action File Configuration File Actions Alias Allocation Close Cluster Routing Cold2Frozen Create Index Delete Indices Delete Snapshots Forcemerge Index Settings Open Reindex Replicas Restore Rollover Shrink Snapshot Options allocation_type allow_ilm_indices continue_if_exception copy_aliases count delay delete_after delete_aliases skip_flush disable_action extra_settings ignore_empty_list ignore_unavailable include_aliases include_global_state indices key max_age max_docs max_size max_num_segments max_wait migration_prefix migration_suffix name new_index node_filters number_of_replicas number_of_shards partial post_allocation preserve_existing refresh remote_certificate remote_client_cert remote_client_key remote_filters remote_url_prefix rename_pattern rename_replacement repository requests_per_second request_body retry_count retry_interval routing_type search_pattern setting shrink_node shrink_prefix shrink_suffix slices skip_repo_fs_check timeout timeout_override value wait_for_active_shards wait_for_completion wait_for_rebalance wait_interval warn_if_no_indices Filters filtertype age alias allocated closed count empty forcemerged kibana none opened pattern period space state Filter Elements aliases allocation_type count date_from date_from_format date_to date_to_format direction disk_space epoch exclude field intersect key kind max_num_segments pattern period_type range_from range_to reverse source state stats_result timestring threshold_behavior unit unit_count unit_count_pattern use_age value week_starts_on Examples alias allocation close cluster_routing create_index delete_indices delete_snapshots forcemerge index_settings open reindex replicas restore rollover shrink snapshot Frequently Asked Questions Q: How can I report an error in the documentation? Q: Can I delete only certain data from within indices? Q: Can Curator handle index names with strange characters? 
Clients Eland Installation Data Frames Machine Learning Go Getting started Installation Connecting Typed API Getting started with the API Conventions Running queries Using ES|QL Examples Java Getting started Setup Installation Connecting Using OpenTelemetry API conventions Package structure and namespace clients Method naming conventions Blocking and asynchronous clients Building API objects Lists and maps Variant types Object life cycles and thread safety Creating API objects from JSON data Exceptions Using the Java API client Indexing single documents Bulk: indexing multiple documents Reading documents by id Searching for documents Aggregations ES|QL in the Java client Troubleshooting Missing required property NoSuchMethodError: removeHeader IOReactor errors Serializing without typed keys Could not resolve dependencies NoClassDefFoundError: LogFactory Transport layer REST 5 Client Getting started Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Legacy REST Client Getting started Javadoc Maven Repository Dependencies Shading Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Javadoc Maven Repository Usage Javadoc and source code External resources Breaking changes policy Release highlights License JavaScript Getting started Installation Connecting Configuration Basic configuration Advanced configuration Creating a child client Testing Integrations Observability Transport TypeScript support API Reference Examples asStream Bulk Exists Get Ignore MSearch Scroll Search Suggest transport.request SQL Update Update By Query Reindex Client helpers Timeout best practices .NET Getting started Installation Connecting Configuration Options on ElasticsearchClientSettings Client concepts Serialization Source serialization Using the .NET Client Aggregation examples Using ES|QL CRUD usage examples Custom mapping examples Query examples Usage recommendations Low level Transport example Troubleshoot Logging Logging with OnRequestCompleted Logging with Fiddler Debugging Audit trail Debug information Debug mode PHP Getting started Installation Connecting Configuration Dealing with JSON arrays and objects in PHP Host Configuration Set retries HTTP Meta Data Enabling the Logger Configure the HTTP client Namespaces Node Pool Operations Index management operations Search operations Indexing documents Getting documents Updating documents Deleting documents Client helpers Iterators ES|QL Python Getting started Installation Connecting Configuration Querying Using with asyncio Integrations Using OpenTelemetry ES|QL and Pandas Examples Elasticsearch Python DSL Configuration Tutorials How-To Guides Examples Migrating from the elasticsearch-dsl package Client helpers Ruby Getting started Installation Connecting Configuration Basic configuration Advanced configuration Integrations Transport Elasticsearch API Using OpenTelemetry Elastic Common Schema (ECS) ActiveModel / ActiveRecord Ruby On Rails Persistence Elasticsearch DSL Examples Client helpers Bulk and Scroll helpers ES|QL Troubleshoot Rust Installation Community-contributed clients Elastic Distributions of OpenTelemetry (EDOT) Quickstart Self-managed Kubernetes Hosts / VMs Docker Elastic Cloud Serverless Kubernetes Hosts and VMs Docker Elastic 
Cloud Hosted Kubernetes Hosts and VMs Docker Reference Architecture Kubernetes environments Hosts / VMs environments Use cases Kubernetes observability Prerequisites and compatibility Components description Deployment Instrumenting Applications Upgrade Customization LLM observability Compatibility and support Features Collector distributions SDK Distributions Limitations Nomenclature EDOT Collector Download Configuration Default config (Standalone) Default config (Kubernetes) Configure Logs Collection Configure Metrics Collection Customization Components Custom Collector Troubleshooting EDOT SDKs EDOT .NET Setup ASP.NET Console applications .NET worker services Zero-code instrumentation Opinionated defaults Configuration Supported technologies Troubleshooting Migration EDOT Java Setup Kubernetes Setup Runtime attach Setup Configuration Features Supported Technologies Troubleshooting Migration Performance overhead EDOT Node.js Setup Kubernetes Configuration Supported Technologies Metrics Troubleshooting Migration EDOT PHP Setup Limitations Configuration Supported Technologies Troubleshooting Migration Performance overhead EDOT Python Setup Kubernetes Manual Instrumentation Configuration Supported Technologies Troubleshooting Migration Performance Overhead Ingestion tools Fleet and Elastic Agent Restrictions for Elastic Cloud Serverless Migrate from Beats to Elastic Agent Migrate from Auditbeat to Elastic Agent Deployment models What is Fleet Server? Deploy on Elastic Cloud Deploy on-premises and self-managed Deploy Fleet Server on-premises and Elasticsearch on Cloud Deploy Fleet Server on Kubernetes Fleet Server scalability Fleet Server Secrets Secret files guide Monitor a self-managed Fleet Server Install Elastic Agents Install Fleet-managed Elastic Agents Install standalone Elastic Agents Upgrade standalone Elastic Agents Install Elastic Agents in a containerized environment Run Elastic Agent in a container Run Elastic Agent on Kubernetes managed by Fleet Install Elastic Agent on Kubernetes using Helm Example: Install standalone Elastic Agent on Kubernetes using Helm Example: Install Fleet-managed Elastic Agent on Kubernetes using Helm Advanced Elastic Agent configuration managed by Fleet Configuring Kubernetes metadata enrichment on Elastic Agent Run Elastic Agent on GKE managed by Fleet Configure Elastic Agent Add-On on Amazon EKS Run Elastic Agent on Azure AKS managed by Fleet Run Elastic Agent Standalone on Kubernetes Scaling Elastic Agent on Kubernetes Using a custom ingest pipeline with the Kubernetes Integration Environment variables Run Elastic Agent as an EDOT Collector Transform an installed Elastic Agent to run as an EDOT Collector Run Elastic Agent without administrative privileges Install Elastic Agent from an MSI package Installation layout Air-gapped environments Using a proxy server with Elastic Agent and Fleet When to configure proxy settings Proxy Server connectivity using default host variables Fleet managed Elastic Agent connectivity using a proxy server Standalone Elastic Agent connectivity using a proxy server Set the proxy URL of the Elastic Package Registry Uninstall Elastic Agents from edge hosts Start and stop Elastic Agents on edge hosts Elastic Agent configuration encryption Secure connections Configure SSL/TLS for self-managed Fleet Servers Rotate SSL/TLS CA certificates Elastic Agent deployment models with mutual TLS One-way and mutual TLS certifications flow Configure SSL/TLS for the Logstash output Manage Elastic Agents in Fleet Fleet settings Elasticsearch 
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
applications NuGet packages Entity Framework Core Entity Framework 6 Elasticsearch gRPC SqlClient StackExchange.Redis Azure Cosmos DB Azure Service Bus Azure Storage MongoDB Supported technologies Configuration Configuration on ASP.NET Core Configuration for Windows Services Configuration on ASP.NET Core configuration options Reporter configuration options HTTP configuration options Messaging configuration options Stacktrace configuration options Supportability configuration options All options summary Public API OpenTelemetry bridge Metrics Logs Serilog NLog Manual log correlation Performance tuning Upgrading APM Go agent Set up the APM Go Agent Built-in instrumentation modules Custom instrumentation Context propagation Supported technologies Configuration API documentation Metrics Logs Log correlation OpenTelemetry API OpenTracing API Contributing Upgrading APM iOS agent Supported technologies Set up the APM iOS Agent Configuration Instrumentation APM Java agent Set up the APM Java Agent Manual setup with -javaagent flag Automatic setup with apm-agent-attach-cli.jar Programmatic API setup to self-attach SSL/TLS communication with APM Server Monitoring AWS Lambda Java Functions Supported technologies Configuration Circuit-Breaker Core Datastore HTTP Huge Traces JAX-RS JMX Logging Messaging Metrics Profiling Reporter Serverless Stacktrace Property file reference Tracing APIs Public API OpenTelemetry bridge OpenTracing bridge Plugin API Metrics Logs How to find slow methods Sampling-based profiler API/Code Annotations Configuration-based Overhead and performance tuning Frequently asked questions Community plugins Upgrading APM Node.js agent Set up the Agent Monitoring AWS Lambda Node.js Functions Monitoring Node.js Azure Functions Get started with Express Get started with Fastify Get started with hapi Get started with Koa Get started with Next.js Get started with Restify Get started with TypeScript Get started with a custom Node.js stack Starting the agent Supported technologies Configuration Configuring the agent Configuration options Custom transactions Custom spans API Reference Agent API Transaction API Span API Metrics Logs OpenTelemetry bridge OpenTracing bridge Source map support ECMAScript module support Distributed tracing Message queues Performance Tuning Upgrading Upgrade to v4.x Upgrade to v3.x Upgrade to v2.x Upgrade to v1.x APM PHP agent Set up the APM PHP Agent Supported technologies Configuration Configuration reference Public API APM Python agent Set up the APM Python Agent Django support Flask support Aiohttp Server support Tornado Support Starlette/FastAPI Support Sanic Support Monitoring AWS Lambda Python Functions Monitoring Azure Functions Wrapper Support ASGI Middleware Supported technologies Configuration Advanced topics Instrumenting custom code Sanitizing data How the Agent works Run Tests Locally API reference Metrics OpenTelemetry API Bridge Logs Performance tuning Upgrading Upgrading to version 6 of the agent Upgrading to version 5 of the agent Upgrading to version 4 of the agent APM Ruby agent Set up the APM Ruby agent Getting started with Rails Getting started with Rack Supported technologies Configuration Advanced topics Adding additional context Custom instrumentation API reference Metrics Logs OpenTracing API GraphQL Performance tuning Upgrading APM RUM JavaScript agent Set up the APM Real User Monitoring JavaScript Agent Install the Agent Configure CORS Supported technologies Configuration API reference Agent API Transaction API Span API Source maps 
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Secrets keystore Command reference Repositories for APT and YUM Run Filebeat on Docker Run Filebeat on Kubernetes Run Filebeat on Cloud Foundry Filebeat and systemd Start Filebeat Stop Filebeat Upgrade How Filebeat works Configure Inputs Multiline messages AWS CloudWatch AWS S3 Azure Event Hub Azure Blob Storage Benchmark CEL Cloud Foundry CometD Container Entity Analytics ETW filestream GCP Pub/Sub Google Cloud Storage HTTP Endpoint HTTP JSON journald Kafka Log MQTT NetFlow Office 365 Management Activity API Redis Salesforce Stdin Streaming Syslog TCP UDP Unified Logs Unix winlog Modules Override input settings General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append cache community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags filebeat.reference.yml How to guides Override configuration settings Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Enrich events with geoIP information Deduplicate data Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Migrate log or container input configurations to filestream How to choose file identity for filestream Migrating from a Deprecated Filebeat Module Modules Modules ActiveMQ module Apache module Auditd module AWS module AWS Fargate module Azure module CEF module Check Point module Cisco module CoreDNS module CrowdStrike module Cyberark PAS module Elasticsearch module Envoyproxy Module Fortinet module Google Cloud module Google Workspace module HAproxy module IBM MQ module Icinga module IIS module Iptables module Juniper module Kafka module Kibana module Logstash module Microsoft module MISP module MongoDB module MSSQL module MySQL module MySQL Enterprise module NATS module NetFlow module Nginx module Office 365 module Okta module Oracle module Osquery module Palo Alto Networks module pensando module PostgreSQL module RabbitMQ module Redis module Salesforce module Set up the OAuth App in the Salesforce Santa module Snyk module Sophos module Suricata module System module Threat Intel module Traefik module Zeek (Bro) Module ZooKeeper module Zoom module Exported fields ActiveMQ fields Apache fields Auditd fields AWS fields AWS CloudWatch fields AWS Fargate fields Azure fields Beat fields Decode CEF processor fields fields CEF fields Checkpoint fields Cisco fields Cloud provider metadata fields Coredns fields Crowdstrike fields CyberArk PAS fields Docker fields ECS fields Elasticsearch fields Envoyproxy fields Fortinet fields 
Google Cloud Platform (GCP) fields google_workspace fields HAProxy fields Host fields ibmmq fields Icinga fields IIS fields iptables fields Jolokia Discovery autodiscover provider fields Juniper JUNOS fields Kafka fields kibana fields Kubernetes fields Log file content fields logstash fields Lumberjack fields Microsoft fields MISP fields mongodb fields mssql fields MySQL fields MySQL Enterprise fields NATS fields NetFlow fields Nginx fields Office 365 fields Okta fields Oracle fields Osquery fields panw fields Pensando fields PostgreSQL fields Process fields RabbitMQ fields Redis fields s3 fields Salesforce fields Google Santa fields Snyk fields sophos fields Suricata fields System fields threatintel fields Traefik fields Windows ETW fields Zeek fields ZooKeeper fields Zoom fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Error extracting container id while using Kubernetes metadata Can't read log files from network volumes Filebeat isn't collecting lines from a file Too many open file handlers Registry file is too large Inode reuse causes Filebeat to skip lines Log rotation results in lost or duplicate events Open file handlers cause issues with Windows file rotation Filebeat is using too much CPU Dashboard in Kibana is breaking up data fields incorrectly Fields are not indexed or usable in Kibana visualizations Filebeat isn't shipping the last line of a file Filebeat keeps open file handlers of deleted files for a long time Filebeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Heartbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Heartbeat on Docker Running Heartbeat on Kubernetes Heartbeat and systemd Stop Heartbeat Configure Monitors Common monitor options ICMP options TCP options HTTP options Task scheduler General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based 
autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags heartbeat.reference.yml How to guides Add observer and geo metadata Load the Elasticsearch index template Change the index name Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields Beat fields Synthetics browser metrics fields Cloud provider metadata fields Common heartbeat monitor fields Docker fields ECS fields Host fields HTTP monitor fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields Host lookup fields APM Service fields SOCKS5 proxy fields Monitor state fields Monitor summary fields Synthetics types fields TCP layer fields TLS encryption layer fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Heartbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected High RSS memory usage due to MADV settings Contribute Metricbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Metricbeat on Docker Run Metricbeat on Kubernetes Run Metricbeat on Cloud Foundry Metricbeat and systemd Start Metricbeat Stop Metricbeat Upgrade Metricbeat How Metricbeat works Event structure Error event structure Key metricbeat features Configure Modules General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags metricbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules ActiveMQ module ActiveMQ broker metricset ActiveMQ queue 
Elasticsearch API compatibility To help REST clients mitigate the impact of non-compatible (breaking) API changes, Elasticsearch provides a per-request, opt-in API compatibility mode. Elasticsearch REST APIs are generally stable across versions. However, some improvements require changes that are not compatible with previous versions. When an API is targeted for removal or is going to be changed in a non-compatible way, the original API is deprecated for one or more releases. Using the original API triggers a deprecation warning in the logs. This enables you to review the deprecation logs and take the appropriate actions before upgrading. However, in some cases it is difficult to identify all places where deprecated APIs are being used. This is where REST API compatibility can help. When you request REST API compatibility, Elasticsearch attempts to honor the previous REST API version. Elasticsearch attempts to apply the most compatible URL, request body, response body, and HTTP parameters. 
For compatible APIs, this has no effect; it only impacts calls to APIs that have breaking changes from the previous version. An error can still be returned in compatibility mode if Elasticsearch cannot automatically resolve the incompatibilities. Important: REST API compatibility does not guarantee the same behavior as the prior version. It instructs Elasticsearch to automatically resolve any incompatibilities so the request can be processed instead of returning an error. REST API compatibility should be a bridge to smooth out the upgrade process, not a long-term strategy. REST API compatibility is only honored across one major version: a 9.x server honors requests and responses in the 8.x format. When you submit requests using REST API compatibility and Elasticsearch resolves the incompatibility, a message is written to the deprecation log with the category \"compatible_api\". Review the deprecation log to identify any gaps in usage and fully supported features. Requesting REST API compatibility REST API compatibility is implemented per request via the Accept and/or Content-Type headers. For example: Accept: \"application/vnd.elasticsearch+json;compatible-with=8\" Content-Type: \"application/vnd.elasticsearch+json;compatible-with=8\" The Accept header is always required, and the Content-Type header is only required when a body is sent with the request. The following values are valid when communicating with an 8.x or 9.x Elasticsearch server: \"application/vnd.elasticsearch+json;compatible-with=8\" \"application/vnd.elasticsearch+yaml;compatible-with=8\" \"application/vnd.elasticsearch+smile;compatible-with=8\" \"application/vnd.elasticsearch+cbor;compatible-with=8\" The officially supported Elasticsearch clients can enable REST API compatibility for all requests. To enable REST API compatibility for all requests received by Elasticsearch, set the environment variable ELASTIC_CLIENT_APIVERSIONING to true. REST API compatibility workflow To leverage REST API compatibility during an upgrade from the last 8.x to 9.0.0: Upgrade your Elasticsearch clients to the latest 8.x version and enable REST API compatibility. Use the Upgrade Assistant to review all critical issues and explore the deprecation logs. Some critical issues might be mitigated by REST API compatibility. Resolve all critical issues before proceeding with the upgrade. Upgrade Elasticsearch to 9.0.0. Review the deprecation logs for entries with the category compatible_api. Review the workflow associated with the requests that relied on compatibility mode. Upgrade your Elasticsearch clients to 9.x and resolve compatibility issues manually where needed. 
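As an illustrative sketch, a single Python call with the requests library could attach both compatibility headers to a search request; the host localhost:9200, the absence of authentication, and the index name my-index are assumptions rather than requirements: import requests; resp = requests.post(\"http://localhost:9200/my-index/_search\", headers={\"Accept\": \"application/vnd.elasticsearch+json;compatible-with=8\", \"Content-Type\": \"application/vnd.elasticsearch+json;compatible-with=8\"}, json={\"query\": {\"match_all\": {}}}), after which resp.json() holds the response rendered in the 8.x-compatible format. For the officially supported clients, the equivalent is setting the environment variable described above, for example export ELASTIC_CLIENT_APIVERSIONING=true in the client's environment. 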
","title":"Elasticsearch API compatibility | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/rest-apis/compatibility","meta_description":"To help REST clients mitigate the impact of non-compatible (breaking) API changes, Elasticsearch provides a per-request, opt-in API compatibility mode..."} +{"text":"
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin 
Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
Wait for snapshot Phases allowed: delete. Waits for the specified SLM policy to be executed before removing the index. This ensures that a snapshot of the deleted index is available. Options policy (Required, string) Name of the SLM policy that the delete action should wait for. Example PUT _ilm/policy/my_policy { \"policy\": { \"phases\": { \"delete\": { \"actions\": { \"wait_for_snapshot\" : { \"policy\": \"slm-policy-name\" } } } } } }
","title":"Wait for snapshot | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/index-lifecycle-actions/ilm-wait-for-snapshot","meta_description":"Phases allowed: delete. Waits for the specified SLM policy to be executed before removing the index. This ensures that a snapshot of the deleted index..."} +{"text":"
Always condition Never condition Compare condition Array compare condition Script condition Actions Running an action for each element in an array Adding conditions to actions Email action Webhook action Index action Logging action Slack action PagerDuty action Jira action Transforms Search payload transform Script payload transform Chain payload transform Managing watches Example watches Watching the status of an Elasticsearch cluster Limitations Cases Configure access to cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / SQL / SQL Client Applications / Qlik Sense Desktop Elastic Stack Serverless You can use the Elasticsearch ODBC driver to access Elasticsearch data from Qlik Sense Desktop. Important Elastic does not endorse, promote or provide support for this application; for native Elasticsearch integration in this product, reach out to its vendor. Prerequisites Qlik Sense Desktop November 2018 or higher Elasticsearch SQL ODBC driver A preconfigured User or System DSN (see Configuration section on how to configure a DSN). Data loading To use the Elasticsearch SQL ODBC Driver to load data into Qlik Sense Desktop perform the following steps in sequence. Create new app Once the application is launched, you’ll first need to click on the Create new app button: Name app …then give it a name, Open app …and then open it: Add data to your app Start configuring the source to load data from in the newly created app: Load from ODBC You’ll be given a choice of sources to select. Click on the ODBC icon: Choose DSN In the Create new connection (ODBC) dialog, click on the DSN name that you have previously configured for your Elasticsearch instance: Provide a username and password in the respective fields, if authentication is enabled on your instance and if these are not already part of the DSN. Press the Create button. Select source table The application will now connect to the Elasticsearch instance and query the catalog information, presenting you with a list of tables that you can load data from: Visualize the data Press on the Add data button and customize your data visualization: Previous MicroStrategy Desktop Next SQuirreL SQL Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Prerequisites Data loading Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"Qlik Sense Desktop | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-client-apps-qlik","meta_description":"You can use the Elasticsearch ODBC driver to access Elasticsearch data from Qlik Sense Desktop. 
Qlik Sense Desktop November 2018 or higher, Elasticsearch..."} +{"text":"Docs Release notes Troubleshoot Reference Reference Get started Solutions and use cases Manage data Explore and analyze Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Security Fields and object schemas Elastic Security ECS field reference Timeline schema Alert schema Endpoint command reference Detection Rules Overview Observability Fields and object schemas Elasticsearch and index management Configuration Circuit breaker settings Auditing settings Enrich settings Cluster-level shard allocation and routing settings Miscellaneous cluster settings Cross-cluster replication settings Discovery and cluster formation settings Field data cache settings Health Diagnostic settings Index lifecycle management settings Data stream lifecycle settings Index management settings Index recovery settings Indexing buffer settings License settings Local gateway Machine learning settings Inference settings Monitoring settings Node settings Networking settings Node query cache settings Search settings Security settings Shard request cache Snapshot and restore settings Transforms settings Thread pool settings Watcher settings JVM settings Roles Elasticsearch privileges Index settings Data tier allocation General History retention Index block Index recovery prioritization Indexing pressure Mapping limit Merge Path Shard allocation Total shards per node Similarity Slow log Sorting Use index sorting to speed up conjunctions Store Preloading data into the file system cache Time series Translog Index lifecycle actions Allocate Delete Force merge Migrate Read only Rollover Downsample Searchable snapshot Set priority Shrink Unfollow Wait for snapshot REST APIs API conventions Common options Compatibility API examples The refresh parameter Optimistic concurrency control Sort search results Paginate search results Retrieve selected fields Search multiple data streams and indices Collapse search results Filter search results Highlighting Retrieve inner hits Search shard routing Searching with query rules Reciprocal rank fusion Retrievers Reindex data stream Create index from source The shard request cache Suggesters Profile search requests Ranking evaluation Mapping Document metadata fields _doc_count field _field_names field _ignored field _id field _index field _meta field _routing field _source field _tier field Field data types Aggregate metric Alias Arrays Binary Boolean Completion Date Date nanoseconds Dense vector Flattened Geopoint Geoshape Histogram IP Join Keyword Nested Numeric Object Pass-through object Percolator Point Range Rank feature Rank features Rank Vectors Search-as-you-type Semantic text Shape Sparse vector Text Token count Unsigned long Version Mapping parameters analyzer coerce copy_to doc_values dynamic eager_global_ordinals enabled format ignore_above index.mapping.ignore_above ignore_malformed index index_options index_phrases index_prefixes meta fields normalizer norms null_value position_increment_gap properties search_analyzer similarity store subobjects term_vector Elasticsearch audit events Command line tools elasticsearch-certgen elasticsearch-certutil elasticsearch-create-enrollment-token elasticsearch-croneval elasticsearch-keystore elasticsearch-node elasticsearch-reconfigure-node elasticsearch-reset-password elasticsearch-saml-metadata elasticsearch-service-tokens elasticsearch-setup-passwords elasticsearch-shard elasticsearch-syskeygen 
elasticsearch-users Curator Curator and index lifecycle management ILM Actions ILM or Curator? ILM and Curator! About Origin Features Command-Line Interface (CLI) Application Program Interface (API) License Site Corrections Contributing Installation pip Installation from source Docker Running Curator Command Line Interface Singleton Command Line Interface Exit Codes Configuration Environment Variables Action File Configuration File Actions Alias Allocation Close Cluster Routing Cold2Frozen Create Index Delete Indices Delete Snapshots Forcemerge Index Settings Open Reindex Replicas Restore Rollover Shrink Snapshot Options allocation_type allow_ilm_indices continue_if_exception copy_aliases count delay delete_after delete_aliases skip_flush disable_action extra_settings ignore_empty_list ignore_unavailable include_aliases include_global_state indices key max_age max_docs max_size max_num_segments max_wait migration_prefix migration_suffix name new_index node_filters number_of_replicas number_of_shards partial post_allocation preserve_existing refresh remote_certificate remote_client_cert remote_client_key remote_filters remote_url_prefix rename_pattern rename_replacement repository requests_per_second request_body retry_count retry_interval routing_type search_pattern setting shrink_node shrink_prefix shrink_suffix slices skip_repo_fs_check timeout timeout_override value wait_for_active_shards wait_for_completion wait_for_rebalance wait_interval warn_if_no_indices Filters filtertype age alias allocated closed count empty forcemerged kibana none opened pattern period space state Filter Elements aliases allocation_type count date_from date_from_format date_to date_to_format direction disk_space epoch exclude field intersect key kind max_num_segments pattern period_type range_from range_to reverse source state stats_result timestring threshold_behavior unit unit_count unit_count_pattern use_age value week_starts_on Examples alias allocation close cluster_routing create_index delete_indices delete_snapshots forcemerge index_settings open reindex replicas restore rollover shrink snapshot Frequently Asked Questions Q: How can I report an error in the documentation? Q: Can I delete only certain data from within indices? Q: Can Curator handle index names with strange characters? 
Clients Eland Installation Data Frames Machine Learning Go Getting started Installation Connecting Typed API Getting started with the API Conventions Running queries Using ES|QL Examples Java Getting started Setup Installation Connecting Using OpenTelemetry API conventions Package structure and namespace clients Method naming conventions Blocking and asynchronous clients Building API objects Lists and maps Variant types Object life cycles and thread safety Creating API objects from JSON data Exceptions Using the Java API client Indexing single documents Bulk: indexing multiple documents Reading documents by id Searching for documents Aggregations ES|QL in the Java client Troubleshooting Missing required property NoSuchMethodError: removeHeader IOReactor errors Serializing without typed keys Could not resolve dependencies NoClassDefFoundError: LogFactory Transport layer REST 5 Client Getting started Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Legacy REST Client Getting started Javadoc Maven Repository Dependencies Shading Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Javadoc Maven Repository Usage Javadoc and source code External resources Breaking changes policy Release highlights License JavaScript Getting started Installation Connecting Configuration Basic configuration Advanced configuration Creating a child client Testing Integrations Observability Transport TypeScript support API Reference Examples asStream Bulk Exists Get Ignore MSearch Scroll Search Suggest transport.request SQL Update Update By Query Reindex Client helpers Timeout best practices .NET Getting started Installation Connecting Configuration Options on ElasticsearchClientSettings Client concepts Serialization Source serialization Using the .NET Client Aggregation examples Using ES|QL CRUD usage examples Custom mapping examples Query examples Usage recommendations Low level Transport example Troubleshoot Logging Logging with OnRequestCompleted Logging with Fiddler Debugging Audit trail Debug information Debug mode PHP Getting started Installation Connecting Configuration Dealing with JSON arrays and objects in PHP Host Configuration Set retries HTTP Meta Data Enabling the Logger Configure the HTTP client Namespaces Node Pool Operations Index management operations Search operations Indexing documents Getting documents Updating documents Deleting documents Client helpers Iterators ES|QL Python Getting started Installation Connecting Configuration Querying Using with asyncio Integrations Using OpenTelemetry ES|QL and Pandas Examples Elasticsearch Python DSL Configuration Tutorials How-To Guides Examples Migrating from the elasticsearch-dsl package Client helpers Ruby Getting started Installation Connecting Configuration Basic configuration Advanced configuration Integrations Transport Elasticsearch API Using OpenTelemetry Elastic Common Schema (ECS) ActiveModel / ActiveRecord Ruby On Rails Persistence Elasticsearch DSL Examples Client helpers Bulk and Scroll helpers ES|QL Troubleshoot Rust Installation Community-contributed clients Elastic Distributions of OpenTelemetry (EDOT) Quickstart Self-managed Kubernetes Hosts / VMs Docker Elastic Cloud Serverless Kubernetes Hosts and VMs Docker Elastic 
Cloud Hosted Kubernetes Hosts and VMs Docker Reference Architecture Kubernetes environments Hosts / VMs environments Use cases Kubernetes observability Prerequisites and compatibility Components description Deployment Instrumenting Applications Upgrade Customization LLM observability Compatibility and support Features Collector distributions SDK Distributions Limitations Nomenclature EDOT Collector Download Configuration Default config (Standalone) Default config (Kubernetes) Configure Logs Collection Configure Metrics Collection Customization Components Custom Collector Troubleshooting EDOT SDKs EDOT .NET Setup ASP.NET Console applications .NET worker services Zero-code instrumentation Opinionated defaults Configuration Supported technologies Troubleshooting Migration EDOT Java Setup Kubernetes Setup Runtime attach Setup Configuration Features Supported Technologies Troubleshooting Migration Performance overhead EDOT Node.js Setup Kubernetes Configuration Supported Technologies Metrics Troubleshooting Migration EDOT PHP Setup Limitations Configuration Supported Technologies Troubleshooting Migration Performance overhead EDOT Python Setup Kubernetes Manual Instrumentation Configuration Supported Technologies Troubleshooting Migration Performance Overhead Ingestion tools Fleet and Elastic Agent Restrictions for Elastic Cloud Serverless Migrate from Beats to Elastic Agent Migrate from Auditbeat to Elastic Agent Deployment models What is Fleet Server? Deploy on Elastic Cloud Deploy on-premises and self-managed Deploy Fleet Server on-premises and Elasticsearch on Cloud Deploy Fleet Server on Kubernetes Fleet Server scalability Fleet Server Secrets Secret files guide Monitor a self-managed Fleet Server Install Elastic Agents Install Fleet-managed Elastic Agents Install standalone Elastic Agents Upgrade standalone Elastic Agents Install Elastic Agents in a containerized environment Run Elastic Agent in a container Run Elastic Agent on Kubernetes managed by Fleet Install Elastic Agent on Kubernetes using Helm Example: Install standalone Elastic Agent on Kubernetes using Helm Example: Install Fleet-managed Elastic Agent on Kubernetes using Helm Advanced Elastic Agent configuration managed by Fleet Configuring Kubernetes metadata enrichment on Elastic Agent Run Elastic Agent on GKE managed by Fleet Configure Elastic Agent Add-On on Amazon EKS Run Elastic Agent on Azure AKS managed by Fleet Run Elastic Agent Standalone on Kubernetes Scaling Elastic Agent on Kubernetes Using a custom ingest pipeline with the Kubernetes Integration Environment variables Run Elastic Agent as an EDOT Collector Transform an installed Elastic Agent to run as an EDOT Collector Run Elastic Agent without administrative privileges Install Elastic Agent from an MSI package Installation layout Air-gapped environments Using a proxy server with Elastic Agent and Fleet When to configure proxy settings Proxy Server connectivity using default host variables Fleet managed Elastic Agent connectivity using a proxy server Standalone Elastic Agent connectivity using a proxy server Set the proxy URL of the Elastic Package Registry Uninstall Elastic Agents from edge hosts Start and stop Elastic Agents on edge hosts Elastic Agent configuration encryption Secure connections Configure SSL/TLS for self-managed Fleet Servers Rotate SSL/TLS CA certificates Elastic Agent deployment models with mutual TLS One-way and mutual TLS certifications flow Configure SSL/TLS for the Logstash output Manage Elastic Agents in Fleet Fleet settings Elasticsearch 
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
applications NuGet packages Entity Framework Core Entity Framework 6 Elasticsearch gRPC SqlClient StackExchange.Redis Azure Cosmos DB Azure Service Bus Azure Storage MongoDB Supported technologies Configuration Configuration on ASP.NET Core Configuration for Windows Services Configuration on ASP.NET Core configuration options Reporter configuration options HTTP configuration options Messaging configuration options Stacktrace configuration options Supportability configuration options All options summary Public API OpenTelemetry bridge Metrics Logs Serilog NLog Manual log correlation Performance tuning Upgrading APM Go agent Set up the APM Go Agent Built-in instrumentation modules Custom instrumentation Context propagation Supported technologies Configuration API documentation Metrics Logs Log correlation OpenTelemetry API OpenTracing API Contributing Upgrading APM iOS agent Supported technologies Set up the APM iOS Agent Configuration Instrumentation APM Java agent Set up the APM Java Agent Manual setup with -javaagent flag Automatic setup with apm-agent-attach-cli.jar Programmatic API setup to self-attach SSL/TLS communication with APM Server Monitoring AWS Lambda Java Functions Supported technologies Configuration Circuit-Breaker Core Datastore HTTP Huge Traces JAX-RS JMX Logging Messaging Metrics Profiling Reporter Serverless Stacktrace Property file reference Tracing APIs Public API OpenTelemetry bridge OpenTracing bridge Plugin API Metrics Logs How to find slow methods Sampling-based profiler API/Code Annotations Configuration-based Overhead and performance tuning Frequently asked questions Community plugins Upgrading APM Node.js agent Set up the Agent Monitoring AWS Lambda Node.js Functions Monitoring Node.js Azure Functions Get started with Express Get started with Fastify Get started with hapi Get started with Koa Get started with Next.js Get started with Restify Get started with TypeScript Get started with a custom Node.js stack Starting the agent Supported technologies Configuration Configuring the agent Configuration options Custom transactions Custom spans API Reference Agent API Transaction API Span API Metrics Logs OpenTelemetry bridge OpenTracing bridge Source map support ECMAScript module support Distributed tracing Message queues Performance Tuning Upgrading Upgrade to v4.x Upgrade to v3.x Upgrade to v2.x Upgrade to v1.x APM PHP agent Set up the APM PHP Agent Supported technologies Configuration Configuration reference Public API APM Python agent Set up the APM Python Agent Django support Flask support Aiohttp Server support Tornado Support Starlette/FastAPI Support Sanic Support Monitoring AWS Lambda Python Functions Monitoring Azure Functions Wrapper Support ASGI Middleware Supported technologies Configuration Advanced topics Instrumenting custom code Sanitizing data How the Agent works Run Tests Locally API reference Metrics OpenTelemetry API Bridge Logs Performance tuning Upgrading Upgrading to version 6 of the agent Upgrading to version 5 of the agent Upgrading to version 4 of the agent APM Ruby agent Set up the APM Ruby agent Getting started with Rails Getting started with Rack Supported technologies Configuration Advanced topics Adding additional context Custom instrumentation API reference Metrics Logs OpenTracing API GraphQL Performance tuning Upgrading APM RUM JavaScript agent Set up the APM Real User Monitoring JavaScript Agent Install the Agent Configure CORS Supported technologies Configuration API reference Agent API Transaction API Span API Source maps 
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Secrets keystore Command reference Repositories for APT and YUM Run Filebeat on Docker Run Filebeat on Kubernetes Run Filebeat on Cloud Foundry Filebeat and systemd Start Filebeat Stop Filebeat Upgrade How Filebeat works Configure Inputs Multiline messages AWS CloudWatch AWS S3 Azure Event Hub Azure Blob Storage Benchmark CEL Cloud Foundry CometD Container Entity Analytics ETW filestream GCP Pub/Sub Google Cloud Storage HTTP Endpoint HTTP JSON journald Kafka Log MQTT NetFlow Office 365 Management Activity API Redis Salesforce Stdin Streaming Syslog TCP UDP Unified Logs Unix winlog Modules Override input settings General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append cache community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags filebeat.reference.yml How to guides Override configuration settings Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Enrich events with geoIP information Deduplicate data Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Migrate log or container input configurations to filestream How to choose file identity for filestream Migrating from a Deprecated Filebeat Module Modules Modules ActiveMQ module Apache module Auditd module AWS module AWS Fargate module Azure module CEF module Check Point module Cisco module CoreDNS module CrowdStrike module Cyberark PAS module Elasticsearch module Envoyproxy Module Fortinet module Google Cloud module Google Workspace module HAproxy module IBM MQ module Icinga module IIS module Iptables module Juniper module Kafka module Kibana module Logstash module Microsoft module MISP module MongoDB module MSSQL module MySQL module MySQL Enterprise module NATS module NetFlow module Nginx module Office 365 module Okta module Oracle module Osquery module Palo Alto Networks module pensando module PostgreSQL module RabbitMQ module Redis module Salesforce module Set up the OAuth App in the Salesforce Santa module Snyk module Sophos module Suricata module System module Threat Intel module Traefik module Zeek (Bro) Module ZooKeeper module Zoom module Exported fields ActiveMQ fields Apache fields Auditd fields AWS fields AWS CloudWatch fields AWS Fargate fields Azure fields Beat fields Decode CEF processor fields fields CEF fields Checkpoint fields Cisco fields Cloud provider metadata fields Coredns fields Crowdstrike fields CyberArk PAS fields Docker fields ECS fields Elasticsearch fields Envoyproxy fields Fortinet fields 
Google Cloud Platform (GCP) fields google_workspace fields HAProxy fields Host fields ibmmq fields Icinga fields IIS fields iptables fields Jolokia Discovery autodiscover provider fields Juniper JUNOS fields Kafka fields kibana fields Kubernetes fields Log file content fields logstash fields Lumberjack fields Microsoft fields MISP fields mongodb fields mssql fields MySQL fields MySQL Enterprise fields NATS fields NetFlow fields Nginx fields Office 365 fields Okta fields Oracle fields Osquery fields panw fields Pensando fields PostgreSQL fields Process fields RabbitMQ fields Redis fields s3 fields Salesforce fields Google Santa fields Snyk fields sophos fields Suricata fields System fields threatintel fields Traefik fields Windows ETW fields Zeek fields ZooKeeper fields Zoom fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Error extracting container id while using Kubernetes metadata Can't read log files from network volumes Filebeat isn't collecting lines from a file Too many open file handlers Registry file is too large Inode reuse causes Filebeat to skip lines Log rotation results in lost or duplicate events Open file handlers cause issues with Windows file rotation Filebeat is using too much CPU Dashboard in Kibana is breaking up data fields incorrectly Fields are not indexed or usable in Kibana visualizations Filebeat isn't shipping the last line of a file Filebeat keeps open file handlers of deleted files for a long time Filebeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Heartbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Heartbeat on Docker Running Heartbeat on Kubernetes Heartbeat and systemd Stop Heartbeat Configure Monitors Common monitor options ICMP options TCP options HTTP options Task scheduler General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based 
autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags heartbeat.reference.yml How to guides Add observer and geo metadata Load the Elasticsearch index template Change the index name Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields Beat fields Synthetics browser metrics fields Cloud provider metadata fields Common heartbeat monitor fields Docker fields ECS fields Host fields HTTP monitor fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields Host lookup fields APM Service fields SOCKS5 proxy fields Monitor state fields Monitor summary fields Synthetics types fields TCP layer fields TLS encryption layer fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems Heartbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected High RSS memory usage due to MADV settings Contribute Metricbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Metricbeat on Docker Run Metricbeat on Kubernetes Run Metricbeat on Cloud Foundry Metricbeat and systemd Start Metricbeat Stop Metricbeat Upgrade Metricbeat How Metricbeat works Event structure Error event structure Key metricbeat features Configure Modules General settings Project paths Config file loading Live reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog translate_ldap_attribute translate_sid truncate_fields urldecode Autodiscover Hints based autodiscover Advanced usage Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags metricbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules ActiveMQ module ActiveMQ broker metricset ActiveMQ queue 
metricset ActiveMQ topic metricset Aerospike module Aerospike namespace metricset Airflow module Airflow statsd metricset Apache module Apache status metricset AWS module AWS awshealth metricset AWS billing metricset AWS cloudwatch metricset AWS dynamodb metricset AWS ebs metricset AWS ec2 metricset AWS elb metricset AWS kinesis metricset AWS lambda metricset AWS natgateway metricset AWS rds metricset AWS s3_daily_storage metricset AWS s3_request metricset AWS sns metricset AWS sqs metricset AWS transitgateway metricset AWS usage metricset AWS vpn metricset AWS Fargate module AWS Fargate task_stats metricset Azure module Azure app_insights metricset Azure app_state metricset Azure billing metricset Azure compute_vm metricset Azure compute_vm_scaleset metricset Azure container_instance metricset Azure container_registry metricset Azure container_service metricset Azure database_account metricset Azure monitor metricset Azure storage metricset Beat module Beat state metricset Beat stats metricset Benchmark module Benchmark info metricset Ceph module Ceph cluster_disk metricset Ceph cluster_health metricset Ceph cluster_status metricset Ceph mgr_cluster_disk metricset Ceph mgr_cluster_health metricset Ceph mgr_osd_perf metricset Ceph mgr_osd_pool_stats metricset Ceph mgr_osd_tree metricset Ceph mgr_pool_disk metricset Ceph monitor_health metricset Ceph osd_df metricset Ceph osd_tree metricset Ceph pool_disk metricset Cloudfoundry module Cloudfoundry container metricset Cloudfoundry counter metricset Cloudfoundry value metricset CockroachDB module CockroachDB status metricset Consul module Consul agent metricset Containerd module Containerd blkio metricset Containerd cpu metricset Containerd memory metricset Coredns module Coredns stats metricset Couchbase module Couchbase bucket metricset Couchbase cluster metricset Couchbase node metricset CouchDB module CouchDB server metricset Docker module Docker container metricset Docker cpu metricset Docker diskio metricset Docker event metricset Docker healthcheck metricset Docker image metricset Docker info metricset Docker memory metricset Docker network metricset Docker network_summary metricset Dropwizard module Dropwizard collector metricset Elasticsearch module Elasticsearch ccr metricset Elasticsearch cluster_stats metricset Elasticsearch enrich metricset Elasticsearch index metricset Elasticsearch index_recovery metricset Elasticsearch index_summary metricset Elasticsearch ingest_pipeline metricset Elasticsearch ml_job metricset Elasticsearch node metricset Elasticsearch node_stats metricset Elasticsearch pending_tasks metricset Elasticsearch shard metricset Envoyproxy module Envoyproxy server metricset Etcd module Etcd leader metricset Etcd metrics metricset Etcd self metricset Etcd store metricset Google Cloud Platform module Google Cloud Platform billing metricset Google Cloud Platform carbon metricset Google Cloud Platform compute metricset Google Cloud Platform dataproc metricset Google Cloud Platform firestore metricset Google Cloud Platform gke metricset Google Cloud Platform loadbalancing metricset Google Cloud Platform metrics metricset Google Cloud Platform pubsub metricset Google Cloud Platform storage metricset Golang module Golang expvar metricset Golang heap metricset Graphite module Graphite server metricset HAProxy module HAProxy info metricset HAProxy stat metricset HTTP module HTTP json metricset HTTP server metricset IBM MQ module IBM MQ qmgr metricset IIS module IIS application_pool metricset IIS webserver metricset IIS 
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Suggester examples The suggest feature suggests similar looking terms based on a provided text by using a suggester. The suggest request part is defined alongside the query part in a _search request. If the query part is left out, only suggestions are returned. Note For the most up-to-date details, refer to Search APIs . Several suggestions can be specified per request. Each suggestion is identified with an arbitrary name. In the example below two suggestions are requested. Both my-suggest-1 and my-suggest-2 suggestions use the term suggester, but have a different text . POST _search { \"suggest\": { \"my-suggest-1\" : { \"text\" : \"tring out Elasticsearch\", \"term\" : { \"field\" : \"message\" } }, \"my-suggest-2\" : { \"text\" : \"kmichy\", \"term\" : { \"field\" : \"user.id\" } } } } The following suggest response example includes the suggestion response for my-suggest-1 and my-suggest-2 . Each suggestion part contains entries.
Each entry is effectively a token from the suggest text and contains the suggestion entry text, the original start offset and length in the suggest text and, if found, an arbitrary number of options. { \"_shards\": ... \"hits\": ... \"took\": 2, \"timed_out\": false, \"suggest\": { \"my-suggest-1\": [ { \"text\": \"tring\", \"offset\": 0, \"length\": 5, \"options\": [ {\"text\": \"trying\", \"score\": 0.8, \"freq\": 1 } ] }, { \"text\": \"out\", \"offset\": 6, \"length\": 3, \"options\": [] }, { \"text\": \"elasticsearch\", \"offset\": 10, \"length\": 13, \"options\": [] } ], \"my-suggest-2\": ... } } Each options array contains an option object that includes the suggested text, its document frequency, and its score compared to the suggest entry text. The meaning of the score depends on the suggester used. The term suggester's score is based on the edit distance. Global suggest text To avoid repetition of the suggest text, it is possible to define a global text. In the following example the suggest text is defined globally and applies to the my-suggest-1 and my-suggest-2 suggestions. POST _search { \"suggest\": { \"text\" : \"tring out Elasticsearch\", \"my-suggest-1\" : { \"term\" : { \"field\" : \"message\" } }, \"my-suggest-2\" : { \"term\" : { \"field\" : \"user\" } } } } In the example above, the suggest text can also be specified as a suggestion-specific option. A suggest text specified at the suggestion level overrides the suggest text at the global level. Term suggester The term suggester suggests terms based on edit distance. The provided suggest text is analyzed before terms are suggested. The suggested terms are provided per analyzed suggest text token. The term suggester doesn't take the query that is part of the request into account. Common suggest options include: text The suggest text. The suggest text is a required option that needs to be set globally or per suggestion. field The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion. analyzer The analyzer to analyse the suggest text with. Defaults to the search analyzer of the suggest field. size The maximum number of corrections to be returned per suggest text token. sort Defines how suggestions should be sorted per suggest text term. Two possible values: score : Sort by score first, then document frequency, and then the term itself. frequency : Sort by document frequency first, then similarity score, and then the term itself. suggest_mode Controls which suggestions are included, or for which suggest text terms suggestions should be generated. Three possible values can be specified: missing : Only provide suggestions for suggest text terms that are not in the index (default). popular : Only suggest terms that occur in more docs than the original suggest text term. always : Suggest any matching suggestions based on terms in the suggest text.
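For instance, these options can be combined in a single term suggestion. The following sketch reuses the message field and suggest text from the earlier examples; the size, sort, and suggest_mode values are illustrative choices, not defaults: POST _search { \"suggest\": { \"my-suggest-1\" : { \"text\" : \"tring out Elasticsearch\", \"term\" : { \"field\" : \"message\", \"size\" : 3, \"sort\" : \"frequency\", \"suggest_mode\" : \"popular\" } } } } With suggest_mode set to popular, only terms that appear in more documents than tring itself are suggested, sorted by document frequency.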
Phrase suggester The term suggester provides a very convenient API to access word alternatives on a per-token basis within a certain string distance. The API allows accessing each token in the stream individually, while suggest-selection is left to the API consumer. Yet, pre-selected suggestions are often required in order to present them to the end user. The phrase suggester adds additional logic on top of the term suggester to select entire corrected phrases instead of individual tokens, weighted based on n-gram language models. In practice, this suggester is able to make better decisions about which tokens to pick based on co-occurrence and frequencies. In general, the phrase suggester requires a special mapping up front to work. The phrase suggester examples on this page need the following mapping to work. The reverse analyzer is used only in the last example. PUT test { \"settings\": { \"index\": { \"number_of_shards\": 1, \"analysis\": { \"analyzer\": { \"trigram\": { \"type\": \"custom\", \"tokenizer\": \"standard\", \"filter\": [\"lowercase\",\"shingle\"] }, \"reverse\": { \"type\": \"custom\", \"tokenizer\": \"standard\", \"filter\": [\"lowercase\",\"reverse\"] } }, \"filter\": { \"shingle\": { \"type\": \"shingle\", \"min_shingle_size\": 2, \"max_shingle_size\": 3 } } } } }, \"mappings\": { \"properties\": { \"title\": { \"type\": \"text\", \"fields\": { \"trigram\": { \"type\": \"text\", \"analyzer\": \"trigram\" }, \"reverse\": { \"type\": \"text\", \"analyzer\": \"reverse\" } } } } } } POST test/_doc?refresh=true {\"title\": \"noble warriors\"} POST test/_doc?refresh=true {\"title\": \"nobel prize\"} Once you have the analyzers and mappings set up, you can use the phrase suggester in the same spot you'd use the term suggester: POST test/_search { \"suggest\": { \"text\": \"noble prize\", \"simple_phrase\": { \"phrase\": { \"field\": \"title.trigram\", \"size\": 1, \"gram_size\": 3, \"direct_generator\": [ { \"field\": \"title.trigram\", \"suggest_mode\": \"always\" } ], \"highlight\": { \"pre_tag\": \"<em>\", \"post_tag\": \"</em>\" } } } } } The response contains suggestions scored by the most likely spelling correction first. In this case we received the expected correction \"nobel prize\". { \"_shards\": ... \"hits\": ... \"timed_out\": false, \"took\": 3, \"suggest\": { \"simple_phrase\" : [ { \"text\" : \"noble prize\", \"offset\" : 0, \"length\" : 11, \"options\" : [ { \"text\" : \"nobel prize\", \"highlighted\": \"<em>nobel</em> prize\", \"score\" : 0.48614594 }] } ] } } Basic phrase suggest API parameters include: field The name of the field used to do n-gram lookups for the language model; the suggester will use this field to gain statistics to score corrections. This field is mandatory. gram_size Sets the max size of the n-grams (shingles) in the field . If the field doesn't contain n-grams (shingles), this should be omitted or set to 1 . Note that Elasticsearch tries to detect the gram size based on the specified field . If the field uses a shingle filter, the gram_size is set to the max_shingle_size if not explicitly set. real_word_error_likelihood The likelihood of a term being misspelled even if the term exists in the dictionary. The default is 0.95 , meaning 5% of the real words are misspelled. confidence The confidence level defines a factor applied to the input phrase's score which is used as a threshold for other suggest candidates. Only candidates that score higher than the threshold will be included in the result. For instance, a confidence level of 1.0 will only return suggestions that score higher than the input phrase. If set to 0.0 the top N candidates are returned. The default is 1.0 . max_errors The maximum percentage of the terms considered to be misspellings in order to form a correction. This method accepts a float value in the range [0..1) as a fraction of the actual query terms or a number >=1 as an absolute number of query terms. The default is set to 1.0 , meaning only corrections with at most one misspelled term are returned. Note that setting this too high can negatively impact performance.
Low values like 1 or 2 are recommended; otherwise the time spent in suggest calls might exceed the time spent in query execution. separator The separator that is used to separate terms in the bigram field. If not set, the whitespace character is used as a separator. size The number of candidates that are generated for each individual query term. Low numbers like 3 or 5 typically produce good results. Raising this can bring up terms with higher edit distances. The default is 5 . analyzer Sets the analyzer used to analyze the suggest text. Defaults to the search analyzer of the suggest field passed via field . shard_size Sets the maximum number of suggested terms to be retrieved from each individual shard. During the reduce phase, only the top N suggestions are returned based on the size option. Defaults to 5 . text Sets the text / query to provide suggestions for. highlight Sets up suggestion highlighting. If not provided, then no highlighted field is returned. If provided, it must contain exactly pre_tag and post_tag , which are wrapped around the changed tokens. If multiple tokens in a row are changed, the entire phrase of changed tokens is wrapped rather than each token. collate Checks each suggestion against the specified query to prune suggestions for which no matching docs exist in the index. The collate query for a suggestion is run only on the local shard from which the suggestion has been generated. The query must be specified and it can be templated. Refer to Search templates . The current suggestion is automatically made available as the {{suggestion}} variable, which should be used in your query. You can still specify your own template params; the suggestion value will be added to the variables you specify. Additionally, you can specify a prune option to control whether all phrase suggestions are returned; when set to true , the suggestions will have an additional option collate_match , which will be true if matching documents for the phrase were found, false otherwise. The default value for prune is false . POST test/_search { \"suggest\": { \"text\" : \"noble prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"direct_generator\" : [ { \"field\" : \"title.trigram\", \"suggest_mode\" : \"always\", \"min_word_length\" : 1 } ], \"collate\": { \"query\": { \"source\" : { \"match\": { \"{{field_name}}\" : \"{{suggestion}}\" } } }, \"params\": {\"field_name\" : \"title\"}, \"prune\": true } } } } } This query will be run once for every suggestion. The {{suggestion}} variable will be replaced by the text of each suggestion. An additional field_name variable has been specified in params and is used by the match query. All suggestions will be returned with an extra collate_match option indicating whether the generated phrase matched any document. Smoothing models The phrase suggester supports multiple smoothing models to balance the weight between infrequent grams (grams (shingles) that do not exist in the index) and frequent grams (those that appear at least once in the index). The smoothing model can be selected by setting the smoothing parameter to one of the following options. Each smoothing model supports specific properties that can be configured. stupid_backoff A simple backoff model that backs off to lower-order n-gram models if the higher-order count is 0 and discounts the lower-order n-gram model by a constant factor. The default discount is 0.4 . Stupid Backoff is the default model.
laplace A smoothing model that uses additive smoothing where a constant (typically 1.0 or smaller) is added to all counts to balance weights. The default alpha is 0.5. linear_interpolation A smoothing model that takes the weighted mean of the unigrams, bigrams, and trigrams based on user-supplied weights (lambdas). Linear Interpolation doesn't have any default values. All parameters ( trigram_lambda, bigram_lambda, unigram_lambda ) must be supplied. POST test/_search { \"suggest\": { \"text\" : \"obel prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"smoothing\" : { \"laplace\" : { \"alpha\" : 0.7 } } } } } } Candidate generators The phrase suggester uses candidate generators to produce a list of possible terms per term in the given text. A single candidate generator is similar to a term suggester called for each individual term in the text. The output of the generators is subsequently scored in combination with the candidates from the other terms to produce suggestion candidates. Currently only one type of candidate generator is supported, the direct_generator. The phrase suggest API accepts a list of generators under the key direct_generator; each of the generators in the list is called per term in the original text. Direct generators The parameters that direct generators support include: field The field to fetch the candidate suggestions from. This is a required option that either needs to be set globally or per suggestion. size The maximum corrections to be returned per suggest text token. suggest_mode The suggest mode controls what suggestions are included in the suggestions generated on each shard. All values other than always can be thought of as an optimization to generate fewer suggestions to test on each shard and are not rechecked when combining the suggestions generated on each shard. Thus missing will generate suggestions for terms on shards that do not contain them even if other shards do contain them. Those should be filtered out using confidence. Three possible values can be specified: missing : Only generate suggestions for terms that are not in the shard. This is the default. popular : Only suggest terms that occur in more docs on the shard than the original term. always : Suggest any matching suggestions based on terms in the suggest text. max_edits The maximum edit distance candidate suggestions can have in order to be considered as a suggestion. Can only be a value between 1 and 2. Any other value results in a bad request error being thrown. Defaults to 2. prefix_length The minimal number of prefix characters that must match in order to be a candidate suggestion. Defaults to 1. Increasing this number improves spellcheck performance. Usually misspellings don't occur at the beginning of terms. min_word_length The minimum length a suggest text term must have in order to be included. Defaults to 4. max_inspections A factor that is multiplied with the shard_size in order to inspect more candidate spelling corrections on the shard level. Can improve accuracy at the cost of performance. Defaults to 5. min_doc_freq The minimal threshold in number of documents a suggestion should appear in. This can be specified as an absolute number or as a relative percentage of the number of documents. This can improve quality by only suggesting high frequency terms. Defaults to 0f and is not enabled. If a value higher than 1 is specified, then the number cannot be fractional. The shard level document frequencies are used for this option.
max_term_freq The maximum threshold in number of documents in which a suggest text token can exist in order to be included. Can be a relative percentage number (e.g., 0.4) or an absolute number to represent document frequencies. If a value higher than 1 is specified, then the value cannot be fractional. Defaults to 0.01f. This can be used to exclude high frequency terms — which are usually spelled correctly — from being spellchecked. This also improves the spellcheck performance. The shard level document frequencies are used for this option. pre_filter A filter (analyzer) that is applied to each of the tokens passed to this candidate generator. This filter is applied to the original token before candidates are generated. post_filter A filter (analyzer) that is applied to each of the generated tokens before they are passed to the actual phrase scorer. The following example shows a phrase suggest call with two generators: the first one uses a field containing ordinary indexed terms, and the second one uses a field whose terms are indexed with a reverse filter (tokens are indexed in reverse order). This is used to overcome the limitation of the direct generators, which require a constant prefix to provide high-performance suggestions. The pre_filter and post_filter options accept ordinary analyzer names. POST test/_search { \"suggest\": { \"text\" : \"obel prize\", \"simple_phrase\" : { \"phrase\" : { \"field\" : \"title.trigram\", \"size\" : 1, \"direct_generator\" : [ { \"field\" : \"title.trigram\", \"suggest_mode\" : \"always\" }, { \"field\" : \"title.reverse\", \"suggest_mode\" : \"always\", \"pre_filter\" : \"reverse\", \"post_filter\" : \"reverse\" } ] } } } } pre_filter and post_filter can also be used to inject synonyms after candidates are generated. For instance, for the query captain usq we might generate a candidate usa for the term usq, which is a synonym for america. This allows us to present captain america to the user if this phrase scores high enough. Completion suggester The completion suggester provides auto-complete/search-as-you-type functionality. This is a navigational feature to guide users to relevant results as they are typing, improving search precision. It is not meant for spell correction or did-you-mean functionality like the term or phrase suggesters. Ideally, auto-complete functionality should be as fast as a user types to provide instant feedback relevant to what a user has already typed in. Hence, the completion suggester is optimized for speed. The suggester uses data structures that enable fast lookups, but are costly to build and are stored in memory. Mapping To use the completion suggester, map the field from which you want to generate suggestions as type completion. This indexes the field values for fast completions. PUT music { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\" } } } } The parameters that are accepted by completion fields include: analyzer The index analyzer to use, defaults to simple. search_analyzer The search analyzer to use, defaults to the value of analyzer. preserve_separators Preserves the separators, defaults to true. If disabled, you could find a field starting with Foo Fighters, if you suggest for foof. preserve_position_increments Enables position increments, defaults to true. If disabled and using a stopwords analyzer, you could get a field starting with The Beatles, if you suggest for b.
Note: You could also achieve this by indexing two inputs, Beatles and The Beatles; there is no need to change a simple analyzer if you are able to enrich your data. max_input_length Limits the length of a single input, defaults to 50 UTF-16 code points. This limit is only used at index time to reduce the total number of characters per input string in order to prevent massive inputs from bloating the underlying data structure. Most use cases won't be influenced by the default value since prefix completions seldom grow beyond prefixes longer than a handful of characters. Indexing You index suggestions like any other field. A suggestion is made of an input and an optional weight attribute. An input is the expected text to be matched by a suggestion query and the weight determines how the suggestions will be scored. Indexing a suggestion is as follows: PUT music/_doc/1?refresh { \"suggest\" : { \"input\": [ \"Nevermind\", \"Nirvana\" ], \"weight\" : 34 } } The supported parameters include: input The input to store; this can be an array of strings or just a string. This field is mandatory. Note This value cannot contain the following UTF-16 control characters: \\u0000 (null) \\u001f (information separator one) \\u001e (information separator two) weight A positive integer or a string containing a positive integer, which defines a weight and allows you to rank your suggestions. This field is optional. You can index multiple suggestions for a document as follows: PUT music/_doc/1?refresh { \"suggest\": [ { \"input\": \"Nevermind\", \"weight\": 10 }, { \"input\": \"Nirvana\", \"weight\": 3 } ] } You can use the following shorthand form. Note that you cannot specify a weight with suggestion(s) in the shorthand form. PUT music/_doc/1?refresh { \"suggest\" : [ \"Nevermind\", \"Nirvana\" ] } Querying Suggesting works as usual, except that you have to specify the suggest type as completion. Suggestions are near real-time, which means new suggestions can be made visible by refresh and documents once deleted are never shown. This request: POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nir\", \"completion\": { \"field\": \"suggest\" } } } } Prefix used to search for suggestions Type of suggestions Name of the field to search for suggestions in It returns this response: { \"_shards\" : { \"total\" : 1, \"successful\" : 1, \"skipped\" : 0, \"failed\" : 0 }, \"hits\": ... \"took\": 2, \"timed_out\": false, \"suggest\": { \"song-suggest\" : [ { \"text\" : \"nir\", \"offset\" : 0, \"length\" : 3, \"options\" : [ { \"text\" : \"Nirvana\", \"_index\": \"music\", \"_id\": \"1\", \"_score\": 1.0, \"_source\": { \"suggest\": [\"Nevermind\", \"Nirvana\"] } } ] } ] } } Important: The _source metadata field must be enabled, which is the default behavior, to enable returning _source with suggestions. The configured weight for a suggestion is returned as _score. The text field uses the input of your indexed suggestion. Suggestions return the full document _source by default. The size of the _source can impact performance due to disk fetch and network transport overhead. To save some network overhead, filter out unnecessary fields from the _source using source filtering to minimize _source size.
Note that the _suggest endpoint doesn't support source filtering but using suggest on the _search endpoint does: POST music/_search { \"_source\": \"suggest\", \"suggest\": { \"song-suggest\": { \"prefix\": \"nir\", \"completion\": { \"field\": \"suggest\", \"size\": 5 } } } } Filter the source to return only the suggest field Name of the field to search for suggestions in Number of suggestions to return Which should look like: { \"took\": 6, \"timed_out\": false, \"_shards\": { \"total\": 1, \"successful\": 1, \"skipped\": 0, \"failed\": 0 }, \"hits\": { \"total\": { \"value\": 0, \"relation\": \"eq\" }, \"max_score\": null, \"hits\": [] }, \"suggest\": { \"song-suggest\": [ { \"text\": \"nir\", \"offset\": 0, \"length\": 3, \"options\": [ { \"text\": \"Nirvana\", \"_index\": \"music\", \"_id\": \"1\", \"_score\": 1.0, \"_source\": { \"suggest\": [ \"Nevermind\", \"Nirvana\" ] } } ] } ] } } The supported parameters for a basic completion suggester query include: field The name of the field on which to run the query (required). size The number of suggestions to return (defaults to 5). skip_duplicates Whether duplicate suggestions should be filtered out (defaults to false). Note The completion suggester considers all documents in the index. See Context suggester for an explanation of how to query a subset of documents instead. Note In case of completion queries spanning more than one shard, the suggest is executed in two phases, where the last phase fetches the relevant documents from shards. This implies that executing completion requests against a single shard is more performant, due to the document fetch overhead when the suggest spans multiple shards. To get the best performance for completions, it is recommended to index completions into a single-shard index. In case of high heap usage due to shard size, it is still recommended to break the index into multiple shards instead of optimizing for completion performance. Skip duplicate suggestions Queries can return duplicate suggestions coming from different documents. It is possible to modify this behavior by setting skip_duplicates to true. When set, this option filters out documents with duplicate suggestions from the result. POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nor\", \"completion\": { \"field\": \"suggest\", \"skip_duplicates\": true } } } } Warning When set to true, this option can slow down search because more suggestions need to be visited to find the top N. Fuzzy queries The completion suggester also supports fuzzy queries — this means you can have a typo in your search and still get results back. POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nor\", \"completion\": { \"field\": \"suggest\", \"fuzzy\": { \"fuzziness\": 2 } } } } } Suggestions that share the longest prefix with the query prefix will be scored higher. The fuzzy query can take specific fuzzy parameters. For example: fuzziness The fuzziness factor, defaults to AUTO. See Fuzziness for allowed settings. transpositions If set to true, transpositions are counted as one change instead of two. Defaults to true. min_length Minimum length of the input before fuzzy suggestions are returned. Defaults to 3. prefix_length Minimum length of the input, which is not checked for fuzzy alternatives. Defaults to 1. unicode_aware If true, all measurements (like fuzzy edit distance, transpositions, and lengths) are measured in Unicode code points instead of in bytes. This is slightly slower than raw bytes, so it is set to false by default.
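For illustration, a sketch against the same music index that sets several of these fuzzy parameters explicitly (the chosen values are assumptions, not recommendations) might look like: POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"prefix\": \"nor\", \"completion\": { \"field\": \"suggest\", \"fuzzy\": { \"fuzziness\": 1, \"transpositions\": true, \"min_length\": 3, \"prefix_length\": 1, \"unicode_aware\": false } } } } }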
Note If you want to stick with the default values, but still use fuzzy, you can either use fuzzy: {} or fuzzy: true. Regex queries The completion suggester also supports regex queries, meaning you can express a prefix as a regular expression: POST music/_search?pretty { \"suggest\": { \"song-suggest\": { \"regex\": \"n[ever|i]r\", \"completion\": { \"field\": \"suggest\" } } } } The regex query can take specific regex parameters. For example: flags Possible flags are ALL (default), ANYSTRING, COMPLEMENT, EMPTY, INTERSECTION, INTERVAL, or NONE. See regexp-syntax for their meaning. max_determinized_states Regular expressions are dangerous because it's easy to accidentally create an innocuous-looking one that requires an exponential number of internal determinized automaton states (and corresponding RAM and CPU) for Lucene to execute. Lucene prevents these using the max_determinized_states setting (defaults to 10000). You can raise this limit to allow more complex regular expressions to execute. Context suggester The completion suggester considers all documents in the index, but it is often desirable to serve suggestions filtered and/or boosted by some criteria. For example, you want to suggest song titles filtered by certain artists or you want to boost song titles based on their genre. To achieve suggestion filtering and/or boosting, you can add context mappings while configuring a completion field. You can define multiple context mappings for a completion field. Every context mapping has a unique name and a type. There are two types: category and geo. Context mappings are configured under the contexts parameter in the field mapping. Note It is mandatory to provide a context when indexing and querying a context-enabled completion field. The maximum allowed number of completion field context mappings is 10. The following example defines two indices, each with two context mappings for a completion field: PUT place { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\", \"contexts\": [ { \"name\": \"place_type\", \"type\": \"category\" }, { \"name\": \"location\", \"type\": \"geo\", \"precision\": 4 } ] } } } } PUT place_path_category { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\", \"contexts\": [ { \"name\": \"place_type\", \"type\": \"category\", \"path\": \"cat\" }, { \"name\": \"location\", \"type\": \"geo\", \"precision\": 4, \"path\": \"loc\" } ] }, \"loc\": { \"type\": \"geo_point\" } } } } Defines a category context named place_type where the categories must be sent with the suggestions. Defines a geo context named location where the categories must be sent with the suggestions. Defines a category context named place_type where the categories are read from the cat field. Defines a geo context named location where the categories are read from the loc field. Note Adding context mappings increases the index size for the completion field. The completion index is entirely heap resident; you can monitor the completion field index size using index statistics. Category context The category context allows you to associate one or more categories with suggestions at index time. At query time, suggestions can be filtered and boosted by their associated categories. The mappings are set up like the place_type fields above.
If path is defined, then the categories are read from that path in the document, otherwise they must be sent in the suggest field like this: PUT place/_doc/1 { \"suggest\": { \"input\": [ \"timmy's\", \"starbucks\", \"dunkin donuts\" ], \"contexts\": { \"place_type\": [ \"cafe\", \"food\" ] } } } These suggestions will be associated with the cafe and food categories. If the mapping had a path then the following index request would be enough to add the categories: PUT place_path_category/_doc/1 { \"suggest\": [\"timmy's\", \"starbucks\", \"dunkin donuts\"], \"cat\": [\"cafe\", \"food\"] } These suggestions will be associated with the cafe and food categories. Note If the context mapping references another field and the categories are explicitly indexed, the suggestions are indexed with both sets of categories. Category query Suggestions can be filtered by one or more categories. The following filters suggestions by multiple categories: POST place/_search?pretty { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"place_type\": [ \"cafe\", \"restaurants\" ] } } } } } Note If multiple categories or category contexts are set on the query, they are merged as a disjunction. This means that suggestions match if they contain at least one of the provided context values. Suggestions with certain categories can be boosted higher than others. The following example filters suggestions by categories and additionally boosts suggestions associated with some categories: POST place/_search?pretty { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"place_type\": [ { \"context\": \"cafe\" }, { \"context\": \"restaurants\", \"boost\": 2 } ] } } } } } The context query filters suggestions associated with the categories cafe and restaurants, and boosts the suggestions associated with restaurants by a factor of 2. In addition to accepting category values, a context query can be composed of multiple category context clauses. The parameters that are supported for a category context clause include: context The value of the category to filter/boost on. This is mandatory. boost The factor by which the score of the suggestion should be boosted; the score is computed by multiplying the boost with the suggestion weight. Defaults to 1. prefix Whether the category value should be treated as a prefix or not. For example, if set to true, you can filter categories of type1, type2, and so on by specifying a category prefix of type. Defaults to false. Note If a suggestion entry matches multiple contexts, the final score is computed as the maximum score produced by any matching contexts. Geo location context A geo context allows you to associate one or more geo points or geohashes with suggestions at index time. At query time, suggestions can be filtered and boosted if they are within a certain distance of a specified geo location. Internally, geo points are encoded as geohashes with the specified precision. Geo mapping In addition to the path setting, geo context mapping accepts settings such as: precision This defines the precision of the geohash to be indexed and can be specified as a distance value (5m, 10km, etc.), or as a raw geohash precision (1..12). Defaults to a raw geohash precision value of 6. Note The index time precision setting sets the maximum geohash precision that can be used at query time.
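As an illustrative sketch (the index name place_distance is hypothetical), the precision can also be declared as a distance value instead of a raw geohash level: PUT place_distance { \"mappings\": { \"properties\": { \"suggest\": { \"type\": \"completion\", \"contexts\": [ { \"name\": \"location\", \"type\": \"geo\", \"precision\": \"10km\" } ] } } } }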
Indexing geo contexts Geo contexts can be explicitly set with suggestions or be indexed from a geo point field in the document via the path parameter, similar to category contexts. Associating multiple geo location contexts with a suggestion will index the suggestion for every geo location. The following indexes a suggestion with two geo location contexts: PUT place/_doc/1 { \"suggest\": { \"input\": \"timmy's\", \"contexts\": { \"location\": [ { \"lat\": 43.6624803, \"lon\": -79.3863353 }, { \"lat\": 43.6624718, \"lon\": -79.3873227 } ] } } } Geo location query Suggestions can be filtered and boosted with respect to how close they are to one or more geo points. The following filters suggestions that fall within the area represented by the encoded geohash of a geo point: POST place/_search { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"location\": { \"lat\": 43.662, \"lon\": -79.380 } } } } } } Note When a location with a lower precision at query time is specified, all suggestions that fall within the area will be considered. If multiple categories or category contexts are set on the query, they are merged as a disjunction. This means that suggestions match if they contain at least one of the provided context values. Suggestions that are within an area represented by a geohash can also be boosted higher than others, as shown by the following: POST place/_search?pretty { \"suggest\": { \"place_suggestion\": { \"prefix\": \"tim\", \"completion\": { \"field\": \"suggest\", \"size\": 10, \"contexts\": { \"location\": [ { \"lat\": 43.6624803, \"lon\": -79.3863353, \"precision\": 2 }, { \"context\": { \"lat\": 43.6624803, \"lon\": -79.3863353 }, \"boost\": 2 } ] } } } } } The context query filters for suggestions that fall under the geo location represented by a geohash of (43.662, -79.380) with a precision of 2, and boosts suggestions that fall under the geohash representation of (43.6624803, -79.3863353), with a default precision of 6, by a factor of 2. Note If a suggestion entry matches multiple contexts, the final score is computed as the maximum score produced by any matching contexts. In addition to accepting context values, a context query can be composed of multiple context clauses. The parameters that are supported for a geo context clause include: context A geo point object or a geohash string to filter or boost the suggestion by. This is mandatory. boost The factor by which the score of the suggestion should be boosted; the score is computed by multiplying the boost with the suggestion weight. Defaults to 1. precision The precision of the geohash to encode the query geo point. This can be specified as a distance value (5m, 10km, etc.), or as a raw geohash precision (1..12). Defaults to the index time precision level. neighbours Accepts an array of precision values at which neighbouring geohashes should be taken into account. A precision value can be a distance value (5m, 10km, etc.) or a raw geohash precision (1..12). Defaults to generating neighbours for the index time precision level. Note The precision field does not result in a distance match. Specifying a distance value like 10km only results in a geohash precision value that represents tiles of that size. The precision will be used to encode the search geo point into a geohash tile for completion matching. A consequence of this is that points outside that tile, even if very close to the search point, will not be matched.
Reducing the precision, or increasing the distance, can reduce the risk of this happening, but not entirely remove it. Returning the type of the suggester Sometimes you need to know the exact type of a suggester in order to parse its results. The typed_keys parameter can be used to change the suggester's name in the response so that it will be prefixed by its type. Consider the following example with two suggesters, term and phrase: POST _search?typed_keys { \"suggest\": { \"text\" : \"some test mssage\", \"my-first-suggester\" : { \"term\" : { \"field\" : \"message\" } }, \"my-second-suggester\" : { \"phrase\" : { \"field\" : \"message\" } } } } In the response, the suggester names will be changed to term#my-first-suggester and phrase#my-second-suggester respectively, reflecting the types of each suggestion: { \"suggest\": { \"term#my-first-suggester\": [ { \"text\": \"some\", \"offset\": 0, \"length\": 4, \"options\": [] }, { \"text\": \"test\", \"offset\": 5, \"length\": 4, \"options\": [] }, { \"text\": \"mssage\", \"offset\": 10, \"length\": 6, \"options\": [ { \"text\": \"message\", \"score\": 0.8333333, \"freq\": 4 } ] } ], \"phrase#my-second-suggester\": [ { \"text\": \"some test mssage\", \"offset\": 0, \"length\": 16, \"options\": [ { \"text\": \"some test message\", \"score\": 0.030227963 } ] } ] }, ... } The name my-first-suggester now contains the term prefix. The name my-second-suggester now contains the phrase prefix.","title":"Suggester examples | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/rest-apis/search-suggesters","meta_description":"The suggest feature suggests similar looking terms based on a provided text by using a suggester.
The suggest request part is defined alongside the query..."} +{"text":"Docs Release notes Troubleshoot Reference Reference Get started Solutions and use cases Manage data Explore and analyze Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Security Fields and object schemas Elastic Security ECS field reference Timeline schema Alert schema Endpoint command reference Detection Rules Overview Observability Fields and object schemas Elasticsearch and index management Configuration Circuit breaker settings Auditing settings Enrich settings Cluster-level shard allocation and routing settings Miscellaneous cluster settings Cross-cluster replication settings Discovery and cluster formation settings Field data cache settings Health Diagnostic settings Index lifecycle management settings Data stream lifecycle settings Index management settings Index recovery settings Indexing buffer settings License settings Local gateway Machine learning settings Inference settings Monitoring settings Node settings Networking settings Node query cache settings Search settings Security settings Shard request cache Snapshot and restore settings Transforms settings Thread pool settings Watcher settings JVM settings Roles Elasticsearch privileges Index settings Data tier allocation General History retention Index block Index recovery prioritization Indexing pressure Mapping limit Merge Path Shard allocation Total shards per node Similarity Slow log Sorting Use index sorting to speed up conjunctions Store Preloading data into the file system cache Time series Translog Index lifecycle actions Allocate Delete Force merge Migrate Read only Rollover Downsample Searchable snapshot Set priority Shrink Unfollow Wait for snapshot REST APIs API conventions Common options Compatibility API examples The refresh parameter Optimistic concurrency control Sort search results Paginate search results Retrieve selected fields Search multiple data streams and indices Collapse search results Filter search results Highlighting Retrieve inner hits Search shard routing Searching with query rules Reciprocal rank fusion Retrievers Reindex data stream Create index from source The shard request cache Suggesters Profile search requests Ranking evaluation Mapping Document metadata fields _doc_count field _field_names field _ignored field _id field _index field _meta field _routing field _source field _tier field Field data types Aggregate metric Alias Arrays Binary Boolean Completion Date Date nanoseconds Dense vector Flattened Geopoint Geoshape Histogram IP Join Keyword Nested Numeric Object Pass-through object Percolator Point Range Rank feature Rank features Rank Vectors Search-as-you-type Semantic text Shape Sparse vector Text Token count Unsigned long Version Mapping parameters analyzer coerce copy_to doc_values dynamic eager_global_ordinals enabled format ignore_above index.mapping.ignore_above ignore_malformed index index_options index_phrases index_prefixes meta fields normalizer norms null_value position_increment_gap properties search_analyzer similarity store subobjects term_vector Elasticsearch audit events Command line tools elasticsearch-certgen elasticsearch-certutil elasticsearch-create-enrollment-token elasticsearch-croneval elasticsearch-keystore elasticsearch-node elasticsearch-reconfigure-node elasticsearch-reset-password elasticsearch-saml-metadata elasticsearch-service-tokens elasticsearch-setup-passwords elasticsearch-shard elasticsearch-syskeygen elasticsearch-users 
Curator Curator and index lifecycle management ILM Actions ILM or Curator? ILM and Curator! About Origin Features Command-Line Interface (CLI) Application Program Interface (API) License Site Corrections Contributing Installation pip Installation from source Docker Running Curator Command Line Interface Singleton Command Line Interface Exit Codes Configuration Environment Variables Action File Configuration File Actions Alias Allocation Close Cluster Routing Cold2Frozen Create Index Delete Indices Delete Snapshots Forcemerge Index Settings Open Reindex Replicas Restore Rollover Shrink Snapshot Options allocation_type allow_ilm_indices continue_if_exception copy_aliases count delay delete_after delete_aliases skip_flush disable_action extra_settings ignore_empty_list ignore_unavailable include_aliases include_global_state indices key max_age max_docs max_size max_num_segments max_wait migration_prefix migration_suffix name new_index node_filters number_of_replicas number_of_shards partial post_allocation preserve_existing refresh remote_certificate remote_client_cert remote_client_key remote_filters remote_url_prefix rename_pattern rename_replacement repository requests_per_second request_body retry_count retry_interval routing_type search_pattern setting shrink_node shrink_prefix shrink_suffix slices skip_repo_fs_check timeout timeout_override value wait_for_active_shards wait_for_completion wait_for_rebalance wait_interval warn_if_no_indices Filters filtertype age alias allocated closed count empty forcemerged kibana none opened pattern period space state Filter Elements aliases allocation_type count date_from date_from_format date_to date_to_format direction disk_space epoch exclude field intersect key kind max_num_segments pattern period_type range_from range_to reverse source state stats_result timestring threshold_behavior unit unit_count unit_count_pattern use_age value week_starts_on Examples alias allocation close cluster_routing create_index delete_indices delete_snapshots forcemerge index_settings open reindex replicas restore rollover shrink snapshot Frequently Asked Questions Q: How can I report an error in the documentation? Q: Can I delete only certain data from within indices? Q: Can Curator handle index names with strange characters? 
Clients Eland Installation Data Frames Machine Learning Go Getting started Installation Connecting Typed API Getting started with the API Conventions Running queries Using ES|QL Examples Java Getting started Setup Installation Connecting Using OpenTelemetry API conventions Package structure and namespace clients Method naming conventions Blocking and asynchronous clients Building API objects Lists and maps Variant types Object life cycles and thread safety Creating API objects from JSON data Exceptions Using the Java API client Indexing single documents Bulk: indexing multiple documents Reading documents by id Searching for documents Aggregations ES|QL in the Java client Troubleshooting Missing required property NoSuchMethodError: removeHeader IOReactor errors Serializing without typed keys Could not resolve dependencies NoClassDefFoundError: LogFactory Transport layer REST 5 Client Getting started Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Legacy REST Client Getting started Javadoc Maven Repository Dependencies Shading Initialization Performing requests Reading responses Logging Common configuration Timeouts Number of threads Basic authentication Other authentication methods Encrypted communication Others Node selector Sniffer Javadoc Maven Repository Usage Javadoc and source code External resources Breaking changes policy Release highlights License JavaScript Getting started Installation Connecting Configuration Basic configuration Advanced configuration Creating a child client Testing Integrations Observability Transport TypeScript support API Reference Examples asStream Bulk Exists Get Ignore MSearch Scroll Search Suggest transport.request SQL Update Update By Query Reindex Client helpers Timeout best practices .NET Getting started Installation Connecting Configuration Options on ElasticsearchClientSettings Client concepts Serialization Source serialization Using the .NET Client Aggregation examples Using ES|QL CRUD usage examples Custom mapping examples Query examples Usage recommendations Low level Transport example Troubleshoot Logging Logging with OnRequestCompleted Logging with Fiddler Debugging Audit trail Debug information Debug mode PHP Getting started Installation Connecting Configuration Dealing with JSON arrays and objects in PHP Host Configuration Set retries HTTP Meta Data Enabling the Logger Configure the HTTP client Namespaces Node Pool Operations Index management operations Search operations Indexing documents Getting documents Updating documents Deleting documents Client helpers Iterators ES|QL Python Getting started Installation Connecting Configuration Querying Using with asyncio Integrations Using OpenTelemetry ES|QL and Pandas Examples Elasticsearch Python DSL Configuration Tutorials How-To Guides Examples Migrating from the elasticsearch-dsl package Client helpers Ruby Getting started Installation Connecting Configuration Basic configuration Advanced configuration Integrations Transport Elasticsearch API Using OpenTelemetry Elastic Common Schema (ECS) ActiveModel / ActiveRecord Ruby On Rails Persistence Elasticsearch DSL Examples Client helpers Bulk and Scroll helpers ES|QL Troubleshoot Rust Installation Community-contributed clients Elastic Distributions of OpenTelemetry (EDOT) Quickstart Self-managed Kubernetes Hosts / VMs Docker Elastic Cloud Serverless Kubernetes Hosts and VMs Docker Elastic 
Cloud Hosted Kubernetes Hosts and VMs Docker Reference Architecture Kubernetes environments Hosts / VMs environments Use cases Kubernetes observability Prerequisites and compatibility Components description Deployment Instrumenting Applications Upgrade Customization LLM observability Compatibility and support Features Collector distributions SDK Distributions Limitations Nomenclature EDOT Collector Download Configuration Default config (Standalone) Default config (Kubernetes) Configure Logs Collection Configure Metrics Collection Customization Components Custom Collector Troubleshooting EDOT SDKs EDOT .NET Setup ASP.NET Console applications .NET worker services Zero-code instrumentation Opinionated defaults Configuration Supported technologies Troubleshooting Migration EDOT Java Setup Kubernetes Setup Runtime attach Setup Configuration Features Supported Technologies Troubleshooting Migration Performance overhead EDOT Node.js Setup Kubernetes Configuration Supported Technologies Metrics Troubleshooting Migration EDOT PHP Setup Limitations Configuration Supported Technologies Troubleshooting Migration Performance overhead EDOT Python Setup Kubernetes Manual Instrumentation Configuration Supported Technologies Troubleshooting Migration Performance Overhead Ingestion tools Fleet and Elastic Agent Restrictions for Elastic Cloud Serverless Migrate from Beats to Elastic Agent Migrate from Auditbeat to Elastic Agent Deployment models What is Fleet Server? Deploy on Elastic Cloud Deploy on-premises and self-managed Deploy Fleet Server on-premises and Elasticsearch on Cloud Deploy Fleet Server on Kubernetes Fleet Server scalability Fleet Server Secrets Secret files guide Monitor a self-managed Fleet Server Install Elastic Agents Install Fleet-managed Elastic Agents Install standalone Elastic Agents Upgrade standalone Elastic Agents Install Elastic Agents in a containerized environment Run Elastic Agent in a container Run Elastic Agent on Kubernetes managed by Fleet Install Elastic Agent on Kubernetes using Helm Example: Install standalone Elastic Agent on Kubernetes using Helm Example: Install Fleet-managed Elastic Agent on Kubernetes using Helm Advanced Elastic Agent configuration managed by Fleet Configuring Kubernetes metadata enrichment on Elastic Agent Run Elastic Agent on GKE managed by Fleet Configure Elastic Agent Add-On on Amazon EKS Run Elastic Agent on Azure AKS managed by Fleet Run Elastic Agent Standalone on Kubernetes Scaling Elastic Agent on Kubernetes Using a custom ingest pipeline with the Kubernetes Integration Environment variables Run Elastic Agent as an EDOT Collector Transform an installed Elastic Agent to run as an EDOT Collector Run Elastic Agent without administrative privileges Install Elastic Agent from an MSI package Installation layout Air-gapped environments Using a proxy server with Elastic Agent and Fleet When to configure proxy settings Proxy Server connectivity using default host variables Fleet managed Elastic Agent connectivity using a proxy server Standalone Elastic Agent connectivity using a proxy server Set the proxy URL of the Elastic Package Registry Uninstall Elastic Agents from edge hosts Start and stop Elastic Agents on edge hosts Elastic Agent configuration encryption Secure connections Configure SSL/TLS for self-managed Fleet Servers Rotate SSL/TLS CA certificates Elastic Agent deployment models with mutual TLS One-way and mutual TLS certifications flow Configure SSL/TLS for the Logstash output Manage Elastic Agents in Fleet Fleet settings Elasticsearch 
output settings Logstash output settings Kafka output settings Remote Elasticsearch output Considerations when changing outputs Elastic Agents Unenroll Elastic Agents Set inactivity timeout Upgrade Elastic Agents Migrate Elastic Agents Monitor Elastic Agents Elastic Agent health status Add tags to filter the Agents list Enrollment handing for containerized agents Policies Create an agent policy without using the UI Enable custom settings in an agent policy Set environment variables in an Elastic Agent policy Required roles and privileges Fleet enrollment tokens Kibana Fleet APIs Configure standalone Elastic Agents Create a standalone Elastic Agent policy Structure of a config file Inputs Simplified log ingestion Elastic Agent inputs Variables and conditions in input configurations Providers Local Agent provider Host provider Env Provider Filesource provider Kubernetes Secrets Provider Kubernetes LeaderElection Provider Local dynamic provider Docker Provider Kubernetes Provider Outputs Elasticsearch Kafka Logstash SSL/TLS Logging Feature flags Agent download Config file examples Apache HTTP Server Nginx HTTP Server Grant standalone Elastic Agents access to Elasticsearch Example: Use standalone Elastic Agent with Elastic Cloud Serverless to monitor nginx Example: Use standalone Elastic Agent with Elastic Cloud Hosted to monitor nginx Debug standalone Elastic Agents Kubernetes autodiscovery with Elastic Agent Conditions based autodiscover Hints annotations based autodiscover Monitoring Reference YAML Manage integrations Package signatures Add an integration to an Elastic Agent policy View integration policies Edit or delete an integration policy Install and uninstall integration assets View integration assets Set integration-level outputs Upgrade an integration Managed integrations content Best practices for integration assets Data streams Tutorials: Customize data retention policies Scenario 1 Scenario 2 Scenario 3 Tutorial: Transform data with custom ingest pipelines Advanced data stream features Command reference Agent processors Processor syntax add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags community_id convert copy_fields decode_base64_field decode_cef decode_csv_fields decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields parse_aws_vpc_flow_log rate_limit registered_domain rename replace script syslog timestamp translate_sid truncate_fields urldecode APM APM settings APM settings for Elastic Cloud APM settings for Elastic Cloud Enterprise APM Attacher for Kubernetes Instrument and configure pods Add the helm repository to Helm Configure the webhook with a Helm values file Install the webhook with Helm Add a pod template annotation to each pod you want to auto-instrument Watch data flow into the Elastic Stack APM Architecture for AWS Lambda Performance impact and overhead Configuration options Using AWS Secrets Manager to manage APM authentication keys APM agents APM Android agent Getting started Configuration Manual instrumentation Automatic instrumentation Frequently asked questions How-tos Troubleshooting APM .NET agent Set up the APM .NET agent Profiler Auto instrumentation ASP.NET Core .NET Core and .NET 5+ ASP.NET Azure Functions Other .NET 
applications NuGet packages Entity Framework Core Entity Framework 6 Elasticsearch gRPC SqlClient StackExchange.Redis Azure Cosmos DB Azure Service Bus Azure Storage MongoDB Supported technologies Configuration Configuration on ASP.NET Core Configuration for Windows Services Configuration on ASP.NET Core configuration options Reporter configuration options HTTP configuration options Messaging configuration options Stacktrace configuration options Supportability configuration options All options summary Public API OpenTelemetry bridge Metrics Logs Serilog NLog Manual log correlation Performance tuning Upgrading APM Go agent Set up the APM Go Agent Built-in instrumentation modules Custom instrumentation Context propagation Supported technologies Configuration API documentation Metrics Logs Log correlation OpenTelemetry API OpenTracing API Contributing Upgrading APM iOS agent Supported technologies Set up the APM iOS Agent Configuration Instrumentation APM Java agent Set up the APM Java Agent Manual setup with -javaagent flag Automatic setup with apm-agent-attach-cli.jar Programmatic API setup to self-attach SSL/TLS communication with APM Server Monitoring AWS Lambda Java Functions Supported technologies Configuration Circuit-Breaker Core Datastore HTTP Huge Traces JAX-RS JMX Logging Messaging Metrics Profiling Reporter Serverless Stacktrace Property file reference Tracing APIs Public API OpenTelemetry bridge OpenTracing bridge Plugin API Metrics Logs How to find slow methods Sampling-based profiler API/Code Annotations Configuration-based Overhead and performance tuning Frequently asked questions Community plugins Upgrading APM Node.js agent Set up the Agent Monitoring AWS Lambda Node.js Functions Monitoring Node.js Azure Functions Get started with Express Get started with Fastify Get started with hapi Get started with Koa Get started with Next.js Get started with Restify Get started with TypeScript Get started with a custom Node.js stack Starting the agent Supported technologies Configuration Configuring the agent Configuration options Custom transactions Custom spans API Reference Agent API Transaction API Span API Metrics Logs OpenTelemetry bridge OpenTracing bridge Source map support ECMAScript module support Distributed tracing Message queues Performance Tuning Upgrading Upgrade to v4.x Upgrade to v3.x Upgrade to v2.x Upgrade to v1.x APM PHP agent Set up the APM PHP Agent Supported technologies Configuration Configuration reference Public API APM Python agent Set up the APM Python Agent Django support Flask support Aiohttp Server support Tornado Support Starlette/FastAPI Support Sanic Support Monitoring AWS Lambda Python Functions Monitoring Azure Functions Wrapper Support ASGI Middleware Supported technologies Configuration Advanced topics Instrumenting custom code Sanitizing data How the Agent works Run Tests Locally API reference Metrics OpenTelemetry API Bridge Logs Performance tuning Upgrading Upgrading to version 6 of the agent Upgrading to version 5 of the agent Upgrading to version 4 of the agent APM Ruby agent Set up the APM Ruby agent Getting started with Rails Getting started with Rack Supported technologies Configuration Advanced topics Adding additional context Custom instrumentation API reference Metrics Logs OpenTracing API GraphQL Performance tuning Upgrading APM RUM JavaScript agent Set up the APM Real User Monitoring JavaScript Agent Install the Agent Configure CORS Supported technologies Configuration API reference Agent API Transaction API Span API Source maps 
Framework-specific integrations React integration Angular integration Vue integration Distributed tracing Breakdown metrics OpenTracing Advanced topics How to interpret long task spans in the UI Using with TypeScript Custom page load transaction names Custom Transactions Performance tuning Upgrading Beats Beats Config file format Namespacing Config file data types Environment variables Reference variables Config file ownership and permissions Command line arguments YAML tips and gotchas Auditbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Auditbeat on Docker Running Auditbeat on Kubernetes Auditbeat and systemd Start Auditbeat Stop Auditbeat Upgrade Auditbeat Configure Modules General settings Project paths Config file reloading Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_session_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Regular expression support Instrumentation Feature flags auditbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Parse data using an ingest pipeline Use environment variables in the configuration Avoid YAML formatting problems Modules Auditd Module File Integrity Module System Module System host dataset System login dataset System package dataset System process dataset System socket dataset System user dataset Exported fields Auditd fields Beat fields Cloud provider metadata fields Common fields Docker fields ECS fields File Integrity fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields Process fields System fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get Help Debug Understand logged metrics Common problems Auditbeat fails to watch folders because too many files are open Auditbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Filebeat Quick start Set up and run Directory layout 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
delete ecctl deployment traffic-filter list ecctl deployment traffic-filter show ecctl deployment traffic-filter update ecctl deployment update ecctl generate ecctl generate completions ecctl generate docs ecctl init ecctl platform ecctl platform allocator ecctl platform allocator list ecctl platform allocator maintenance ecctl platform allocator metadata ecctl platform allocator metadata delete ecctl platform allocator metadata set ecctl platform allocator metadata show ecctl platform allocator search ecctl platform allocator show ecctl platform allocator vacate ecctl platform constructor ecctl platform constructor list ecctl platform constructor maintenance ecctl platform constructor resync ecctl platform constructor show ecctl platform enrollment-token ecctl platform enrollment-token create ecctl platform enrollment-token delete ecctl platform enrollment-token list ecctl platform info ecctl platform instance-configuration ecctl platform instance-configuration create ecctl platform instance-configuration delete ecctl platform instance-configuration list ecctl platform instance-configuration pull ecctl platform instance-configuration show ecctl platform instance-configuration update ecctl platform proxy ecctl platform proxy filtered-group ecctl platform proxy filtered-group create ecctl platform proxy filtered-group delete ecctl platform proxy filtered-group list ecctl platform proxy filtered-group show ecctl platform proxy filtered-group update ecctl platform proxy list ecctl platform proxy settings ecctl platform proxy settings show ecctl platform proxy settings update ecctl platform proxy show ecctl platform repository ecctl platform repository create ecctl platform repository delete ecctl platform repository list ecctl platform repository show ecctl platform role ecctl platform role create ecctl platform role delete ecctl platform role list ecctl platform role show ecctl platform role update ecctl platform runner ecctl platform runner list ecctl platform runner resync ecctl platform runner search ecctl platform runner show ecctl stack ecctl stack delete ecctl stack list ecctl stack show ecctl stack upload ecctl user ecctl user create ecctl user delete ecctl user disable ecctl user enable ecctl user key ecctl user key delete ecctl user key list ecctl user key show ecctl user list ecctl user show ecctl user update ecctl version Contributing Release notes Glossary Loading Docs / Reference / Elasticsearch and index management / REST APIs / Elasticsearch API conventions Elastic Stack Serverless The Elasticsearch REST APIs are exposed over HTTP. Except where noted, the following conventions apply across all APIs. Content-type requirements The type of the content sent in a request body must be specified using the Content-Type header. The value of this header must map to one of the supported formats that the API supports. Most APIs support JSON, YAML, CBOR, and SMILE. The bulk and multi-search APIs support NDJSON, JSON, and SMILE; other types will result in an error response. When using the source query string parameter, the content type must be specified using the source_content_type query string parameter. Elasticsearch only supports UTF-8-encoded JSON. Elasticsearch ignores any other encoding headings sent with a request. Responses are also UTF-8 encoded. X-Opaque-Id HTTP header You can pass an X-Opaque-Id HTTP header to track the origin of a request in Elasticsearch logs and tasks. 
If provided, Elasticsearch surfaces the X-Opaque-Id value in the: Response of any request that includes the header Task management API response Slow logs Deprecation logs For the deprecation logs, Elasticsearch also uses the X-Opaque-Id value to throttle and deduplicate deprecation warnings. See Deprecation logs throttling . The X-Opaque-Id header accepts any arbitrary value. However, we recommend you limit these values to a finite set, such as an ID per client. Don’t generate a unique X-Opaque-Id header for every request. Too many unique X-Opaque-Id values can prevent Elasticsearch from deduplicating warnings in the deprecation logs. traceparent HTTP header Elasticsearch also supports a traceparent HTTP header using the official W3C trace context spec . You can use the traceparent header to trace requests across Elastic products and other services. Because it’s only used for traces, you can safely generate a unique traceparent header for each request. If provided, Elasticsearch surfaces the header’s trace-id value as trace.id in the: JSON Elasticsearch server logs Slow logs Deprecation logs For example, the following traceparent value would produce the following trace.id value in the above logs. `traceparent`: 00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01 `trace.id`: 0af7651916cd43dd8448eb211c80319c GET and POST requests A number of Elasticsearch GET APIs— most notably the search API— support a request body. While the GET action makes sense in the context of retrieving information, GET requests with a body are not supported by all HTTP libraries. All Elasticsearch GET APIs that require a body can also be submitted as POST requests. Alternatively, you can pass the request body as the source query string parameter when using GET. Cron expressions A cron expression is a string of the following form: [year] Elasticsearch uses the cron parser from the Quartz Job Scheduler . For more information about writing Quartz cron expressions, see the Quartz CronTrigger Tutorial . All schedule times are in coordinated universal time (UTC); other timezones are not supported. Tip You can use the elasticsearch-croneval command line tool to validate your cron expressions. Cron expression elements All elements are required except for year . See Cron special characters for information about the allowed special characters. (Required) Valid values: 0 - 59 and the special characters , - * / (Required) Valid values: 0 - 59 and the special characters , - * / (Required) Valid values: 0 - 23 and the special characters , - * / (Required) Valid values: 1 - 31 and the special characters , - * / ? L W (Required) Valid values: 1 - 12 , JAN - DEC , jan - dec , and the special characters , - * / (Required) Valid values: 1 - 7 , SUN - SAT , sun - sat , and the special characters , - * / ? L # (Optional) Valid values: 1970 - 2099 and the special characters , - * / Cron special characters * Selects every possible value for a field. For example, * in the hours field means \"every hour\". ? No specific value. Use when you don’t care what the value is. For example, if you want the schedule to trigger on a particular day of the month, but don’t care what day of the week that happens to be, you can specify ? in the day_of_week field. - A range of values (inclusive). Use to separate a minimum and maximum value. For example, if you want the schedule to trigger every hour between 9:00 a.m. and 5:00 p.m., you could specify 9-17 in the hours field. , Multiple values. Use to separate multiple values for a field. 
For example, if you want the schedule to trigger every Tuesday and Thursday, you could specify TUE,THU in the day_of_week field. / Increment. Use to separate values when specifying a time increment. The first value represents the starting point, and the second value represents the interval. For example, if you want the schedule to trigger every 20 minutes starting at the top of the hour, you could specify 0/20 in the minutes field. Similarly, specifying 1/5 in day_of_month field will trigger every 5 days starting on the first day of the month. L Last. Use in the day_of_month field to mean the last day of the month— day 31 for January, day 28 for February in non-leap years, day 30 for April, and so on. Use alone in the day_of_week field in place of 7 or SAT , or after a particular day of the week to select the last day of that type in the month. For example 6L means the last Friday of the month. You can specify LW in the day_of_month field to specify the last weekday of the month. Avoid using the L option when specifying lists or ranges of values, as the results likely won’t be what you expect. W Weekday. Use to specify the weekday (Monday-Friday) nearest the given day. As an example, if you specify 15W in the day_of_month field and the 15th is a Saturday, the schedule will trigger on the 14th. If the 15th is a Sunday, the schedule will trigger on Monday the 16th. If the 15th is a Tuesday, the schedule will trigger on Tuesday the 15th. However if you specify 1W as the value for day_of_month , and the 1st is a Saturday, the schedule will trigger on Monday the 3rd— it won’t jump over the month boundary. You can specify LW in the day_of_month field to specify the last weekday of the month. You can only use the W option when the day_of_month is a single day— it is not valid when specifying a range or list of days. # Nth XXX day in a month. Use in the day_of_week field to specify the nth XXX day of the month. For example, if you specify 6#1 , the schedule will trigger on the first Friday of the month. Note that if you specify 3#5 and there are not 5 Tuesdays in a particular month, the schedule won’t trigger that month. Examples Setting daily triggers 0 5 9 * * ? Trigger at 9:05 a.m. UTC every day. 0 5 9 * * ? 2020 Trigger at 9:05 a.m. UTC every day during the year 2020. Restricting triggers to a range of days or times 0 5 9 ? * MON-FRI Trigger at 9:05 a.m. UTC Monday through Friday. 0 0-5 9 * * ? Trigger every minute starting at 9:00 a.m. UTC and ending at 9:05 a.m. UTC every day. Setting interval triggers 0 0/15 9 * * ? Trigger every 15 minutes starting at 9:00 a.m. UTC and ending at 9:45 a.m. UTC every day. 0 5 9 1/3 * ? Trigger at 9:05 a.m. UTC every 3 days every month, starting on the first day of the month. Setting schedules that trigger on a particular day 0 1 4 1 4 ? Trigger every April 1st at 4:01 a.m. UTC. 0 0,30 9 ? 4 WED Trigger at 9:00 a.m. UTC and at 9:30 a.m. UTC every Wednesday in the month of April. 0 5 9 15 * ? Trigger at 9:05 a.m. UTC on the 15th day of every month. 0 5 9 15W * ? Trigger at 9:05 a.m. UTC on the nearest weekday to the 15th of every month. 0 5 9 ? * 6#1 Trigger at 9:05 a.m. UTC on the first Friday of every month. Setting triggers using last 0 5 9 L * ? Trigger at 9:05 a.m. UTC on the last day of every month. 0 5 9 ? * 2L Trigger at 9:05 a.m. UTC on the last Monday of every month. 0 5 9 LW * ? Trigger at 9:05 a.m. UTC on the last weekday of every month. 
Date math support in index and index alias names Date math name resolution lets you to search a range of time series indices or index aliases rather than searching all of your indices and filtering the results. Limiting the number of searched indices reduces cluster load and improves search performance. For example, if you are searching for errors in your daily logs, you can use a date math name template to restrict the search to the past two days. Most APIs that accept an index or index alias argument support date math. A date math name takes the following form: Where: static_name Static text date_math_expr Dynamic date math expression that computes the date dynamically date_format Optional format in which the computed date should be rendered. Defaults to yyyy.MM.dd . Format should be compatible with java-time https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html time_zone Optional time zone. Defaults to UTC . Note Pay attention to the usage of small vs capital letters used in the date_format . For example: mm denotes minute of hour, while MM denotes month of year. Similarly hh denotes the hour in the 1-12 range in combination with AM/PM , while HH denotes the hour in the 0-23 24-hour range. Date math expressions are resolved locale-independent. Consequently, it is not possible to use any other calendars than the Gregorian calendar. You must enclose date math names in angle brackets. If you use the name in a request path, special characters must be URI encoded. For example: # PUT / PUT /%3Cmy-index-%7Bnow%2Fd%7D%3E Percent encoding of date math characters The special characters used for date rounding must be URI encoded as follows: < %3C > %3E / %2F { %7B } %7D | %7C + %2B : %3A , %2C The following example shows different forms of date math names and the final names they resolve to given the current time is 22nd March 2024 noon UTC. Expression Resolves to logstash-2024.03.22 logstash-2024.03.01 logstash-2024.03 logstash-2024.02 logstash-2024.03.23 To use the characters { and } in the static part of a name template, escape them with a backslash \\ , for example: resolves to elastic{{ON}}-2024.03.01 The following example shows a search request that searches the Logstash indices for the past three days, assuming the indices use the default Logstash index name format, logstash-YYYY.MM.dd . # GET /,,/_search GET /%3Clogstash-%7Bnow%2Fd-2d%7D%3E%2C%3Clogstash-%7Bnow%2Fd-1d%7D%3E%2C%3Clogstash-%7Bnow%2Fd%7D%3E/_search { \"query\" : { \"match\": { \"test\": \"data\" } } } Multi-target syntax Most APIs that accept a , , or request path parameter also support multi-target syntax . In multi-target syntax, you can use a comma-separated list to run a request on multiple resources, such as data streams, indices, or aliases: test1,test2,test3 . You can also use glob-like wildcard ( * ) expressions to target resources that match a pattern: test* or *test or te*t or *test* . You can exclude targets using the - character: test*,-test3 . Important Aliases are resolved after wildcard expressions. This can result in a request that targets an excluded alias. For example, if test3 is an index alias, the pattern test*,-test3 still targets the indices for test3 . To avoid this, exclude the concrete indices for the alias instead. You can also exclude clusters from a list of clusters to search using the - character: remote*:*,-remote1:*,-remote4:* will search all clusters with an alias that starts with \"remote\" except for \"remote1\" and \"remote4\". 
Note that to exclude a cluster with this notation you must exclude all of its indexes. Excluding a subset of indexes on a remote cluster is currently not supported. For example, this will throw an exception: remote*:*,-remote1:logs* . Multi-target APIs that can target indices support the following query string parameters: ignore_unavailable (Optional, Boolean) If false , the request returns an error if it targets a missing or closed index. Defaults to false . allow_no_indices (Optional, Boolean) If false , the request returns an error if any wildcard expression, index alias , or _all value targets only missing or closed indices. This behavior applies even if the request targets other open indices. For example, a request targeting foo*,bar* returns an error if an index starts with foo but no index starts with bar . expand_wildcards (Optional, string) Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values, such as open,hidden . Valid values are: all Match any data stream or index, including hidden ones. open Match open, non-hidden indices. Also matches any non-hidden data stream. closed Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed. hidden Match hidden data streams and hidden indices. Must be combined with open , closed , or both. none Wildcard patterns are not accepted. The defaults settings for the above parameters depend on the API being used. Some multi-target APIs that can target indices also support the following query string parameter: ignore_throttled (Optional, Boolean) If true , concrete, expanded or aliased indices are ignored when frozen. Defaults to true . Deprecated in 7.16.0 This parameter was deprecated in 7.16.0. Note APIs with a single target, such as the get document API , do not support multi-target syntax. Hidden data streams and indices For most APIs, wildcard expressions do not match hidden data streams and indices by default. To match hidden data streams and indices using a wildcard expression, you must specify the expand_wildcards query parameter. Alternatively, querying an index pattern starting with a dot, such as .watcher_hist* , will match hidden indices by default. This is intended to mirror Unix file-globbing behavior and provide a smoother transition path to hidden indices. You can create hidden data streams by setting data_stream.hidden to true in the stream’s matching index template . You can hide indices using the index.hidden index setting. The backing indices for data streams are hidden automatically. Some features, such as machine learning, store information in hidden indices. Global index templates that match all indices are not applied to hidden indices. System indices Elasticsearch modules and plugins can store configuration and state information in internal system indices . You should not directly access or modify system indices as they contain data essential to the operation of the system. Important Direct access to system indices is deprecated and will no longer be allowed in a future major version. To view system indices within cluster: GET _cluster/state/metadata?filter_path=metadata.indices.*.system Warning When overwriting current cluster state, system indices should be restored as part of their feature state . Node specification Some cluster-level APIs may operate on a subset of the nodes which can be specified with node filters. 
For example, task management , node stats , and node info APIs can all report results from a filtered set of nodes rather than from all nodes. Node filters are written as a comma-separated list of individual filters, each of which adds or removes nodes from the chosen subset. Each filter can be one of the following: _all , to add all nodes to the subset. _local , to add the local node to the subset. _master , to add the currently-elected master node to the subset. a node ID or name, to add this node to the subset. an IP address or hostname, to add all matching nodes to the subset. a pattern, using * wildcards, which adds all nodes to the subset whose name, address, or hostname matches the pattern. master:true , data:true , ingest:true , voting_only:true , ml:true , or coordinating_only:true , which respectively add to the subset all master-eligible nodes, all data nodes, all ingest nodes, all voting-only nodes, all machine learning nodes, and all coordinating-only nodes. master:false , data:false , ingest:false , voting_only:false , ml:false , or coordinating_only:false , which respectively remove from the subset all master-eligible nodes, all data nodes, all ingest nodes, all voting-only nodes, all machine learning nodes, and all coordinating-only nodes. a pair of patterns, using * wildcards, of the form attrname:attrvalue , which adds to the subset all nodes with a custom node attribute whose name and value match the respective patterns. Custom node attributes are configured by setting properties in the configuration file of the form node.attr.attrname: attrvalue . Node filters run in the order in which they are given, which is important if using filters that remove nodes from the set. For example, _all,master:false means all the nodes except the master-eligible ones. master:false,_all means the same as _all because the _all filter runs after the master:false filter. If no filters are given, the default is to select all nodes. If any filters are specified, they run starting with an empty chosen subset. This means that filters such as master:false which remove nodes from the chosen subset are only useful if they come after some other filters. When used on its own, master:false selects no nodes. Here are some examples of the use of node filters with some cluster APIs : # If no filters are given, the default is to select all nodes GET /_nodes # Explicitly select all nodes GET /_nodes/_all # Select just the local node GET /_nodes/_local # Select the elected master node GET /_nodes/_master # Select nodes by name, which can include wildcards GET /_nodes/node_name_goes_here GET /_nodes/node_name_goes_* # Select nodes by address, which can include wildcards GET /_nodes/10.0.0.3,10.0.0.4 GET /_nodes/10.0.0.* # Select nodes by role GET /_nodes/_all,master:false GET /_nodes/data:true,ingest:true GET /_nodes/coordinating_only:true GET /_nodes/master:true,voting_only:false # Select nodes by custom attribute # (for example, with something like `node.attr.rack: 2` in the configuration file) GET /_nodes/rack:2 GET /_nodes/ra*:2 GET /_nodes/ra*:2* Parameters Rest parameters (when using HTTP, map to HTTP URL parameters) follow the convention of using underscore casing. Request body in query string For libraries that don’t accept a request body for non-POST requests, you can pass the request body as the source query string parameter instead. 
When using this method, the source_content_type parameter should also be passed with a media type value that indicates the format of the source, such as application/json . REST API version compatibility Major version upgrades often include a number of breaking changes that impact how you interact with Elasticsearch. While we recommend that you monitor the deprecation logs and update applications before upgrading Elasticsearch, having to coordinate the necessary changes can be an impediment to upgrading. You can enable an existing application to function without modification after an upgrade by including API compatibility headers, which tell Elasticsearch you are still using the previous version of the REST API. Using these headers allows the structure of requests and responses to remain the same; it does not guarantee the same behavior. You set version compatibility on a per-request basis in the Content-Type and Accept headers. Setting compatible-with to the same major version as the version you’re running has no impact, but ensures that the request will still work after Elasticsearch is upgraded. To tell Elasticsearch 8.0 you are using the 7.x request and response format, set compatible-with=7 : Content-Type: application/vnd.elasticsearch+json; compatible-with=7 Accept: application/vnd.elasticsearch+json; compatible-with=7 HTTP 429 Too Many Requests status code push back Elasticsearch APIs may respond with the HTTP 429 Too Many Requests status code, indicating that the cluster is too busy to handle the request. When this happens, consider retrying after a short delay. If the retry also receives a 429 Too Many Requests response, extend the delay by backing off exponentially before each subsequent retry. URL-based access control Many users use a proxy with URL-based access control to secure access to Elasticsearch data streams and indices. For multi-search , multi-get , and bulk requests, the user has the choice of specifying a data stream or index in the URL and on each individual request within the request body. This can make URL-based access control challenging. To prevent the user from overriding the data stream or index specified in the URL, set rest.action.multi.allow_explicit_index to false in elasticsearch.yml . This causes Elasticsearch to reject requests that explicitly specify a data stream or index in the request body. Boolean Values All REST API parameters (both request parameters and JSON body) support providing boolean \"false\" as the value false and boolean \"true\" as the value true . All other values will raise an error. Number Values When passing a numeric parameter in a request body, you may use a string containing the number instead of the native numeric type. For example: POST /_search { \"size\": \"1000\" } Integer-valued fields in a response body are described as integer (or occasionally long ) in this manual, but there are generally no explicit bounds on such values. JSON, SMILE, CBOR and YAML all permit arbitrarily large integer values. Do not assume that integer fields in a response body will always fit into a 32-bit signed integer. Byte size units Whenever the byte size of data needs to be specified, e.g. when setting a buffer size parameter, the value must specify the unit, like 10kb for 10 kilobytes. Note that these units use powers of 1024, so 1kb means 1024 bytes. 
The supported units are: b Bytes kb Kilobytes mb Megabytes gb Gigabytes tb Terabytes pb Petabytes Distance Units Wherever distances need to be specified, such as the distance parameter in the Geo-distance ), the default unit is meters if none is specified. Distances can be specified in other units, such as \"1km\" or \"2mi\" (2 miles). The full list of units is listed below: Mile mi or miles Yard yd or yards Feet ft or feet Inch in or inch Kilometer km or kilometers Meter m or meters Centimeter cm or centimeters Millimeter mm or millimeters Nautical mile NM , nmi , or nauticalmiles Time units Whenever durations need to be specified, e.g. for a timeout parameter, the duration must specify the unit, like 2d for 2 days. The supported units are: d Days h Hours m Minutes s Seconds ms Milliseconds micros Microseconds nanos Nanoseconds Unit-less quantities Unit-less quantities means that they don’t have a \"unit\" like \"bytes\" or \"Hertz\" or \"meter\" or \"long tonne\". If one of these quantities is large we’ll print it out like 10m for 10,000,000 or 7k for 7,000. We’ll still print 87 when we mean 87 though. These are the supported multipliers: k Kilo m Mega g Giga t Tera p Peta Previous REST APIs Next Common options Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Content-type requirements X-Opaque-Id HTTP header traceparent HTTP header GET and POST requests Cron expressions Cron expression elements Cron special characters Examples Date math support in index and index alias names Multi-target syntax Hidden data streams and indices System indices Node specification Parameters Request body in query string REST API version compatibility HTTP 429 Too Many Requests status code push back URL-based access control Boolean Values Number Values Byte size units Distance Units Time units Unit-less quantities Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"Elasticsearch API conventions | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/rest-apis/api-conventions","meta_description":"The Elasticsearch REST APIs are exposed over HTTP. Except where noted, the following conventions apply across all APIs. 
The type of the content sent in..."} +{"text":"Docs Release notes Troubleshoot Reference Explore and analyze Get started Solutions and use cases Manage data Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Querying and filtering Query languages Query DSL ES|QL Get started Interfaces ES|QL _query API Kibana Elastic Security Query multiple sources Query multiple indices Query across clusters Examples Task management SQL Overview Getting started Conventions Security SQL REST API Overview Response Data Formats Paginating through a large response Filtering using Elasticsearch Query DSL Columnar results Passing parameters to a query Use runtime fields Run an async SQL search SQL Translate API SQL CLI SQL JDBC API usage SQL ODBC Driver installation Configuration SQL Client Applications DBeaver DbVisualizer Microsoft Excel Microsoft Power BI Desktop Microsoft PowerShell MicroStrategy Desktop Qlik Sense Desktop SQuirreL SQL SQL Workbench/J Tableau Desktop Tableau Server SQL Language Lexical Structure SQL Commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data Types Index patterns Frozen Indices Functions and Operators Comparison Operators Logical Operators Math Operators Cast Operators LIKE and RLIKE Operators Aggregate Functions Grouping Functions Date/Time and Interval Functions and Operators Full-Text Search Functions Mathematical Functions String Functions Type Conversion Functions Geo Functions Conditional Functions And Expressions System Functions Reserved keywords SQL Limitations EQL Example: Detect threats with EQL KQL Lucene query syntax Query tools Saved queries Console Search profiler Grok debugger Playground Aggregations Basics Filtering in Kibana Geospatial analysis Transforming data Overview Setup When to use transforms Generating alerts for transforms Transforms at scale How checkpoints work API quick reference Tutorial: Transforming the eCommerce sample data Examples Painless examples Limitations Elastic Inference Inference integrations Machine learning Setup and security Anomaly detection Finding anomalies Plan your analysis Run a job View the results Forecast future behavior Tutorial Advanced concepts Anomaly detection algorithms Anomaly score explanation Job types Working with anomaly detection at scale Handling delayed data API quick reference How-tos Generating alerts for anomaly detection jobs Aggregating data for faster performance Altering data in your datafeed with runtime fields Customizing detectors with custom rules Detecting anomalous categories of data Performing population analysis Reverting to a model snapshot Detecting anomalous locations in geographic data Mapping anomalies by location Adding custom URLs to machine learning results Anomaly detection jobs from visualizations Exporting and importing machine learning jobs Resources Limitations Function reference Supplied configurations Troubleshooting and FAQ Data frame analytics Overview Finding outliers Predicting numerical values with regression Predicting classes with classification Advanced concepts How data frame analytics analytics jobs work Working with data frame analytics at scale Adding custom URLs to data frame analytics jobs Feature encoding Feature processors Feature importance Loss functions for regression analyses Hyperparameter optimization Trained models API quick reference Resources Limitations NLP Overview Extract information Classify text Search and compare text Deploy trained models Select a 
cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / … / SQL / SQL REST API / Run an async SQL search Elastic Stack Serverless By default, SQL searches are synchronous. They wait for complete results before returning a response. However, results can take longer for searches across large data sets or frozen data . To avoid long waits, run an async SQL search. Set wait_for_completion_timeout to a duration you’d like to wait for synchronous results. POST _sql?format=json { \"wait_for_completion_timeout\": \"2s\", \"query\": \"SELECT * FROM library ORDER BY page_count DESC\", \"fetch_size\": 5 } If the search doesn’t finish within this period, the search becomes async. The API returns: An id for the search. An is_partial value of true , indicating the search results are incomplete. An is_running value of true , indicating the search is still running in the background. For CSV, TSV, and TXT responses, the API returns these values in the respective Async-ID , Async-partial , and Async-running HTTP headers instead. { \"id\": \"FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=\", \"is_partial\": true, \"is_running\": true, \"rows\": [ ] } To check the progress of an async search, use the search ID with the get async SQL search status API . GET _sql/async/status/FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU= If is_running and is_partial are false , the async search has finished with complete results. { \"id\": \"FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=\", \"is_running\": false, \"is_partial\": false, \"expiration_time_in_millis\": 1611690295000, \"completion_status\": 200 } To get the results, use the search ID with the get async SQL search API . If the search is still running, specify how long you’d like to wait using wait_for_completion_timeout . You can also specify the response format . GET _sql/async/FnR0TDhyWUVmUmVtWXRWZER4MXZiNFEad2F5UDk2ZVdTVHV1S0xDUy00SklUdzozMTU=?wait_for_completion_timeout=2s&format=json Change the search retention period By default, Elasticsearch stores async SQL searches for five days. After this period, Elasticsearch deletes the search and its results, even if the search is still running. To change this retention period, use the keep_alive parameter. POST _sql?format=json { \"keep_alive\": \"2d\", \"wait_for_completion_timeout\": \"2s\", \"query\": \"SELECT * FROM library ORDER BY page_count DESC\", \"fetch_size\": 5 } You can use the get async SQL search API’s keep_alive parameter to later change the retention period. The new period starts after the request runs. GET _sql/async/FmdMX2pIang3UWhLRU5QS0lqdlppYncaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQToxOTI=?keep_alive=5d&wait_for_completion_timeout=2s&format=json Use the delete async SQL search API to delete an async search before the keep_alive period ends. If the search is still running, Elasticsearch cancels it. DELETE _sql/async/delete/FmdMX2pIang3UWhLRU5QS0lqdlppYncaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQToxOTI= Store synchronous SQL searches By default, Elasticsearch only stores async SQL searches. To save a synchronous search, specify wait_for_completion_timeout and set keep_on_completion to true . POST _sql?format=json { \"keep_on_completion\": true, \"wait_for_completion_timeout\": \"2s\", \"query\": \"SELECT * FROM library ORDER BY page_count DESC\", \"fetch_size\": 5 } If is_partial and is_running are false , the search was synchronous and returned complete results. 
{ \"id\": \"Fnc5UllQdUVWU0NxRFNMbWxNYXplaFEaMUpYQ05oSkpTc3kwZ21EdC1tbFJXQTo0NzA=\", \"is_partial\": false, \"is_running\": false, \"rows\": ..., \"columns\": ..., \"cursor\": ... } You can get the same results later using the search ID with the get async SQL search API . Saved synchronous searches are still subject to the keep_alive retention period. When this period ends, Elasticsearch deletes the search results. You can also delete saved searches using the delete async SQL search API . Previous Use runtime fields Next SQL Translate API Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Change the search retention period Store synchronous SQL searches Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"Run an async SQL search | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-async","meta_description":"By default, SQL searches are synchronous. They wait for complete results before returning a response. However, results can take longer for searches across..."} +{"text":"Docs Release notes Troubleshoot Reference Explore and analyze Get started Solutions and use cases Manage data Deploy and manage Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Querying and filtering Query languages Query DSL ES|QL Get started Interfaces ES|QL _query API Kibana Elastic Security Query multiple sources Query multiple indices Query across clusters Examples Task management SQL Overview Getting started Conventions Security SQL REST API Overview Response Data Formats Paginating through a large response Filtering using Elasticsearch Query DSL Columnar results Passing parameters to a query Use runtime fields Run an async SQL search SQL Translate API SQL CLI SQL JDBC API usage SQL ODBC Driver installation Configuration SQL Client Applications DBeaver DbVisualizer Microsoft Excel Microsoft Power BI Desktop Microsoft PowerShell MicroStrategy Desktop Qlik Sense Desktop SQuirreL SQL SQL Workbench/J Tableau Desktop Tableau Server SQL Language Lexical Structure SQL Commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data Types Index patterns Frozen Indices Functions and Operators Comparison Operators Logical Operators Math Operators Cast Operators LIKE and RLIKE Operators Aggregate Functions Grouping Functions Date/Time and Interval Functions and Operators Full-Text Search Functions Mathematical Functions String Functions Type Conversion Functions Geo Functions Conditional Functions And Expressions System Functions Reserved keywords SQL Limitations EQL Example: Detect threats with EQL KQL Lucene query syntax Query tools Saved queries Console Search profiler Grok debugger Playground Aggregations Basics Filtering in Kibana Geospatial analysis Transforming data Overview Setup When to use transforms Generating alerts for transforms Transforms at scale How checkpoints work API quick reference Tutorial: Transforming the eCommerce sample data Examples Painless examples 
data Create filters from a map Filter a single layer Search across multiple indices Configure map settings Connect to Elastic Maps Service Import geospatial data Clean your data Tutorial: Index GeoJSON data Troubleshoot Graph Configure Graph Troubleshooting and limitations Legacy editors Aggregation-based TSVB Timelion Find and organize content Data views Saved objects Files Reports Tags Find apps and objects Reporting and sharing Automatically generate reports Troubleshooting CSV PDF/PNG Alerts and cases Alerts Getting started with alerts Set up Create and manage rules View alerts Rule types Index threshold Elasticsearch query Tracking containment Rule action variables Notifications domain allowlist Troubleshooting and limitations Common Issues Event log index Test connectors Maintenance windows Watcher Getting started with Watcher How Watcher works Enable Watcher Watcher UI Encrypting sensitive data in Watcher Inputs Simple input Search input HTTP input Chain input Triggers Schedule trigger Throttling Schedule Types Conditions Always condition Never condition Compare condition Array compare condition Script condition Actions Running an action for each element in an array Adding conditions to actions Email action Webhook action Index action Logging action Slack action PagerDuty action Jira action Transforms Search payload transform Script payload transform Chain payload transform Managing watches Example watches Watching the status of an Elasticsearch cluster Limitations Cases Configure access to cases Open and manage cases Configure case settings Numeral formatting Loading Docs / Explore and analyze / Querying and filtering Elastic Stack Serverless You can use Elasticsearch as a basic document store to retrieve documents and their metadata. However, the real power of Elasticsearch comes from its advanced search and analytics capabilities. Elasticsearch makes JSON documents searchable and aggregatable. The documents are stored in an index or data stream , which represent one type of data. Searchable means that you can filter the documents for conditions.** For example, you can filter for data \"within the last 7 days\" or data that \"contains the word Kibana\". Kibana provides many ways for you to construct filters, which are also called queries or search terms. Aggregatable means that you can extract summaries from matching documents.** The simplest aggregation is count , and it is frequently used in combination with the date histogram , to see count over time. The terms aggregation shows the most frequent values. Querying You’ll use a combination of an API endpoint and a query language to interact with your data. Elasticsearch provides a number of query languages . From Query DSL to the newest ES|QL, find the one that's most appropriate for you. You can call Elasticsearch's REST APIs by submitting requests directly from the command line or through the Dev Tools Console in Kibana. From your applications, you can use a client in your programming language of choice. A number of tools are available for you to save, debug, and optimize your queries. If you're just getting started with Elasticsearch, try the hands-on API quickstart to learn how to add data and run basic searches using Query DSL and the _search endpoint. Filtering When querying your data in Kibana, additional options let you filter the results to just the subset you need. Some of these options are common to most Elastic apps. Check Filtering in Kibana for more details on how to recognize and use them in the UI. 
","title":"Querying and filtering | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter","meta_description":"You can use Elasticsearch as a basic document store to retrieve documents and their metadata. However, the real power of Elasticsearch comes from its..."}
+{"text":"
To start using Elasticsearch SQL, create an index with some data to experiment with: PUT /library/_bulk?refresh {\"index\":{\"_id\": \"Leviathan Wakes\"}} {\"name\": \"Leviathan Wakes\", \"author\": \"James S.A. Corey\", \"release_date\": \"2011-06-02\", \"page_count\": 561} {\"index\":{\"_id\": \"Hyperion\"}} {\"name\": \"Hyperion\", \"author\": \"Dan Simmons\", \"release_date\": \"1989-05-26\", \"page_count\": 482} {\"index\":{\"_id\": \"Dune\"}} {\"name\": \"Dune\", \"author\": \"Frank Herbert\", \"release_date\": \"1965-06-01\", \"page_count\": 604} And now you can execute SQL using the SQL search API : POST /_sql?format=txt { \"query\": \"SELECT * FROM library WHERE release_date < '2000-01-01'\" } Which should return something along the lines of: author | name | page_count | release_date ---------------+---------------+---------------+------------------------ Dan Simmons |Hyperion |482 |1989-05-26T00:00:00.000Z Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z You can also use the SQL CLI . There is a script to start it shipped in the Elasticsearch bin directory: $ ./bin/elasticsearch-sql-cli From there you can run the same query: sql> SELECT * FROM library WHERE release_date < '2000-01-01'; author | name | page_count | release_date ---------------+---------------+---------------+------------------------ Dan Simmons |Hyperion |482 |1989-05-26T00:00:00.000Z Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z
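The SQL search API supports several response formats; for example, the same query can be run with format=json to get the columns and rows back as JSON objects instead of a text table: POST /_sql?format=json { \"query\": \"SELECT * FROM library WHERE release_date < '2000-01-01'\" }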
","title":"Getting Started with SQL | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-getting-started","meta_description":"To start using Elasticsearch SQL, create an index with some data to experiment with: And now you can execute SQL using the SQL search API: Which should..."}
+{"text":"
The SQL search API accepts SQL in a JSON document, executes it, and returns the results. For example: POST /_sql?format=txt { \"query\": \"SELECT * FROM library ORDER BY page_count DESC LIMIT 5\" } Which returns: author | name | page_count | release_date -----------------+--------------------+---------------+------------------------ Peter F. Hamilton|Pandora's Star |768 |2004-03-02T00:00:00.000Z Vernor Vinge |A Fire Upon the Deep|613 |1992-06-01T00:00:00.000Z Frank Herbert |Dune |604 |1965-06-01T00:00:00.000Z Alastair Reynolds|Revelation Space |585 |2000-03-15T00:00:00.000Z James S.A. Corey |Leviathan Wakes |561 |2011-06-02T00:00:00.000Z Using Kibana Console If you are using Kibana Console (which is highly recommended), take advantage of the triple quotes \"\"\" when creating the query. This not only automatically escapes double quotes ( \" ) inside the query string but also supports writing the query across multiple lines.","title":"Overview | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-rest-overview","meta_description":"The SQL search API accepts SQL in a JSON document, executes it, and returns the results.
For example: Which returns: "}
+{"text":"
Large queries may throw ParsingException Extremely large queries can consume too much memory during the parsing phase, in which case the Elasticsearch SQL engine will abort parsing and throw an error. In such cases, consider reducing the query to a smaller size by potentially simplifying it or splitting it into smaller queries. Nested fields in SYS COLUMNS and DESCRIBE TABLE Elasticsearch has a special type of relationship fields called nested fields. In Elasticsearch SQL they can be used by referencing their inner sub-fields. Even though SYS COLUMNS in non-driver mode (in the CLI and in REST calls) and DESCRIBE TABLE will still display them as having the type NESTED , they cannot be used in a query. One can only reference their sub-fields in the form: [nested_field_name].[sub_field_name] For example: SELECT dep.dep_name.keyword FROM test_emp GROUP BY languages; Scalar functions on nested fields are not allowed in WHERE and ORDER BY clauses Elasticsearch SQL doesn’t support the usage of scalar functions on top of nested fields in WHERE and ORDER BY clauses, with the exception of comparison and logical operators. For example: SELECT * FROM test_emp WHERE LENGTH(dep.dep_name.keyword) > 5; and SELECT * FROM test_emp ORDER BY YEAR(dep.start_date); are not supported, but: SELECT * FROM test_emp WHERE dep.start_date >= CAST('2020-01-01' AS DATE) OR dep.dep_end_date IS NULL; is supported. Multi-nested fields Elasticsearch SQL doesn’t support multi-nested documents, so a query cannot reference more than one nested field in an index. This applies to multi-level nested fields, but also to multiple nested fields defined on the same level. For example, for this index: column | type | mapping ----------------------+---------------+------------- nested_A |STRUCT |NESTED nested_A.nested_X |STRUCT |NESTED nested_A.nested_X.text|VARCHAR |KEYWORD nested_A.text |VARCHAR |KEYWORD nested_B |STRUCT |NESTED nested_B.text |VARCHAR |KEYWORD nested_A and nested_B cannot be used at the same time, nor can nested_A or nested_B be combined with nested_A.nested_X . For such situations, Elasticsearch SQL will display an error message. Paginating nested inner hits When SELECTing a nested field, pagination will not work as expected; Elasticsearch SQL will return at least the page size of records. This is because of the way nested queries work in Elasticsearch: the root nested field is returned together with its matching inner nested fields, and pagination takes place on the root nested document and not on its inner hits . Normalized keyword fields keyword fields in Elasticsearch can be normalized by defining a normalizer . Such fields are not supported in Elasticsearch SQL. Array type of fields Array fields are not supported due to the \"invisible\" way in which Elasticsearch handles an array of values: the mapping doesn’t indicate whether a field is an array (has multiple values) or not, so without reading all the data, Elasticsearch SQL cannot know whether a field is single- or multi-valued. When multiple values are returned for a field, by default, Elasticsearch SQL will throw an exception. However, it is possible to change this behavior through the field_multi_value_leniency parameter in REST (disabled by default) or field.multi.value.leniency in drivers (enabled by default).
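For example, leniency can be enabled per request so that one of the values of a multi-valued field is returned (with no guarantee as to which one) instead of an exception being thrown: POST /_sql?format=txt { \"field_multi_value_leniency\": true, \"query\": \"SELECT name FROM library\" }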
Sorting by aggregation When doing aggregations ( GROUP BY ) Elasticsearch SQL relies on Elasticsearch's composite aggregation for its support for paginating results. However, this type of aggregation does come with a limitation: sorting can only be applied on the key used for the aggregation's buckets. Elasticsearch SQL overcomes this limitation by doing client-side sorting; however, as a safety measure, it allows only up to 65535 rows. It is recommended to use LIMIT for queries that use sorting by aggregation, essentially indicating the top N results that are desired: SELECT * FROM test GROUP BY age ORDER BY COUNT(*) LIMIT 100; It is possible to run the same queries without a LIMIT; however, in that case, if the maximum size ( 10000 ) is exceeded, an exception will be returned as Elasticsearch SQL is unable to track (and sort) all the results returned. Moreover, the aggregation(s) used in the ORDER BY must be only plain aggregate functions. No scalar functions or operators can be used, and therefore no complex columns that combine two or more aggregate functions can be used for ordering. Here are some examples of queries that are not allowed : SELECT age, ROUND(AVG(salary)) AS avg FROM test GROUP BY age ORDER BY avg; SELECT age, MAX(salary) - MIN(salary) AS diff FROM test GROUP BY age ORDER BY diff; Using a sub-select Using sub-selects ( SELECT X FROM (SELECT Y) ) is supported to a small degree : any sub-select that can be \"flattened\" into a single SELECT is possible with Elasticsearch SQL. For example: SELECT * FROM (SELECT first_name, last_name FROM emp WHERE last_name NOT LIKE '%a%') WHERE first_name LIKE 'A%' ORDER BY 1; first_name | last_name ---------------+--------------- Alejandro |McAlpine Anneke |Preusig Anoosh |Peyn Arumugam |Ossenbruggen The query above is possible because it is equivalent to: SELECT first_name, last_name FROM emp WHERE last_name NOT LIKE '%a%' AND first_name LIKE 'A%' ORDER BY 1; But, if the sub-select includes a GROUP BY or HAVING , or the enclosing SELECT is more complex than SELECT X FROM (SELECT ...) WHERE [simple_condition] , this is currently unsupported . Using FIRST / LAST aggregation functions in HAVING clause Using FIRST and LAST in the HAVING clause is not supported. The same applies to MIN and MAX when their target column is of type keyword or unsigned_long as they are internally translated to FIRST and LAST . Using TIME data type in GROUP BY or HISTOGRAM Using TIME data type as a grouping key is currently not supported. For example: SELECT count(*) FROM test GROUP BY CAST(date_created AS TIME); On the other hand, it can still be used if it’s wrapped with a scalar function that returns another data type, for example: SELECT count(*) FROM test GROUP BY MINUTE(CAST(date_created AS TIME)); TIME data type is also currently not supported in the histogram grouping function. For example: SELECT HISTOGRAM(CAST(birth_date AS TIME), INTERVAL '10' MINUTES) as h, COUNT(*) FROM t GROUP BY h Geo-related functions Since geo_shape fields don’t have doc values, these fields cannot be used for filtering, grouping or sorting. By default, geo_point fields are indexed and have doc values. However, only latitude and longitude are stored and indexed with some loss of precision from the original values (4.190951585769653E-8 for the latitude and 8.381903171539307E-8 for longitude). The altitude component is accepted but not stored in doc values nor indexed. Therefore calling the ST_Z function in filtering, grouping or sorting will return null .
Retrieving using the fields search parameter Elasticsearch SQL retrieves column values using the search API’s fields parameter . Any limitations on the fields parameter also apply to Elasticsearch SQL queries. For example, if _source is disabled for any of the returned fields or at index level, the values cannot be retrieved. Aggregations in the PIVOT clause The aggregation expression in PIVOT will currently accept only one aggregation. It is thus not possible to obtain multiple aggregations for any one pivoted column. Using a subquery in PIVOT 's IN -subclause The values that the PIVOT query could pivot must be provided in the query as a list of literals; providing a subquery instead to build this list is not currently supported. For example, in this query: SELECT * FROM test_emp PIVOT (SUM(salary) FOR languages IN (1, 2)) the languages of interest must be listed explicitly: IN (1, 2) . On the other hand, this example would not work : SELECT * FROM test_emp PIVOT (SUM(salary) FOR languages IN (SELECT languages FROM test_emp WHERE languages <=2 GROUP BY languages))
","title":"SQL Limitations | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-limitations","meta_description":"Extremely large queries can consume too much memory during the parsing phase, in which case the Elasticsearch SQL engine will abort parsing and throw..."}
+{"text":"
When you create a follower index, you reference the remote cluster and the leader index in your remote cluster. To create a follower index from Stack Management in Kibana: Select Cross-Cluster Replication in the side navigation and choose the Follower Indices tab. Choose the cluster (ClusterA) containing the leader index you want to replicate. Enter the name of the leader index, which is kibana_sample_data_ecommerce if you are following the tutorial. Enter a name for your follower index, such as follower-kibana-sample-data . Elasticsearch initializes the follower using the remote recovery process, which transfers the existing Lucene segment files from the leader index to the follower index. The index status changes to Paused . When the remote recovery process is complete, the index following begins and the status changes to Active .
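You can check on the remote recovery and replication progress at any time; for example, assuming the follower index name suggested above, GET /follower-kibana-sample-data/_ccr/stats returns shard-level replication statistics for the follower index.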
When you index documents into your leader index, Elasticsearch replicates the documents in the follower index. API example You can also use the create follower API to create follower indices. When you create a follower index, you must reference the remote cluster and the leader index that you created in the remote cluster. When initiating the follower request, the response returns before the remote recovery process completes. To wait for the process to complete, add the wait_for_active_shards parameter to your request. PUT /server-metrics-follower/_ccr/follow?wait_for_active_shards=1 { \"remote_cluster\" : \"leader\", \"leader_index\" : \"server-metrics\" } Use the get follower stats API to inspect the status of replication.","title":"Create a follower index to replicate a specific index | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/tools/cross-cluster-replication/ccr-getting-started-follower-index","meta_description":"When you create a follower index, you reference the remote cluster and the leader index in your remote cluster.
To create a follower index from Stack..."}
+{"text":"
Deploy the operator Deploy an Elasticsearch instance with a route Deploy a Kibana instance with a route Deploy Docker images with anyuid SCC Grant privileged permissions to Beats Grant host access permission to Elastic Agent Deploy ECK on GKE Autopilot FIPS compatibility Air-gapped environments Configure Apply configuration settings Configure the validating webhook Restrict cross-namespace resource associations Service meshes Istio Linkerd Webhook namespace selectors Manage deployments Deploy an Elasticsearch cluster Deploy a Kibana instance Elastic Stack Helm chart Applying updates Accessing services Configure deployments Elasticsearch configuration Nodes orchestration Storage recommendations Node configuration Volume claim templates Virtual memory Settings managed by ECK Custom configuration files and plugins Init containers for plugin downloads Update strategy Pod disruption budget Advanced Elasticsearch node scheduling Readiness probe Pod PreStop hook Security context Requests routing to Elasticsearch nodes Kibana configuration Connect to an Elasticsearch cluster Advanced configuration Install Kibana plugins Customize pods Manage compute resources Recipes Connect to external Elastic resources Elastic Stack configuration policies Orchestrate other Elastic applications APM server Use an Elasticsearch cluster managed by ECK Advanced configuration Connect to the APM Server Standalone Elastic Agent Quickstart Configuration Configuration examples Fleet-managed Elastic Agent Quickstart Configuration Configuration Examples Known limitations Elastic Maps Server Deploy Elastic Maps Server Map data Advanced configuration Elastic Maps HTTP configuration Beats Quickstart Configuration Configuration Examples Troubleshooting Logstash Quickstart Configuration Securing Logstash API Logstash plugins Configuration examples Update Strategy Advanced configuration Create custom images Tools and APIs Self-managed cluster Deploy an Elasticsearch cluster Local installation (quickstart) Important system configuration Configure system settings Disable swapping Increase the file descriptor limit Increase virtual memory Increase max number of threads DNS cache settings Ensure JNA temporary directory permits executables Decrease the TCP retransmission timeout Bootstrap checks Install on Linux or MacOS Install on Windows Install with Debian package Install with RPM package Install with Docker Single-node cluster Multi-node cluster Production settings Configure Configure Elasticsearch Important settings configuration Add plugins Install Kibana Linux and MacOS Windows Debian RPM Docker Configure Kibana Access Kibana Air gapped install Tools and APIs Distributed architecture Clusters, nodes, and shards Node roles Reading and writing documents Shard allocation, relocation, and recovery Shard allocation awareness Index-level shard allocation Delaying allocation when a node leaves The shard request cache Discovery and cluster formation Discovery Quorum-based decision making Voting configurations Bootstrapping a cluster Cluster state Cluster fault detection Kibana task management Production guidance Run Elasticsearch in production Design for resilience Resilience in small clusters Resilience in larger clusters Resilience in ECH and ECE Scaling considerations Performance optimizations General recommendations Tune for indexing speed Tune for search speed Tune approximate kNN search Tune for disk usage Size your shards Run Kibana in production High availability and load balancing Configure memory Manage background tasks 
Optimize alerting performance Reporting production considerations Reference architectures Hot/Frozen - High Availability Stack settings Backup, high availability, and resilience tools Snapshot and restore Manage snapshot repositories Self-managed Azure repository Google Cloud Storage repository S3 repository Shared file system repository Read-only URL repository Source-only repository Elastic Cloud Hosted Configure a snapshot repository using AWS S3 Configure a snapshot repository using GCS Configure a snapshot repository using Azure Blob storage Access isolation for the found-snapshots repository Repository isolation on Azure Repository isolation on AWS and GCP Elastic Cloud Enterprise AWS S3 repository Google Cloud Storage (GCS) repository Azure Storage repository Minio on-premise repository Elastic Cloud on Kubernetes Create snapshots Restore a snapshot Restore a snapshot across clusters Restore snapshot into a new deployment Restore snapshot into an existing deployment Restore snapshots containing searchable snapshots indices across clusters Searchable snapshots Cross-cluster replication Set up cross-cluster replication Prerequisites Connect to a remote cluster Configure privileges for cross-cluster replication Create a follower index to replicate a specific index Create an auto-follow pattern to replicate time series indices Manage cross-cluster replication Inspect replication statistics Pause and resume replication Recreate a follower index Terminate replication Manage auto-follow patterns Create auto-follow patterns Retrieve auto-follow patterns Pause and resume auto-follow patterns Delete auto-follow patterns Upgrading clusters Uni-directional index following Bi-directional index following Uni-directional disaster recovery Prerequisites Failover when clusterA is down Failback when clusterA comes back Bi-directional disaster recovery Initial setup Failover when clusterA is down Failback when clusterA comes back Perform update or delete by query Autoscaling In ECE and ECH In ECK Autoscaling deciders Trained model autoscaling Security Secure your orchestrator Elastic Cloud Enterprise Manage security certificates Allow x509 Certificates Signed with SHA-1 Configure the TLS version Migrate ECE on Podman hosts to SELinux enforce Elastic Cloud on Kubernetes Secure your cluster or deployment Self-managed security setup Automatic security setup Minimal security setup Set up transport TLS Set up HTTPS Configure security in Kibana Manage TLS encryption Self-managed Update TLS certificates With the same CA With a different CA Mutual authentication Supported SSL/TLS versions by JDK version Enabling cipher suites for stronger encryption ECK Manage HTTP certificates on ECK Manage transport certificates on ECK Traffic filtering IP traffic filtering In ECH or ECE Manage traffic filters through the API In ECK and Self Managed Private link traffic filters AWS PrivateLink traffic filters Azure Private Link traffic filters GCP Private Service Connect traffic filters Claim traffic filter link ID ownership through the API Kubernetes network policies Elastic Cloud Static IPs Kibana session management Encrypt your deployment data Use a customer-managed encryption key Secure your settings Secure settings on ECK Secure Kibana saved objects Security event audit logging Enable audit logging Configure audit logging Elasticsearch audit events ignore policies Elasticsearch logfile output Audit Elasticsearch search queries Correlate audit events FIPS 140-2 compliance Secure other Elastic Stack components Securing 
HTTP client applications Limitations Users and roles Cloud organization Manage users User roles and privileges Configure SAML SSO Okta Microsoft Entra ID ECE orchestrator Manage system passwords Manage users and roles Native users Active Directory LDAP SAML Configure SSO for deployments Serverless project custom roles Cluster or deployment Quickstart User authentication Authentication realms Realm chains Security domains Internal authentication Native File-based External authentication Active Directory JWT Kerberos LDAP OpenID Connect With Azure, Google, or Okta SAML With Microsoft Entra ID PKI Custom realms Built-in users Change passwords Orchestrator-managed users ECH and ECE ECK managed credentials Kibana authentication Kibana access agreement Anonymous access Token-based authentication services Service accounts Internal users Operator privileges Configure operator privileges Operator-only functionality Operator privileges for snapshot and restore User profiles Looking up users without authentication Controlling the user cache Manage authentication for multiple clusters User roles Built-in roles Defining roles Role structure For data streams and aliases Using Kibana Role restriction Elasticsearch privileges Kibana privileges Map users and groups to roles Role mapping properties Authorization delegation Authorization plugins Control access at the document and field level Submit requests on behalf of other users Spaces API keys Elasticsearch API keys Serverless project API keys Elastic Cloud API keys Elastic Cloud Enterprise API keys Connectors Remote clusters Elastic Cloud Hosted Within the same Elastic Cloud organization With a different Elastic Cloud organization With Elastic Cloud Enterprise With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Elastic Cloud Enterprise Within the same ECE environment With a different ECE environment With Elastic Cloud With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Self-managed Elastic Stack Add remote clusters using API key authentication Add remote clusters using TLS certificate authentication Migrate from certificate to API key authentication Remote cluster settings Elastic Cloud on Kubernetes Monitoring AutoOps How to access AutoOps AutoOps events Views Overview Deployment Nodes Indices Shards Template Optimizer Notifications Settings Event Settings Dismiss Events AutoOps regions AutoOps FAQ Stack monitoring Enable on ECH and ECE Enable on ECK Self-managed: Elasticsearch Collecting monitoring data with Elastic Agent Collecting monitoring data with Metricbeat Collecting log data with Filebeat Monitoring in a production environment Legacy collection methods Collectors Exporters Local exporters HTTP exporters Pausing data collection Self-managed: Kibana Collect monitoring data with Elastic Agent Collect monitoring data with Metricbeat Legacy collection methods Access monitoring data in Kibana Visualizing monitoring data Beats metrics Elasticsearch metrics Kibana metrics Logstash metrics Troubleshooting Stack monitoring alerts Configuring monitoring data streams and indices Configuring data streams created by Elastic Agent Configuring data streams created by Metricbeat 8 Configuring indices created by Metricbeat 7 or internal collection Cloud deployment health Performance metrics on Elastic Cloud JVM memory pressure indicator Kibana task manager monitoring Monitoring orchestrators ECK 
The TLS and SSL protocols use a cipher suite that determines the strength of encryption used to protect the data. You may want to increase the strength of encryption used when using an Oracle JVM; the IcedTea OpenJDK ships without these restrictions in place. This step is not required to successfully use encrypted communication. The Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files enable the use of additional cipher suites for Java in a separate JAR file that you need to add to your Java installation. You can download this JAR file from Oracle’s download page. The JCE Unlimited Strength Jurisdiction Policy Files are required for encryption with key lengths greater than 128 bits, such as 256-bit AES encryption. After installation, all cipher suites in the JCE are available for use, but they require configuration before they can be used. To enable the use of stronger cipher suites with Elasticsearch security features, configure the cipher_suites parameter. Note: The JCE Unlimited Strength Jurisdiction Policy Files must be installed on all nodes in the cluster to establish an improved level of encryption strength. 
","title":"Enabling cipher suites for stronger encryption | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/security/enabling-cipher-suites-for-stronger-encryption","meta_description":"The TLS and SSL protocols use a cipher suite that determines the strength of encryption used to protect the data. You may want to increase the strength..."} +{"text":"
For clarity, it is important to establish the meaning behind certain words, as the same wording might convey different meanings to different readers depending on one’s familiarity with SQL versus Elasticsearch. Note: This documentation, while trying to be complete, does assume the reader has a basic understanding of Elasticsearch and/or SQL. If that is not the case, continue reading the documentation, but take notes and pursue the topics that are unclear, either through the main Elasticsearch documentation or through the plethora of SQL material available in the open (there are simply too many excellent resources to enumerate). As a general rule, Elasticsearch SQL, as the name indicates, provides a SQL interface to Elasticsearch. As such, it follows SQL terminology and conventions first, whenever possible. However, the backing engine itself is Elasticsearch, for which Elasticsearch SQL was purposely created, which is why features or concepts that are not available in SQL, or cannot be mapped correctly to it, appear in Elasticsearch SQL. Last but not least, Elasticsearch SQL tries to obey the principle of least surprise, though, as with all things, everything is relative. Mapping concepts across SQL and Elasticsearch: While SQL and Elasticsearch have different terms for the way the data is organized (and different semantics), essentially their purpose is the same. Starting from the bottom, the concepts map roughly as follows (SQL term / Elasticsearch equivalent): column / field: In both cases, at the lowest level, data is stored in named entries, of a variety of data types, each containing one value. SQL calls such an entry a column, while Elasticsearch calls it a field. Notice that in Elasticsearch a field can contain multiple values of the same type (essentially a list), while in SQL a column can contain exactly one value of said type. Elasticsearch SQL will do its best to preserve the SQL semantics and, depending on the query, reject queries that return fields with more than one value. row / document: Columns and fields do not exist by themselves; they are part of a row or a document. The two have slightly different semantics: a row tends to be strict (and have more enforcements), while a document tends to be a bit more flexible or loose (while still having a structure). 
table / index: The target against which queries, whether in SQL or Elasticsearch, are executed. schema / implicit: In an RDBMS, a schema is mainly a namespace of tables and is typically used as a security boundary. Elasticsearch does not provide an equivalent concept for it. However, when security is enabled, Elasticsearch automatically applies the security enforcement so that a role sees only the data it is allowed to see (in SQL jargon, its schema). catalog or database / cluster instance: In SQL, catalog or database are used interchangeably and represent a set of schemas, that is, a number of tables. In Elasticsearch, the set of available indices is grouped in a cluster. The semantics also differ a bit: a database is essentially yet another namespace (which can have some implications on the way data is stored), while an Elasticsearch cluster is a runtime instance, or rather a set of at least one Elasticsearch instance (typically running distributed). In practice this means that while in SQL one can potentially have multiple catalogs inside an instance, in Elasticsearch one is restricted to only one. cluster / cluster (federated): Traditionally in SQL, cluster refers to a single RDBMS instance which contains a number of catalogs or databases (see above). The same word can be reused inside Elasticsearch as well, however its semantics are clarified a bit. While an RDBMS tends to have only one running instance, on a single machine (not distributed), Elasticsearch goes the opposite way and, by default, is distributed and multi-instance. Furthermore, an Elasticsearch cluster can be connected to other clusters in a federated fashion, so cluster can mean: a single cluster, that is, multiple Elasticsearch instances typically distributed across machines, running within the same namespace; or multiple clusters, each with its own namespace, connected to each other in a federated setup (see Cross-cluster search). As one can see, while the mapping between the concepts is not exactly one-to-one and the semantics are somewhat different, there are more things in common than differences. In fact, thanks to SQL’s declarative nature, many concepts map across to Elasticsearch transparently, and the terminology of the two is likely to be used interchangeably throughout the rest of the material. 
","title":"Conventions and Terminology | Elastic Docs","url":"https://www.elastic.co/docs/explore-analyze/query-filter/languages/sql-concepts","meta_description":"For clarity, it is important to establish the meaning behind certain words as, the same wording might convey different meanings to different readers depending..."} +{"text":"
website metricset Istio module Istio citadel metricset Istio galley metricset Istio istiod metricset Istio mesh metricset Istio mixer metricset Istio pilot metricset Istio proxy metricset Jolokia module Jolokia jmx metricset Kafka module Kafka broker metricset Kafka consumer metricset Kafka consumergroup metricset Kafka partition metricset Kafka producer metricset Kibana module Kibana cluster_actions metricset Kibana cluster_rules metricset Kibana node_actions metricset Kibana node_rules metricset Kibana stats metricset Kibana status metricset Kubernetes module Kubernetes apiserver metricset Kubernetes container metricset Kubernetes controllermanager metricset Kubernetes event metricset Kubernetes node metricset Kubernetes pod metricset Kubernetes proxy metricset Kubernetes scheduler metricset Kubernetes state_container metricset Kubernetes state_cronjob metricset Kubernetes state_daemonset metricset Kubernetes state_deployment metricset Kubernetes state_job metricset Kubernetes state_node metricset Kubernetes state_persistentvolumeclaim metricset Kubernetes state_pod metricset Kubernetes state_replicaset metricset Kubernetes state_resourcequota metricset Kubernetes state_service metricset Kubernetes state_statefulset metricset Kubernetes state_storageclass metricset Kubernetes system metricset Kubernetes volume metricset KVM module KVM dommemstat metricset KVM status metricset Linux module Linux conntrack metricset Linux iostat metricset Linux ksm metricset Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway 
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
If you use file-based user authentication, the elasticsearch-users command enables you to add and remove users, assign user roles, and manage passwords per node. Synopsis bin/elasticsearch-users ([useradd <username>] [-p <password>] [-r <roles>]) | ([list] <username>) | ([passwd <username>] [-p <password>]) | ([roles <username>] [-a <roles>] [-r <roles>]) | ([userdel <username>]) Description If you use the built-in file internal realm, users are defined in local files on each node in the cluster. Usernames and roles must be at least 1 and no more than 1024 characters. They can contain alphanumeric characters (a-z, A-Z, 0-9), spaces, punctuation, and printable symbols in the Basic Latin (ASCII) block. Leading or trailing whitespace is not allowed. Passwords must be at least 6 characters long. For more information, see File-based user authentication. Tip To ensure that Elasticsearch can read the user and role information at startup, run elasticsearch-users useradd as the same user you use to run Elasticsearch. 
Running the command as root or some other user updates the permissions for the users and users_roles files and prevents Elasticsearch from accessing them. Parameters -a <roles> If used with the roles parameter, adds a comma-separated list of roles to a user. list Lists the users that are registered with the file realm on the local node. If you also specify a user name, the command provides information for that user. -p <password> Specifies the user's password. If you do not specify this parameter, the command prompts you for the password. Tip Omit the -p option to keep plaintext passwords out of the terminal session's command history. passwd <username> Resets a user's password. You can specify the new password directly with the -p parameter. -r <roles> If used with the useradd parameter, defines a user's roles. This option accepts a comma-separated list of role names to assign to the user. If used with the roles parameter, removes a comma-separated list of roles from a user. roles <username> Manages the roles of a particular user. You can combine adding and removing roles within the same command to change a user's roles. useradd <username> Adds a user to your local node. userdel <username> Deletes a user from your local node. Examples The following example adds a new user named jacknich to the file realm. The password for this user is theshining, and this user is associated with the network and monitoring roles. bin/elasticsearch-users useradd jacknich -p theshining -r network,monitoring The following example lists the users that are registered with the file realm on the local node: bin/elasticsearch-users list rdeniro : admin alpacino : power_user jacknich : monitoring,network Users are in the left-hand column and their corresponding roles are listed in the right-hand column. The following example resets the jacknich user's password: bin/elasticsearch-users passwd jacknich Since the -p parameter was omitted, the command prompts you to enter and confirm a password in interactive mode. The following example removes the network and monitoring roles from the jacknich user and adds the user role: bin/elasticsearch-users roles jacknich -r network,monitoring -a user The following example deletes the jacknich user: bin/elasticsearch-users userdel jacknich 
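A minimal sketch of the safe pattern described in the tip above, assuming a package-based install where Elasticsearch runs as a dedicated elasticsearch system user, the command is run from the Elasticsearch home directory, and the file realm files live under /etc/elasticsearch (all assumptions; the service account and paths vary by installation method):
# add the user as the same account Elasticsearch runs as, letting the command prompt for the password interactively
sudo -u elasticsearch bin/elasticsearch-users useradd jacknich -r network,monitoring
# confirm the users and users_roles files are still readable by that account
ls -l /etc/elasticsearch/users /etc/elasticsearch/users_roles
Omitting -p keeps the plaintext password out of the shell history, and running the command through sudo -u avoids leaving root-owned files that Elasticsearch cannot read at startup. 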
","title":"elasticsearch-users | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/users-command","meta_description":"If you use file-based user authentication, the elasticsearch-users command enables you to add and remove users, assign user roles, and manage passwords..."} +{"text":"
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin 
Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
elasticsearch-syskeygen The elasticsearch-syskeygen command creates a system key file in the elasticsearch config directory. Synopsis bin/elasticsearch-syskeygen [-E <KeyValuePair>] [-h, --help] ([-s, --silent] | [-v, --verbose]) Description The command generates a system_key file, which you can use to symmetrically encrypt sensitive data. For example, you can use this key to prevent Watcher from returning and storing information that contains clear text credentials. See Encrypting sensitive data in Watcher. Important The system key is a symmetric key, so the same key must be used on every node in the cluster. Parameters -E <KeyValuePair> Configures a setting. For example, if you have a custom installation of Elasticsearch, you can use this parameter to specify the ES_PATH_CONF environment variable. -h, --help Returns all of the command parameters. -s, --silent Shows minimal output. -v, --verbose Shows verbose output. 
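For illustration only (this example is not part of the original reference page, and the path /opt/elasticsearch/config is hypothetical): on an installation whose configuration directory is not the default, the configuration location could be supplied through the ES_PATH_CONF environment variable when running the tool, for example ES_PATH_CONF=/opt/elasticsearch/config bin/elasticsearch-syskeygen, in which case the generated system_key file should be written to that directory rather than to $ES_HOME/config. 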
Examples The following command generates a system_key file in the default $ES_HOME/config directory: bin/elasticsearch-syskeygen","title":"elasticsearch-syskeygen | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/syskeygen","meta_description":"The elasticsearch-syskeygen command creates a system key file in the elasticsearch config directory. The command generates a system_key file, which you..."} +{"text":"
User roles After a user is authenticated, Elastic Stack needs to determine whether the user behind an 
incoming request is allowed to execute the request. The primary method of authorization in a cluster is role-based access control (RBAC), although Elastic Stack also supports Attribute-based access control (ABAC). Tip If you use Elastic Cloud Enterprise or Elastic Cloud Hosted, then you can also implement RBAC at the level of your Elastic Cloud Enterprise orchestrator or Elastic Cloud organization. If you use Elastic Cloud Serverless, then you can only manage RBAC at the Elastic Cloud organization level. You must authenticate users at the same level where you implement RBAC. For example, if you want to use organization-level roles, then you must authenticate your users at the organization level. How role-based access control works Role-based access control (RBAC) enables you to authorize users by assigning privileges to roles and assigning roles to users or groups. This is the primary way of controlling access to resources in Elastic Stack. The authorization process revolves around the following constructs: Secured Resource A resource to which access is restricted. Indices, aliases, documents, fields, users, and the Elasticsearch cluster itself are all examples of secured objects. Privilege A named group of one or more actions that a user may execute against a secured resource. Each secured resource has its own set of available privileges. For example, read is an index privilege that represents all actions that enable reading the indexed/stored data. For a complete list of available privileges, see Elasticsearch privileges. Permissions A set of one or more privileges against a secured resource. Permissions can easily be described in words; here are a few examples: read privilege on the products data stream or index, manage privilege on the cluster, run_as privilege on the john user, read privilege on documents that match query X, read privilege on the credit_card field. Role A named set of permissions. User The authenticated user. Group One or more groups to which a user belongs. Groups are not supported in some realms, such as native, file, or PKI realms. A role has a unique name and identifies a set of permissions that translate to privileges on resources. You can associate a user or group with an arbitrary number of roles. When you map roles to groups, the roles of a user in that group are the combination of the roles assigned to that group and the roles assigned to that user. Likewise, the total set of permissions that a user has is defined by the union of the permissions in all its roles. Set up user authorization using RBAC Review these topics to learn how to configure RBAC in your cluster or deployment: learn about built-in roles, define your own roles, learn about the Elasticsearch and Kibana privileges you can assign to roles, and learn how to control access at the document and field level. Assign roles to users The way that you assign roles to users depends on your authentication realm. Native realm: using the Elasticsearch API _security endpoints, or in Kibana, using the Stack Management > Security > Users page. File realm: using a user_roles file; in ECK, as part of a basic authentication secret. External realms: by mapping users and groups to roles. Advanced topics Learn how to delegate authorization to another realm, how to build a custom authorization plugin for unsupported systems or advanced applications, how to submit requests on behalf of other users, and about attribute-based access control. Tip User roles are also used to control access to Kibana spaces. 
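To make the constructs above concrete, here is an illustrative sketch; it is not taken from the original page, and the role name products_read and the username jsmith are hypothetical. A role granting the read privilege on the products indices could be created through the Elasticsearch security API with PUT /_security/role/products_read and a request body of { \"indices\": [ { \"names\": [ \"products*\" ], \"privileges\": [ \"read\" ] } ] }. A native-realm user could then be assigned that role with PUT /_security/user/jsmith and a request body of { \"password\": \"<a strong password>\", \"roles\": [ \"products_read\" ] }. Any additional roles listed in the roles array contribute their own permissions, and the user ends up with the union of all of them, as described above. 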
Attribute-based access control Attribute-based access control (ABAC) enables you to use attributes to restrict access to documents in search queries and aggregations. For example, you can assign attributes to users and documents, then implement an access policy in a role definition. Users with that role can read a specific document only if they have all the required attributes. For more information, see Document-level attribute-based access control with Elasticsearch.","title":"User roles | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/user-roles","meta_description":"After a user is authenticated, Elastic Stack needs to determine whether the user behind an incoming request is allowed to execute the request. The primary..."} +{"text":"
Linux memory metricset Linux pageinfo metricset Linux pressure metricset Linux rapl metricset Logstash module Logstash node metricset Logstash node_stats metricset Memcached module Memcached stats metricset Cisco Meraki module Cisco Meraki device_health metricset MongoDB module MongoDB collstats metricset MongoDB dbstats metricset MongoDB metrics metricset MongoDB replstatus metricset MongoDB status metricset MSSQL module MSSQL performance metricset MSSQL transaction_log metricset Munin module Munin node metricset MySQL module MySQL galera_status metricset galera status MetricSet MySQL performance metricset MySQL query metricset MySQL status metricset NATS module NATS connection metricset NATS connections metricset NATS JetStream metricset NATS route metricset NATS routes metricset NATS stats metricset NATS subscriptions metricset Nginx module Nginx stubstatus metricset openai module openai usage metricset Openmetrics module Openmetrics collector metricset Oracle module Oracle performance metricset Oracle sysmetric metricset Oracle tablespace metricset Panw module Panw interfaces metricset Panw routing metricset Panw system metricset Panw vpn metricset PHP_FPM module PHP_FPM pool metricset PHP_FPM process metricset PostgreSQL module PostgreSQL activity metricset PostgreSQL bgwriter metricset PostgreSQL database metricset PostgreSQL statement metricset Prometheus module Prometheus collector metricset Prometheus query metricset Prometheus remote_write metricset RabbitMQ module RabbitMQ connection metricset RabbitMQ exchange metricset RabbitMQ node metricset RabbitMQ queue metricset RabbitMQ shovel metricset Redis module Redis info metricset Redis key metricset Redis keyspace metricset Redis Enterprise module Redis Enterprise node metricset Redis Enterprise proxy metricset SQL module Host Setup SQL query metricset Stan module Stan channels metricset Stan stats metricset Stan subscriptions metricset Statsd module Metricsets Statsd server metricset SyncGateway module SyncGateway db metricset SyncGateway memory metricset SyncGateway replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields 
Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest 
pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery 
autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups 
Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update 
ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter delete ecctl deployment traffic-filter list ecctl deployment traffic-filter show ecctl deployment traffic-filter update ecctl deployment update ecctl generate ecctl generate completions ecctl generate docs ecctl init ecctl platform ecctl platform allocator ecctl platform allocator list ecctl platform allocator maintenance ecctl platform allocator metadata ecctl platform allocator metadata delete ecctl platform allocator metadata set ecctl platform allocator metadata show ecctl platform allocator search ecctl platform allocator show ecctl platform allocator vacate ecctl platform constructor ecctl platform constructor list ecctl platform constructor maintenance ecctl platform constructor resync ecctl platform constructor show ecctl platform enrollment-token ecctl platform enrollment-token create ecctl platform enrollment-token delete ecctl platform enrollment-token list ecctl platform info ecctl platform instance-configuration ecctl platform instance-configuration create ecctl platform instance-configuration delete ecctl platform instance-configuration list ecctl platform instance-configuration pull ecctl platform instance-configuration show ecctl platform instance-configuration update ecctl platform proxy ecctl platform proxy filtered-group ecctl platform proxy filtered-group create ecctl platform proxy filtered-group delete ecctl platform proxy filtered-group list ecctl platform proxy filtered-group show ecctl platform proxy filtered-group update ecctl platform proxy list ecctl platform proxy settings ecctl platform proxy settings show ecctl platform proxy settings update ecctl platform proxy show ecctl platform repository ecctl platform repository create ecctl platform repository delete ecctl platform repository list ecctl platform repository show ecctl platform role ecctl platform role create ecctl platform role delete ecctl platform role list ecctl platform role show ecctl platform role update ecctl platform runner ecctl platform runner list ecctl platform runner resync ecctl platform runner search ecctl platform runner show ecctl stack ecctl stack delete ecctl stack list ecctl stack show ecctl stack upload ecctl user ecctl user create ecctl user delete ecctl user disable ecctl user enable ecctl user key ecctl user key delete ecctl user key list ecctl user key show ecctl user list ecctl user show ecctl user update ecctl version Contributing Release notes Glossary Loading 
Docs / Reference / Elasticsearch and index management / Command line tools / elasticsearch-setup-passwords Deprecated in 8.0. The elasticsearch-setup-passwords tool is deprecated and will be removed in a future release. To manually reset the password for the built-in users (including the elastic user), use the elasticsearch-reset-password tool, the Elasticsearch change password API, or the User Management features in Kibana. The elasticsearch-setup-passwords command sets the passwords for the built-in users. Synopsis bin/elasticsearch-setup-passwords auto|interactive [-b, --batch] [-h, --help] [-E <KeyValuePair>] [-s, --silent] [-u, --url \"<URL>\"] [-v, --verbose] Description This command is intended for use only during the initial configuration of the Elasticsearch security features. It uses the elastic bootstrap password to run user management API requests. If your Elasticsearch keystore is password protected, you must enter the keystore password before you can set the passwords for the built-in users. After you set a password for the elastic user, the bootstrap password is no longer active and you cannot use this command. Instead, you can change passwords by using the Management > Users UI in Kibana or the Change Password API. This command uses an HTTP connection to connect to the cluster and run the user management requests. If your cluster uses TLS/SSL on the HTTP layer, the command automatically attempts to establish the connection by using the HTTPS protocol. It configures the connection by using the xpack.security.http.ssl settings in the elasticsearch.yml file. If you do not use the default config directory location, ensure that the ES_PATH_CONF environment variable returns the correct path before you run the elasticsearch-setup-passwords command. You can override settings in your elasticsearch.yml file by using the -E command option. For more information about debugging connection failures, see Setup-passwords command fails due to connection failure. Parameters auto Outputs randomly-generated passwords to the console. -b, --batch If enabled, runs the change password process without prompting the user. -E <KeyValuePair> Configures a standard Elasticsearch or X-Pack setting. -h, --help Shows help information. interactive Prompts you to manually enter passwords. -s, --silent Shows minimal output. -u, --url \"<URL>\" Specifies the URL that the tool uses to submit the user management API requests. The default value is determined from the settings in your elasticsearch.yml file. If xpack.security.http.ssl.enabled is set to true, you must specify an HTTPS URL. -v, --verbose Shows verbose output. Examples The following example uses the -u parameter to tell the tool where to submit its user management API requests: bin/elasticsearch-setup-passwords auto -u \"http://localhost:9201\" 
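As an illustrative sketch only (the HTTPS URL and CA certificate path below are assumed values, not taken from this page), interactive mode can be combined with an -E override when the cluster uses a non-default TLS setup: bin/elasticsearch-setup-passwords interactive -u \"https://localhost:9200\" -E xpack.security.http.ssl.certificate_authorities=/path/to/ca.crt. Similarly, bin/elasticsearch-setup-passwords auto -b prints randomly generated passwords without prompting for confirmation, per the -b, --batch option described above. 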
","title":"elasticsearch-setup-passwords | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/setup-passwords","meta_description":"The elasticsearch-setup-passwords command sets the passwords for the built-in users. This command is intended for use only during the initial configuration..."} +{"text":"
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin 
Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
The elasticsearch-saml-metadata command can be used to generate a SAML 2.0 Service Provider Metadata file. Synopsis bin/elasticsearch-saml-metadata [--realm <name>] [--out <file_path>] [--batch] [--attribute <name>] [--service-name <name>] [--locale <locale>] [--contacts] ([--organisation-name <name>] [--organisation-display-name <name>] [--organisation-url <url>]) ([--signing-bundle <file_path>] | [--signing-cert <file_path>][--signing-key <file_path>]) [--signing-key-password <password>] [-E <KeyValuePair>] [-h, --help] ([-s, --silent] | [-v, --verbose]) Description The SAML 2.0 specification provides a mechanism for Service Providers to describe their capabilities and configuration using a metadata file. The elasticsearch-saml-metadata command generates such a file, based on the configuration of a SAML realm in Elasticsearch. Some SAML Identity Providers will allow you to automatically import a metadata file when you configure the Elastic Stack as a Service Provider. You can optionally choose to digitally sign the metadata file in order to ensure its integrity and authenticity before sharing it with the Identity Provider.
The key used for signing the metadata file need not be the same as the keys already used in the saml realm configuration for SAML message signing. If your Elasticsearch keystore is password protected, you are prompted to enter the password when you run the elasticsearch-saml-metadata command. Parameters --attribute <name> Specifies a SAML attribute that should be included as a <RequestedAttribute> element in the metadata. Any attribute configured in the Elasticsearch realm is automatically included and does not need to be specified as a command-line option. --batch Do not prompt for user input. --contacts Specifies that the metadata should include one or more <ContactPerson> elements. The user will be prompted to enter the details for each person. -E <KeyValuePair> Configures an Elasticsearch setting. -h, --help Returns all of the command parameters. --locale <language> Specifies the locale to use for metadata elements such as <ServiceName>. Defaults to the JVM’s default system locale. --organisation-display-name <name> Specifies the value of the <OrganizationDisplayName> element. Only valid if --organisation-name is also specified. --organisation-name <name> Specifies that an <Organization> element should be included in the metadata and provides the value for the <OrganizationName> element. If this is specified, then --organisation-url must also be specified. --organisation-url <url> Specifies the value of the <OrganizationURL> element. This is required if --organisation-name is specified. --out <file_path> Specifies a path for the output files. Defaults to saml-elasticsearch-metadata.xml. --service-name <name> Specifies the value for the <ServiceName> element in the metadata. Defaults to elasticsearch. --signing-bundle <file_path> Specifies the path to an existing key pair (in PKCS#12 format). The private key of that key pair will be used to sign the metadata file. --signing-cert <file_path> Specifies the path to an existing certificate (in PEM format) to be used for signing the metadata file. You must also specify the --signing-key parameter. This parameter cannot be used with the --signing-bundle parameter. --signing-key <file_path> Specifies the path to an existing key (in PEM format) to be used for signing the metadata file. You must also specify the --signing-cert parameter. This parameter cannot be used with the --signing-bundle parameter. --signing-key-password <password> Specifies the password for the signing key. It can be used with either the --signing-key or the --signing-bundle parameters. --realm <name> Specifies the name of the realm for which the metadata should be generated. This parameter is required if there is more than one saml realm in your Elasticsearch configuration. -s, --silent Shows minimal output. -v, --verbose Shows verbose output. Examples The following command generates a default metadata file for the saml1 realm: bin/elasticsearch-saml-metadata --realm saml1 The file will be written to saml-elasticsearch-metadata.xml. You may be prompted to provide the \"friendlyName\" value for any attributes that are used by the realm. The following command generates a metadata file for the saml2 realm, with a <ServiceName> of kibana-finance, a locale of en-GB, and includes <ContactPerson> elements and an <Organization> element: bin/elasticsearch-saml-metadata --realm saml2 \\ --service-name kibana-finance \\ --locale en-GB \\ --contacts \\ --organisation-name \"Mega Corp. Finance Team\" \\ --organisation-url \"http://mega.example.com/finance/\" 
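As a further sketch, a signed metadata file could be produced by combining the --signing-cert and --signing-key options described above with --out (an illustrative invocation only; the realm name and file paths are assumptions, not values taken from this page): bin/elasticsearch-saml-metadata --realm saml1 \\ --out signed-saml-metadata.xml \\ --signing-cert saml-metadata-sign.crt \\ --signing-key saml-metadata-sign.key 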
","title":"elasticsearch-saml-metadata | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/saml-metadata","meta_description":"The elasticsearch-saml-metadata command can be used to generate a SAML 2.0 Service Provider Metadata file. The SAML 2.0 specification provides a mechanism..."} +{"text":"
Elasticsearch realms exist primarily to support user authentication. Some realms authenticate users with a password (such as the native and ldap realms), and other realms use more complex authentication protocols (such as the saml and oidc realms). In each case, the primary purpose of the realm is to establish the identity of the user who has made a request to the Elasticsearch API. 
However, some Elasticsearch features need to look up a user without using their credentials. The run_as feature executes requests on behalf of another user. An authenticated user with run_as privileges can perform requests on behalf of another unauthenticated user. The delegated authorization feature links two realms together so that a user who authenticates against one realm can have the roles and metadata associated with a user from a different realm. In each of these cases, a user must first authenticate to one realm and then Elasticsearch will query the second realm to find another user. The authenticated user's credentials are used to authenticate in the first realm only; the user in the second realm is retrieved by username, without needing credentials. When Elasticsearch resolves a user using their credentials (as performed in the first realm), it is known as user authentication. When Elasticsearch resolves a user using the username only (as performed in the second realm), it is known as user lookup. See the run_as and delegated authorization documentation to learn more about these features, including which realms and authentication methods support run_as or delegated authorization. In both cases, only the following realms can be used for the user lookup: The reserved, native and file realms always support user lookup. The ldap realm supports user lookup when the realm is configured in user search mode. User lookup is not supported when the realm is configured with user_dn_templates. User lookup support in the active_directory realm requires that the realm be configured with a bind_dn and a bind password. The pki, saml, oidc, kerberos and jwt realms do not support user lookup. Note If you want to use a realm only for user lookup and prevent users from authenticating against that realm, you can configure the realm and set authentication.enabled to false. The user lookup feature is an internal capability that is used to implement the run_as and delegated authorization features; there are no APIs for user lookup. If you want to test your user lookup configuration, you can do this with run_as. Use the Authenticate API, authenticate as a superuser (e.g. the built-in elastic user) and specify the es-security-runas-user request header. Note The Get users API and User profiles feature are alternative ways to retrieve information about an Elastic Stack user. Those APIs are not related to the user lookup feature. 
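As a sketch of that test, you might call the Authenticate API as a superuser while impersonating another user through the run_as header (the jacknich username, the localhost address, and the elastic credentials are illustrative assumptions only): curl -u elastic -H \"es-security-runas-user: jacknich\" \"https://localhost:9200/_security/_authenticate?pretty\" If the lookup succeeds, the response describes the impersonated user rather than the authenticating superuser. 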
","title":"Looking up users without authentication | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/looking-up-users-without-authentication","meta_description":"Elasticsearch realms exist primarily to support user authentication. Some realms authenticate users with a password (such as the native and ldap realms),..."} +{"text":"
replication metricset SyncGateway resources metricset System module System core metricset System cpu metricset System diskio metricset System entropy metricset System filesystem metricset System fsstat metricset System load metricset System memory metricset System network metricset System network_summary metricset System process metricset System process_summary metricset System raid metricset System service metricset System socket metricset System socket_summary metricset System uptime metricset System users metricset Tomcat module Tomcat cache metricset Tomcat memory metricset Tomcat requests metricset Tomcat threading metricset Traefik module Traefik health metricset uWSGI module uWSGI status metricset vSphere module vSphere cluster metricset vSphere datastore metricset vSphere datastorecluster metricset vSphere host metricset vSphere network metricset vSphere resourcepool metricset vSphere virtualmachine metricset Windows module Windows perfmon metricset Windows service metricset Windows wmi metricset ZooKeeper module ZooKeeper connection metricset ZooKeeper mntr metricset ZooKeeper server metricset Exported fields ActiveMQ fields Aerospike fields Airflow fields Apache fields AWS fields AWS Fargate fields Azure fields Beat fields Beat fields Benchmark fields Ceph fields Cloud provider metadata fields Cloudfoundry fields CockroachDB fields Common fields Consul fields Containerd fields Coredns fields Couchbase fields CouchDB fields Docker fields Docker fields Dropwizard fields ECS fields Elasticsearch fields Envoyproxy fields Etcd fields Google Cloud Platform fields Golang fields Graphite fields HAProxy fields Host fields HTTP fields IBM MQ fields IIS fields Istio fields Jolokia fields Jolokia Discovery autodiscover provider fields Kafka fields Kibana fields Kubernetes fields Kubernetes fields KVM fields Linux fields Logstash fields Memcached fields MongoDB fields MSSQL fields Munin fields MySQL fields NATS fields Nginx fields openai fields Openmetrics fields Oracle fields Panw fields PHP_FPM fields PostgreSQL fields Process fields Prometheus fields Prometheus typed metrics fields RabbitMQ fields Redis fields Redis Enterprise fields SQL fields Stan fields Statsd fields SyncGateway fields System fields Tomcat fields Traefik fields uWSGI fields vSphere fields Windows fields ZooKeeper fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Troubleshoot Get help Debug Understand logged metrics Common problems open /compat/linux/proc: no such file or directory error on FreeBSD Metricbeat collects system metrics for interfaces you didn't configure Metricbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Contribute Packetbeat Quick start Set up and run Directory layout Secrets keystore Command reference Repositories for APT and YUM Run 
Packetbeat on Docker Packetbeat and systemd Start Packetbeat Stop Packetbeat Upgrade Packetbeat Configure Traffic sniffing Network flows Protocols Common protocol options ICMP DNS HTTP AMQP Cassandra Memcache MySQL PgSQL Thrift MongoDB TLS Redis Processes General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace syslog translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Protocol-Specific Metrics Instrumentation Feature flags packetbeat.reference.yml How to guides Load the Elasticsearch index template Change the index name Load Kibana dashboards Enrich events with geoIP information Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Exported fields AMQP fields Beat fields Cassandra fields Cloud provider metadata fields Common fields DHCPv4 fields DNS fields Docker fields ECS fields Flow Event fields Host fields HTTP fields ICMP fields Jolokia Discovery autodiscover provider fields Kubernetes fields Memcache fields MongoDb fields MySQL fields NFS fields PostgreSQL fields Process fields Raw fields Redis fields SIP fields Thrift-RPC fields Detailed TLS fields Transaction Event fields Measurements (Transactions) fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Use Linux Secure Computing Mode (seccomp) Visualize Packetbeat data in Kibana Customize the Discover page Kibana queries and filters Troubleshoot Get help Debug Understand logged metrics Record a trace Common problems Dashboard in Kibana is breaking up data fields incorrectly Packetbeat doesn't see any packets when using mirror ports Packetbeat Can't capture traffic from Windows loopback interface Packetbeat is missing long running transactions Packetbeat isn't capturing MySQL performance data Packetbeat uses too much bandwidth Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Fields show up as nested JSON in Kibana Contribute Winlogbeat Quick start Set up and run Directory layout Secrets keystore Command reference Start Winlogbeat Stop Winlogbeat Upgrade Configure 
Winlogbeat General settings Project paths Output Elastic Cloud Hosted Elasticsearch Logstash Kafka Redis File Console Discard Change the output codec Kerberos SSL Index lifecycle management (ILM) Elasticsearch index template Kibana endpoint Kibana dashboards Processors Define processors add_cloud_metadata add_cloudfoundry_metadata add_docker_metadata add_fields add_host_metadata add_id add_kubernetes_metadata add_labels add_locale add_network_direction add_nomad_metadata add_observer_metadata add_process_metadata add_tags append community_id convert copy_fields decode_base64_field decode_duration decode_json_fields decode_xml decode_xml_wineventlog decompress_gzip_field detect_mime_type dissect dns drop_event drop_fields extract_array fingerprint include_fields move_fields rate_limit registered_domain rename replace script syslog timestamp translate_ldap_attribute translate_sid truncate_fields urldecode Internal queue Logging HTTP endpoint Event Processing Metrics Instrumentation winlogbeat.reference.yml How to guides Enrich events with geoIP information Load the Elasticsearch index template Change the index name Load Kibana dashboards Load ingest pipelines Use environment variables in the configuration Parse data using an ingest pipeline Avoid YAML formatting problems Modules PowerShell Module Security Module Sysmon Module Exported fields Beat fields Cloud provider metadata fields Docker fields ECS fields Legacy Winlogbeat alias fields Host fields Jolokia Discovery autodiscover provider fields Kubernetes fields PowerShell module fields Process fields Security module fields Sysmon module fields Winlogbeat fields Monitor Use internal collection Settings for internal collection Use Metricbeat collection Secure Grant users access to secured resources Create a setup user Create a monitoring user Create a publishing user Create a reader user Learn more about privileges, roles, and users Grant access using API keys Secure communication with Elasticsearch Secure communication with Logstash Troubleshoot Get Help Debug Understand logged metrics Common problems Dashboard in Kibana is breaking up data fields incorrectly Bogus computer_name fields are reported in some events Error loading config file Found unexpected or unknown characters Logstash connection doesn't work Publishing to Logstash fails with \"connection reset by peer\" message @metadata is missing in Logstash Not sure whether to use Logstash or Beats SSL client fails to connect to Logstash Monitoring UI shows fewer Beats than expected Dashboard could not locate the index-pattern High RSS memory usage due to MADV settings Not sure how to read from .evtx files Contribute Upgrade Community Beats Contribute Elastic logging plugin for Docker Install and configure Configuration options Usage examples Known problems and limitations Processor reference Append Attachment Bytes Circle Community ID Convert CSV Date Date index name Dissect Dot expander Drop Enrich Fail Fingerprint Foreach Geo-grid GeoIP Grok Gsub HTML strip Inference IP Location Join JSON KV Lowercase Network direction Pipeline Redact Registered domain Remove Rename Reroute Script Set Set security user Sort Split Terminate Trim Uppercase URL decode URI parts User agent Logstash Getting started with Logstash Installing Logstash Stashing Your First Event Parsing Logs with Logstash Stitching Together Multiple Input and Output Plugins How Logstash Works Execution Model ECS in Logstash Processing Details Setting up and running Logstash Logstash Directory Layout Logstash Configuration 
Files logstash.yml Secrets keystore for secure settings Running Logstash from the Command Line Running Logstash as a Service on Debian or RPM Running Logstash on Docker Configuring Logstash for Docker Running Logstash on Kubernetes Running Logstash on Windows Logging Shutting Down Logstash Upgrading Logstash Upgrading using package managers Upgrading using a direct download Upgrading between minor versions Creating a Logstash Pipeline Structure of a pipeline Accessing event data and fields Using environment variables Sending data to Elastic Cloud (hosted Elasticsearch Service) Logstash configuration examples Secure your connection Advanced Logstash configurations Multiple Pipelines Pipeline-to-pipeline communication Reloading the Config File Managing Multiline Events Glob Pattern Support Logstash-to-Logstash communications Logstash-to-Logstash: Lumberjack output to Beats input Logstash-to-Logstash: HTTP output to HTTP input Logstash-to-Logstash: Output to Input Managing Logstash Centralized Pipeline Management Configure Centralized Pipeline Management Using Logstash with Elastic integrations Working with Filebeat modules Use ingest pipelines for parsing Example: Set up Filebeat modules to work with Kafka and Logstash Working with Winlogbeat modules Queues and data resiliency Memory queue Persistent queues (PQ) Dead letter queues (DLQ) Transforming data Performing Core Operations Deserializing Data Extracting Fields and Wrangling Data Enriching Data with Lookups Deploying and scaling Logstash Managing GeoIP databases GeoIP Database Management Configure GeoIP Database Management Performance tuning Performance troubleshooting Tuning and profiling logstash pipeline performance Monitoring Logstash with Elastic Agent Collect monitoring data for dashboards Collect monitoring data for dashboards (Serverless ) Collect monitoring data for stack monitoring Monitoring Logstash (Legacy) Metricbeat collection Legacy collection (deprecated) Monitoring UI Pipeline Viewer UI Troubleshooting Monitoring Logstash with APIs Working with plugins Cross-plugin concepts and features Generating plugins Offline Plugin Management Private Gem Repositories Event API Tips and best practices JVM settings Logstash Plugins Integration plugins aws elastic_enterprise_search jdbc kafka logstash rabbitmq snmp Input plugins azure_event_hubs beats cloudwatch couchdb_changes dead_letter_queue elastic_agent elastic_serverless_forwarder elasticsearch exec file ganglia gelf generator github google_cloud_storage google_pubsub graphite heartbeat http http_poller imap irc java_generator java_stdin jdbc jms jmx kafka kinesis logstash log4j lumberjack meetup pipe puppet_facter rabbitmq redis relp rss s3 s3-sns-sqs salesforce snmp snmptrap sqlite sqs stdin stomp syslog tcp twitter udp unix varnishlog websocket wmi xmpp Output plugins boundary circonus cloudwatch csv datadog datadog_metrics dynatrace elastic_app_search elastic_workplace_search elasticsearch email exec file ganglia gelf google_bigquery google_cloud_storage google_pubsub graphite graphtastic http influxdb irc java_stdout juggernaut kafka librato logstash loggly lumberjack metriccatcher mongodb nagios nagios_nsca opentsdb pagerduty pipe rabbitmq redis redmine riak riemann s3 sink sns solr_http sqs statsd stdout stomp syslog tcp timber udp webhdfs websocket xmpp zabbix Filter plugins age aggregate alter bytes cidr cipher clone csv date de_dot dissect dns drop elapsed elastic_integration elasticsearch environment extractnumbers fingerprint geoip grok http i18n java_uuid 
jdbc_static jdbc_streaming json json_encode kv memcached metricize metrics mutate prune range ruby sleep split syslog_pri threats_classifier throttle tld translate truncate urldecode useragent uuid wurfl_device_detection xml Codec plugins avro cef cloudfront cloudtrail collectd csv dots edn edn_lines es_bulk fluent graphite gzip_lines jdots java_line java_plain json json_lines line msgpack multiline netflow nmap plain protobuf rubydebug Plugin value types Logstash Versioned Plugin Reference Integration plugins aws v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 elastic_enterprise_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 Input plugins azure_event_hubs v1.5.1 v1.5.0 v1.4.9 v1.4.8 v1.4.7 v1.4.6 v1.4.5 v1.4.4 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.3 v1.1.2 v1.1.1 v1.1.0 v1.0.4 v1.0.3 v1.0.1 v1.0.0 beats v7.0.0 v6.9.1 v6.9.0 v6.8.4 v6.8.3 v6.8.2 v6.8.1 v6.8.0 v6.7.2 v6.7.1 v6.7.0 v6.6.4 v6.6.3 v6.6.2 v6.6.1 v6.6.0 v6.5.0 v6.4.4 v6.4.3 v6.4.1 v6.4.0 v6.3.1 v6.3.0 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.6 v6.1.5 v6.1.4 v6.1.3 v6.1.2 v6.1.1 v6.1.0 v6.0.14 v6.0.13 v6.0.12 v6.0.11 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.1.11 v5.1.10 v5.1.9 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.0 v5.0.16 v5.0.15 v5.0.14 v5.0.13 v5.0.11 v5.0.10 v5.0.9 v5.0.8 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v3.1.32 v3.1.31 v3.1.30 v3.1.29 v3.1.28 v3.1.27 v3.1.26 v3.1.25 v3.1.24 v3.1.23 v3.1.22 v3.1.21 v3.1.20 v3.1.19 v3.1.18 v3.1.17 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v2.2.4 v2.2.3 v2.2.2 v2.1.1 v2.1.0 v2.0.3 v2.0.2 v2.0.1 couchdb_changes v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 dead_letter_queue v2.0.0 v1.1.12 v1.1.11 v1.1.10 v1.1.9 v1.1.8 v1.1.7 v1.1.6 v1.1.5 v1.1.4 v1.1.2 v1.1.1 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 drupal_dblog v2.0.7 v2.0.6 v2.0.5 elastic_agent elastic_serverless_forwarder v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.1 v0.1.0 elasticsearch v5.0.0 v4.21.0 v4.20.5 v4.20.4 v4.20.3 v4.20.2 v4.20.1 v4.20.0 v4.19.1 v4.19.0 v4.18.0 v4.17.2 v4.17.1 v4.17.0 v4.16.0 v4.15.0 v4.14.0 v4.13.0 v4.12.3 v4.12.2 v4.12.1 v4.12.0 v4.11.0 v4.10.0 v4.9.3 v4.9.2 v4.9.1 v4.9.0 v4.8.1 v4.8.0 v4.7.1 v4.7.0 v4.6.2 v4.6.1 v4.6.0 v4.5.0 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 v4.1.0 v4.0.6 v4.0.5 v4.0.4 eventlog v4.1.3 v4.1.2 v4.1.1 exec v3.6.0 v3.4.0 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 file v4.4.6 v4.4.5 v4.4.4 v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.18 v4.1.17 v4.1.16 
v4.1.15 v4.1.14 v4.1.13 v4.1.12 v4.1.11 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.5 v4.0.3 v4.0.2 ganglia v3.1.4 v3.1.3 v3.1.2 v3.1.1 gelf v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 gemfire v2.0.7 v2.0.6 v2.0.5 generator v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 github v3.0.11 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 google_cloud_storage v0.15.0 v0.14.0 v0.13.0 v0.12.0 v0.11.1 v0.10.0 google_pubsub v1.4.0 v1.3.0 v1.2.2 v1.2.1 v1.2.0 v1.1.0 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.1 graphite v3.0.6 v3.0.4 v3.0.3 heartbeat v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 heroku v3.0.3 v3.0.2 v3.0.1 http v4.0.0 v3.9.2 v3.9.1 v3.9.0 v3.8.1 v3.8.0 v3.7.3 v3.7.2 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 http_poller v6.0.0 v5.6.0 v5.5.1 v5.5.0 v5.4.0 v5.3.1 v5.3.0 v5.2.1 v5.2.0 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 imap v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 irc v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 jdbc v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.19 v4.3.18 v4.3.17 v4.3.16 v4.3.14 v4.3.13 v4.3.12 v4.3.11 v4.3.9 v4.3.8 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.4 v4.2.3 v4.2.2 v4.2.1 jms v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 jmx v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 journald v2.0.2 v2.0.1 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v9.1.0 v9.0.1 v9.0.0 v8.3.1 v8.3.0 v8.2.1 v8.2.0 v8.1.1 v8.1.0 v8.0.6 v8.0.4 v8.0.2 v8.0.0 v7.0.0 v6.3.4 v6.3.3 v6.3.2 v6.3.0 kinesis v2.3.0 v2.2.2 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.11 v2.0.10 v2.0.8 v2.0.7 v2.0.6 v2.0.5 v2.0.4 log4j v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 meetup v3.1.1 v3.1.0 v3.0.4 v3.0.3 v3.0.2 v3.0.1 neo4j v2.0.8 v2.0.6 v2.0.5 pipe v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 puppet_facter v3.0.4 v3.0.3 v3.0.2 v3.0.1 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.5 v5.2.4 rackspace v3.0.5 v3.0.4 v3.0.1 redis v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.1 v3.5.0 v3.4.1 v3.4.0 v3.2.2 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 relp v3.0.4 v3.0.3 v3.0.2 v3.0.1 rss v3.0.5 v3.0.4 v3.0.3 v3.0.2 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.8.4 v3.8.3 v3.8.2 v3.8.1 v3.8.0 v3.7.0 v3.6.0 v3.5.0 v3.4.1 v3.4.0 v3.3.7 v3.3.6 v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.9 v3.1.8 v3.1.7 v3.1.6 v3.1.5 salesforce v3.2.1 v3.2.0 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 v3.0.2 snmp v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v1.3.3 v1.3.2 v1.3.1 v1.3.0 v1.2.8 v1.2.7 v1.2.6 v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 
v1.1.0 v1.0.1 v1.0.0 snmptrap v4.0.6 v4.0.5 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 sqlite v3.0.4 v3.0.3 v3.0.2 v3.0.1 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.3.2 v3.3.1 v3.3.0 v3.2.0 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 stdin v3.4.0 v3.3.0 v3.2.6 v3.2.5 v3.2.4 v3.2.3 stomp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 syslog v3.7.0 v3.6.0 v3.5.0 v3.4.5 v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 tcp v7.0.0 v6.4.4 v6.4.3 v6.4.2 v6.4.1 v6.4.0 v6.3.5 v6.3.4 v6.3.3 v6.3.2 v6.3.1 v6.3.0 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.1 v6.1.0 v6.0.10 v6.0.9 v6.0.8 v6.0.7 v6.0.6 v6.0.5 v6.0.4 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.2.7 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.0 v5.0.10 v5.0.9 v5.0.8 v5.0.7 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.2.4 v4.2.3 v4.2.2 v4.1.2 twitter v4.1.0 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 udp v3.5.0 v3.4.1 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.3 v3.1.2 v3.1.1 unix v3.1.2 v3.1.1 v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.4 varnishlog v3.0.4 v3.0.3 v3.0.2 v3.0.1 websocket v4.0.4 v4.0.3 v4.0.2 v4.0.1 wmi v3.0.4 v3.0.3 v3.0.2 v3.0.1 xmpp v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 zenoss v2.0.7 v2.0.6 v2.0.5 zeromq v3.0.5 v3.0.3 Output plugins appsearch v1.0.0.beta1 boundary v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 circonus v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 cloudwatch v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.1.0 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 csv v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 datadog v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.1 datadog_metrics v3.0.6 v3.0.5 v3.0.4 v3.0.2 v3.0.1 elastic_app_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 v2.0.0 v1.2.0 v1.1.1 v1.1.0 v1.0.0 elastic_workplace_search v3.0.1 v3.0.0 v2.2.1 v2.2.0 v2.1.2 v2.1.1 v2.1.0 elasticsearch v12.0.2 v12.0.1 v12.0.0 v11.22.12 v11.22.11 v11.22.10 v11.22.9 v11.22.8 v11.22.7 v11.22.6 v11.22.5 v11.22.4 v11.22.3 v11.22.2 v11.22.1 v11.22.0 v11.21.0 v11.20.1 v11.20.0 v11.19.0 v11.18.0 v11.17.0 v11.16.0 v11.15.9 v11.15.8 v11.15.7 v11.15.6 v11.15.5 v11.15.4 v11.15.2 v11.15.1 v11.15.0 v11.14.1 v11.14.0 v11.13.1 v11.13.0 v11.12.4 v11.12.3 v11.12.2 v11.12.1 v11.12.0 v11.11.0 v11.10.0 v11.9.3 v11.9.2 v11.9.1 v11.9.0 v11.8.0 v11.7.0 v11.6.0 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.3 v11.2.2 v11.2.1 v11.2.0 v11.1.0 v11.0.5 v11.0.4 v11.0.3 v11.0.2 v11.0.1 v11.0.0 v10.8.6 v10.8.4 v10.8.3 v10.8.2 v10.8.1 v10.8.0 v10.7.3 v10.7.0 v10.6.2 v10.6.1 v10.6.0 v10.5.1 v10.5.0 v10.4.2 v10.4.1 v10.4.0 v10.3.3 v10.3.2 v10.3.1 v10.3.0 v10.2.3 v10.2.2 v10.2.1 v10.2.0 v10.1.0 v10.0.2 v10.0.1 v9.4.0 v9.3.2 v9.3.1 v9.3.0 v9.2.4 v9.2.3 v9.2.1 v9.2.0 v9.1.4 v9.1.3 v9.1.2 v9.1.1 v9.0.3 v9.0.2 v9.0.0 v8.2.2 v8.2.0 v8.1.1 v8.0.1 v8.0.0 v7.4.3 v7.4.2 v7.4.1 v7.4.0 v7.3.8 v7.3.7 v7.3.6 v7.3.5 v7.3.4 v7.3.3 v7.3.2 elasticsearch_java v2.1.6 v2.1.4 email v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.6 v4.0.4 exec v3.1.4 v3.1.3 v3.1.2 v3.1.1 file v4.3.0 v4.2.6 v4.2.5 v4.2.4 v4.2.3 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 ganglia v3.0.6 v3.0.5 v3.0.4 v3.0.3 gelf v3.1.7 v3.1.4 v3.1.3 gemfire v2.0.7 v2.0.6 v2.0.5 google_bigquery v4.6.0 v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.1 v4.0.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 google_cloud_storage v4.5.0 v4.4.0 v4.3.0 v4.2.0 v4.1.0 v4.0.1 v4.0.0 v3.3.0 v3.2.1 v3.2.0 v3.1.0 v3.0.5 v3.0.4 v3.0.3 google_pubsub v1.2.0 v1.1.0 v1.0.2 
v1.0.1 v1.0.0 graphite v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphtastic v3.0.4 v3.0.3 v3.0.2 v3.0.1 hipchat v4.0.6 v4.0.5 v4.0.3 http v6.0.0 v5.7.1 v5.7.0 v5.6.1 v5.6.0 v5.5.0 v5.4.1 v5.4.0 v5.3.0 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.2 v5.1.1 v5.1.0 v5.0.1 v5.0.0 v4.4.0 v4.3.4 v4.3.2 v4.3.1 v4.3.0 influxdb v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 irc v3.0.6 v3.0.5 v3.0.4 v3.0.3 jira v3.0.5 v3.0.4 v3.0.3 v3.0.2 jms v3.0.5 v3.0.3 v3.0.1 juggernaut v3.0.6 v3.0.5 v3.0.4 v3.0.3 kafka v11.6.0 v11.5.4 v11.5.3 v11.5.2 v11.5.1 v11.5.0 v11.4.2 v11.4.1 v11.4.0 v11.3.4 v11.3.3 v11.3.2 v11.3.1 v11.3.0 v11.2.1 v11.2.0 v11.1.0 v11.0.0 v10.12.1 v10.12.0 v10.11.0 v10.10.0 v10.9.0 v10.8.2 v10.8.1 v10.8.0 v10.7.7 v10.7.6 v10.7.5 v10.7.4 v10.7.3 v10.7.2 v10.7.1 v10.7.0 v10.6.0 v10.5.3 v10.5.2 v10.5.1 v10.5.0 v10.4.0 v10.3.0 v10.2.0 v10.1.0 v10.0.1 v10.0.0 v8.1.0 v8.0.2 v8.0.1 v8.0.0 v7.3.2 v7.3.1 v7.3.0 v7.2.1 v7.2.0 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.10 v7.0.8 v7.0.7 v7.0.6 v7.0.4 v7.0.3 v7.0.1 v7.0.0 v6.2.4 v6.2.2 v6.2.1 v6.2.0 librato v3.0.6 v3.0.5 v3.0.4 v3.0.2 loggly v6.0.0 v5.0.0 v4.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 logstash v1.0.3 v1.0.2 v1.0.1 v1.0.0 v0.0.5 v0.0.4 v0.0.3 v0.0.2 v0.0.1 lumberjack v3.1.9 v3.1.8 v3.1.7 v3.1.5 v3.1.3 metriccatcher v3.0.4 v3.0.3 v3.0.2 v3.0.1 monasca_log_api v2.0.1 v2.0.0 v1.0.4 v1.0.3 v1.0.2 mongodb v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 nagios v3.0.6 v3.0.5 v3.0.4 v3.0.3 nagios_nsca v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 neo4j v2.0.5 null v3.0.5 v3.0.4 v3.0.3 opentsdb v3.1.5 v3.1.4 v3.1.3 v3.1.2 pagerduty v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 pipe v3.0.6 v3.0.5 v3.0.4 v3.0.3 rabbitmq v7.4.0 v7.3.3 v7.3.2 v7.3.1 v7.3.0 v7.2.0 v7.1.1 v7.1.0 v7.0.3 v7.0.2 v7.0.1 v7.0.0 v5.1.1 v5.1.0 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.11 v4.0.10 v4.0.9 v4.0.8 rackspace v2.0.8 v2.0.7 v2.0.5 redis v5.2.0 v5.0.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.5 v3.0.4 redmine v3.0.4 v3.0.3 v3.0.2 v3.0.1 riak v3.0.4 v3.0.3 v3.0.2 v3.0.1 riemann v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 v3.0.2 v3.0.1 s3 v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.4.1 v4.4.0 v4.3.7 v4.3.6 v4.3.5 v4.3.4 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.0 v4.1.10 v4.1.9 v4.1.8 v4.1.7 v4.1.6 v4.1.5 v4.1.4 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.13 v4.0.12 v4.0.11 v4.0.10 v4.0.9 v4.0.8 slack v2.2.0 v2.1.1 v2.1.0 v2.0.3 sns v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v4.0.8 v4.0.7 v4.0.6 v4.0.5 v4.0.4 solr_http v3.0.5 v3.0.4 v3.0.3 v3.0.2 sqs v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.0 v5.1.2 v5.1.1 v5.1.0 v5.0.2 v5.0.1 v5.0.0 v4.0.3 v4.0.2 statsd v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 stdout v3.1.4 v3.1.3 v3.1.2 v3.1.1 stomp v3.0.9 v3.0.8 v3.0.7 v3.0.5 syslog v3.0.5 v3.0.4 v3.0.3 v3.0.2 tcp v7.0.0 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.0.2 v4.0.1 timber v1.0.3 udp v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 webhdfs v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 websocket v3.1.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 xmpp v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 zabbix v3.0.5 v3.0.4 v3.0.3 v3.0.2 zeromq v3.1.3 v3.1.2 v3.1.1 Filter plugins age v1.0.3 v1.0.2 v1.0.1 aggregate v2.10.0 v2.9.2 v2.9.1 v2.9.0 v2.8.0 v2.7.2 v2.7.1 v2.7.0 v2.6.4 v2.6.3 v2.6.1 v2.6.0 alter v3.0.3 v3.0.2 v3.0.1 anonymize v3.0.7 v3.0.6 v3.0.5 v3.0.4 bytes v1.0.3 v1.0.2 v1.0.1 v1.0.0 checksum v3.0.4 v3.0.3 cidr v3.1.3 v3.1.2 v3.1.1 v3.0.1 cipher v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.0.1 v3.0.0 v2.0.7 v2.0.6 clone v4.2.0 v4.1.1 
v4.1.0 v4.0.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 collate v2.0.6 v2.0.5 csv v3.1.1 v3.1.0 v3.0.10 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 date v3.1.15 v3.1.14 v3.1.13 v3.1.12 v3.1.11 v3.1.9 v3.1.8 v3.1.7 de_dot v1.1.0 v1.0.4 v1.0.3 v1.0.2 v1.0.1 dissect v1.2.5 v1.2.4 v1.2.3 v1.2.2 v1.2.1 v1.2.0 v1.1.4 v1.1.2 v1.1.1 v1.0.12 v1.0.11 v1.0.9 dns v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.14 v3.0.13 v3.0.12 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 drop v3.0.5 v3.0.4 v3.0.3 elapsed v4.1.0 v4.0.5 v4.0.4 v4.0.3 v4.0.2 elastic_integration v8.17.1 v8.17.0 v8.16.1 v8.16.0 v0.1.17 v0.1.16 v0.1.15 v0.1.14 v0.1.13 v0.1.12 v0.1.11 v0.1.10 v0.1.9 v0.1.8 v0.1.7 v0.1.6 v0.1.5 v0.1.4 v0.1.3 v0.1.2 v0.1.0 v0.0.3 v0.0.2 v0.0.1 elasticsearch v4.1.0 v4.0.0 v3.16.2 v3.16.1 v3.16.0 v3.15.3 v3.15.2 v3.15.1 v3.15.0 v3.14.0 v3.13.0 v3.12.0 v3.11.1 v3.11.0 v3.10.0 v3.9.5 v3.9.4 v3.9.3 v3.9.0 v3.8.0 v3.7.1 v3.7.0 v3.6.1 v3.6.0 v3.5.0 v3.4.0 v3.3.1 v3.3.0 v3.2.1 v3.2.0 v3.1.6 v3.1.5 v3.1.4 v3.1.3 emoji v1.0.2 v1.0.1 environment v3.0.3 v3.0.2 v3.0.1 extractnumbers v3.0.3 v3.0.2 v3.0.1 fingerprint v3.4.4 v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.2 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.2 v3.1.1 v3.1.0 v3.0.4 geoip v7.3.1 v7.3.0 v7.2.13 v7.2.12 v7.2.11 v7.2.10 v7.2.9 v7.2.8 v7.2.7 v7.2.6 v7.2.5 v7.2.4 v7.2.3 v7.2.2 v7.2.1 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v6.0.5 v6.0.3 v6.0.2 v6.0.1 v6.0.0 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.1 grok v4.4.3 v4.4.2 v4.4.1 v4.4.0 v4.3.0 v4.2.0 v4.1.1 v4.1.0 v4.0.4 v4.0.3 v4.0.2 v4.0.1 v4.0.0 v3.4.4 v3.4.3 v3.4.2 v3.4.1 hashid v0.1.4 v0.1.3 v0.1.2 http v2.0.0 v1.6.0 v1.5.1 v1.5.0 v1.4.3 v1.4.2 v1.4.1 v1.4.0 v1.3.0 v1.2.1 v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.0 i18n v3.0.3 v3.0.2 v3.0.1 jdbc_static v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.1.0 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 v1.0.0 jdbc_streaming v5.5.1 v5.5.0 v5.4.11 v5.4.10 v5.4.9 v5.4.8 v5.4.7 v5.4.6 v5.4.5 v5.4.4 v5.4.3 v5.4.2 v5.4.1 v5.4.0 v5.3.0 v5.2.6 v5.2.5 v5.2.4 v5.2.3 v5.2.2 v5.2.1 v5.2.0 v5.1.10 v5.1.8 v5.1.7 v5.1.6 v5.1.5 v5.1.4 v5.1.3 v5.1.2 v5.1.1 v5.1.0 v5.0.7 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v1.0.10 v1.0.9 v1.0.7 v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 v1.0.1 json v3.2.1 v3.2.0 v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 json_encode v3.0.3 v3.0.2 v3.0.1 kv v4.7.0 v4.6.0 v4.5.0 v4.4.1 v4.4.0 v4.3.3 v4.3.2 v4.3.1 v4.3.0 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.3 v4.0.2 v4.0.1 math v1.1.1 v1.1.0 memcached v1.2.0 v1.1.0 v1.0.2 v1.0.1 v1.0.0 v0.1.2 v0.1.1 v0.1.0 metaevent v2.0.7 v2.0.5 metricize v3.0.3 v3.0.2 v3.0.1 metrics v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 multiline v3.0.4 v3.0.3 mutate v3.5.7 v3.5.6 v3.5.5 v3.5.4 v3.5.3 v3.5.2 v3.5.1 v3.5.0 v3.4.0 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.0 v3.1.7 v3.1.6 v3.1.5 oui v3.0.2 v3.0.1 prune v3.0.4 v3.0.3 v3.0.2 v3.0.1 punct v2.0.6 v2.0.5 range v3.0.3 v3.0.2 v3.0.1 ruby v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.4 v3.0.3 sleep v3.0.7 v3.0.6 v3.0.5 v3.0.4 split v3.1.8 v3.1.7 v3.1.6 v3.1.5 v3.1.4 v3.1.3 v3.1.2 syslog_pri v3.2.1 v3.2.0 v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 throttle v4.0.4 v4.0.3 v4.0.2 tld v3.1.3 v3.1.2 v3.1.1 v3.1.0 v3.0.3 v3.0.2 v3.0.1 translate v3.4.2 v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.0 v3.0.4 
v3.0.3 v3.0.2 truncate v1.0.6 v1.0.5 v1.0.4 v1.0.3 v1.0.2 unique v3.0.0 v2.0.6 v2.0.5 urldecode v3.0.6 v3.0.5 v3.0.4 useragent v3.3.5 v3.3.4 v3.3.3 v3.3.2 v3.3.1 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 v3.1.3 v3.1.1 v3.1.0 uuid v3.0.5 v3.0.4 v3.0.3 xml v4.2.1 v4.2.0 v4.1.3 v4.1.2 v4.1.1 v4.1.0 v4.0.7 v4.0.6 v4.0.5 v4.0.4 v4.0.3 yaml v1.0.0 v0.1.1 zeromq v3.0.2 v3.0.1 Codec plugins avro v3.4.1 v3.4.0 v3.3.1 v3.3.0 v3.2.4 v3.2.3 v3.2.2 v3.2.1 v3.2.0 cef v6.2.8 v6.2.7 v6.2.6 v6.2.5 v6.2.4 v6.2.3 v6.2.2 v6.2.1 v6.2.0 v6.1.2 v6.1.1 v6.1.0 v6.0.1 v6.0.0 v5.0.6 v5.0.5 v5.0.4 v5.0.3 v5.0.2 v5.0.1 v5.0.0 v4.1.4 v4.1.3 cloudfront v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.3 v3.0.2 v3.0.1 cloudtrail v7.1.8 v7.1.7 v7.1.6 v7.1.5 v7.1.4 v7.1.3 v7.1.2 v7.1.1 v7.1.0 v7.0.1 v7.0.0 v3.0.5 v3.0.4 v3.0.3 v3.0.2 collectd v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 compress_spooler v2.0.6 v2.0.5 csv v1.1.0 v1.0.0 v0.1.4 v0.1.3 dots v3.0.6 v3.0.5 v3.0.3 edn v3.1.0 v3.0.6 v3.0.5 v3.0.3 edn_lines v3.1.0 v3.0.6 v3.0.5 v3.0.3 es_bulk v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 fluent v3.4.3 v3.4.2 v3.4.1 v3.4.0 v3.3.0 v3.2.0 v3.1.5 v3.1.4 v3.1.3 v3.1.2 graphite v3.0.6 v3.0.5 v3.0.4 v3.0.3 gzip_lines v3.0.4 v3.0.3 v3.0.2 v3.0.1 v3.0.0 json v3.1.1 v3.1.0 v3.0.5 v3.0.4 v3.0.3 json_lines v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 line v3.1.1 v3.1.0 v3.0.8 v3.0.7 v3.0.6 v3.0.5 v3.0.4 v3.0.3 msgpack v3.1.0 v3.0.7 v3.0.6 v3.0.5 v3.0.3 multiline v3.1.2 v3.1.1 v3.1.0 v3.0.11 v3.0.10 v3.0.9 v3.0.8 v3.0.7 v3.0.6 v3.0.5 netflow v4.3.2 v4.3.1 v4.3.0 v4.2.2 v4.2.1 v4.2.0 v4.1.2 v4.1.1 v4.1.0 v4.0.2 v4.0.1 v4.0.0 v3.14.1 v3.14.0 v3.13.2 v3.13.1 v3.13.0 v3.12.0 v3.11.4 v3.11.3 v3.11.2 v3.11.1 v3.11.0 v3.10.0 v3.9.1 v3.9.0 v3.8.3 v3.8.1 v3.8.0 v3.7.1 v3.7.0 v3.6.0 v3.5.2 v3.5.1 v3.5.0 v3.4.1 nmap v0.0.21 v0.0.20 v0.0.19 oldlogstashjson v2.0.7 v2.0.5 plain v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 protobuf v1.3.0 v1.2.9 v1.2.8 v1.2.5 v1.2.2 v1.2.1 v1.1.0 v1.0.5 v1.0.3 v1.0.2 rubydebug v3.1.0 v3.0.6 v3.0.5 v3.0.4 v3.0.3 s3plain v2.0.7 v2.0.6 v2.0.5 Elastic Serverless Forwarder for AWS Deploy serverless forwarder Configuration options Search connectors Connectors references Azure Blob Storage Box Confluence Dropbox GitHub Gmail Google Cloud Storage Google Drive GraphQL Jira Microsoft SQL MongoDB MySQL Network drive Notion OneDrive OpenText Documentum Oracle Outlook PostgreSQL Redis S3 Salesforce ServiceNow SharePoint Online SharePoint Server Slack Teams Zoom Self-managed connectors Running from a Docker container Running from the source code Docker Compose quickstart Tutorial Elastic managed connectors Build and customize connectors Connectors UI Connector APIs API tutorial Content syncs Extract and transform Content extraction Sync rules Document level security How DLS works DLS in Search Applications Management topics Scalability Security Troubleshooting Logs Use cases Internal knowledge search Known issues Release notes Elasticsearch for Apache Hadoop Setup and requirements Key features Requirements Installation Reference Architecture Configuration Runtime options Security Logging Map/Reduce integration Apache Hive integration Apache Spark support Mapping and types Error handlers Kerberos Hadoop metrics Performance considerations Cloud or restricted environments Resources License Elastic integrations Integrations quick reference 1Password Abnormal Security ActiveMQ Active Directory Entity Analytics Admin By Request EPM integration Airflow Akamai Apache Apache HTTP Server Apache Spark Apache Tomcat Tomcat NetWitness 
Logs API (custom) Arista NG Firewall Atlassian Atlassian Bitbucket Atlassian Confluence Atlassian Jira Auditd Auditd Logs Auditd Manager Auth0 authentik AWS Amazon CloudFront Amazon DynamoDB Amazon EBS Amazon EC2 Amazon ECS Amazon EMR AWS API Gateway Amazon GuardDuty AWS Health Amazon Kinesis Data Firehose Amazon Kinesis Data Stream Amazon MQ Amazon Managed Streaming for Apache Kafka (MSK) Amazon NAT Gateway Amazon RDS Amazon Redshift Amazon S3 Amazon S3 Storage Lens Amazon Security Lake Amazon SNS Amazon SQS Amazon VPC Amazon VPN AWS Bedrock AWS Billing AWS CloudTrail AWS CloudWatch AWS ELB AWS Fargate AWS Inspector AWS Lambda AWS Logs (custom) AWS Network Firewall AWS Route 53 AWS Security Hub AWS Transit Gateway AWS Usage AWS WAF Azure Activity logs App Service Application Gateway Application Insights metrics Application Insights metrics overview Application Insights metrics Application State Insights metrics Application State Insights metrics Azure logs (v2 preview) Azure OpenAI Billing metrics Container instance metrics Container registry metrics Container service metrics Custom Azure Logs Custom Blob Storage Input Database Account metrics Event Hub input Firewall logs Frontdoor Functions Microsoft Entra ID Monitor metrics Network Watcher VNet Network Watcher NSG Platform logs Resource metrics Virtual machines scaleset metrics Monitor metrics Container instance metrics Container service metrics Storage Account metrics Container registry metrics Virtual machines metrics Database Account metrics Spring Cloud logs Storage Account metrics Virtual machines metrics Virtual machines scaleset metrics Barracuda Barracuda WAF CloudGen Firewall logs BeyondInsight and Password Safe Integration BeyondTrust PRA BitDefender Bitwarden blacklens.io BBOT (Bighuge BLS OSINT Tool) Box Events Bravura Monitor Broadcom ProxySG Canva Cassandra CEL Custom API Ceph Check Point Check Point Email Check Point Harmony Endpoint Cilium Tetragon CISA Known Exploited Vulnerabilities Cisco Aironet ASA Duo FTD IOS ISE Meraki Nexus Secure Email Gateway Secure Endpoint Umbrella Cisco Meraki Metrics Citrix ADC Web App Firewall Claroty CTD Claroty xDome Cloudflare Cloudflare Cloudflare Logpush Cloud Asset Inventory CockroachDB Metrics Common Event Format (CEF) Containerd CoreDNS Corelight Couchbase CouchDB Cribl CrowdStrike CrowdStrike CrowdStrike Falcon Intelligence Cyberark CyberArk EPM Privileged Access Security Privileged Threat Analytics Cybereason CylanceProtect Logs Custom Websocket logs Darktrace Data Exfiltration Detection DGA Digital Guardian Docker DomainTools Real Time Unified Feeds Elastic APM Elastic Fleet Server Elastic Security Elastic Defend Defend for Containers Prebuilt Security Detection Rules Security Posture Management Kubernetes Security Posture Management (KSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Cloud Native Vulnerability Management (CNVM) Cloud Security Posture Management (CSPM) Kubernetes Security Posture Management (KSPM) Threat intelligence utilities Elastic Stack monitoring Beats Elasticsearch Elastic Agent Elastic Package Registry Kibana Logstash Elasticsearch Service Billing Endace Envoy Proxy ESET PROTECT ESET Threat Intelligence etcd Falco F5 BIG-IP File Integrity Monitoring Filestream (custom) FireEye Network Security First EPSS Forcepoint Web Security ForgeRock Fortinet FortiEDR Logs FortiGate Firewall Logs FortiMail FortiManager Logs Fortinet FortiProxy Gigamon GitHub GitLab Golang Google Google Santa Google SecOps Google Workspace 
Google Cloud Custom GCS Input GCP GCP Compute metrics GCP VPC Flow logs GCP Load Balancing metrics GCP Billing metrics GCP Redis metrics GCP DNS logs GCP Cloud Run metrics GCP PubSub metrics GCP Dataproc metrics GCP CloudSQL metrics GCP Audit logs GCP Storage metrics GCP Firewall logs GCP GKE metrics GCP Firestore metrics GCP Audit logs GCP Billing metrics GCP Cloud Run metrics GCP CloudSQL metrics GCP Compute metrics GCP Dataproc metrics GCP DNS logs GCP Firestore metrics GCP Firewall logs GCP GKE metrics GCP Load Balancing metrics GCP Metrics Input GCP PubSub logs (custom) GCP PubSub metrics GCP Redis metrics GCP Security Command Center GCP Storage metrics GCP VPC Flow logs GCP Vertex AI GoFlow2 logs Hadoop HAProxy Hashicorp Vault Host Traffic Anomalies HPE Aruba CX HTTP Endpoint logs (custom) IBM MQ IIS Imperva Imperva Cloud WAF Imperva SecureSphere Logs InfluxDb Infoblox BloxOne DDI NIOS Iptables Istio Jamf Compliance Reporter Jamf Pro Jamf Protect Jolokia Input Journald logs (custom) JumpCloud Kafka Kafka Kafka Logs (custom) Keycloak Kubernetes Kubernetes Container logs Controller Manager metrics Scheduler metrics Audit logs Proxy metrics API Server metrics Kube-state metrics Event metrics Kubelet metrics API Server metrics Audit logs Container logs Controller Manager metrics Event metrics Kube-state metrics Kubelet metrics OpenTelemetry Assets Proxy metrics Scheduler metrics LastPass Lateral Movement Detection Linux Metrics Living off the Land Attack Detection Logs (custom) Lumos Lyve Cloud macOS Unified Logs (custom) Mattermost Memcached Menlo Security Microsoft Microsoft 365 Microsoft Defender for Cloud Microsoft Defender for Endpoint Microsoft DHCP Microsoft DNS Server Microsoft Entra ID Entity Analytics Microsoft Exchange Online Message Trace Microsoft Exchange Server Microsoft Graph Activity Logs Microsoft M365 Defender Microsoft Office 365 Metrics Integration Microsoft Sentinel Microsoft SQL Server Mimecast Miniflux integration ModSecurity Audit MongoDB MongoDB Atlas MySQL MySQL MySQL Enterprise Nagios XI NATS NetFlow Records Netskope Network Beaconing Identification Network Packet Capture Nginx Nginx Nginx Ingress Controller Logs Nginx Ingress Controller OpenTelemetry Logs Nvidia GPU Monitoring Okta Okta Okta Entity Analytics Oracle Oracle Oracle WebLogic OpenAI OpenCanary Osquery Osquery Logs Osquery Manager Palo Alto Cortex XDR Networks Metrics Next-Gen Firewall Prisma Cloud Prisma Access pfSense PHP-FPM PingOne PingFederate Pleasant Password Server PostgreSQL Privileged Access Detection Prometheus Prometheus Promethues Input Proofpoint Proofpoint TAP Proofpoint On Demand Proofpoint Insider Threat Management (ITM) Pulse Connect Secure Qualys VMDR QNAP NAS RabbitMQ Logs Rapid7 Rapid7 InsightVM Rapid7 Threat Command Redis Redis Redis Enterprise Rubrik RSC Metrics Integration Sailpoint Identity Security Cloud Salesforce SentinelOne SentinelOne SentinelOne Cloud Funnel ServiceNow Slack Logs Snort Snyk SonicWall Firewall Sophos Sophos Sophos Central Spring Boot Splunk SpyCloud Enterprise Protection SQL Input Squid Logs SRX STAN Statsd Input Sublime Security Suricata StormShield SNS Symantec Endpoint Protection Symantec Endpoint Security Sysmon for Linux Sysdig Syslog Router Integration System System Audit Tanium TCP Logs (custom) Teleport Tenable Tenable.io Tenable.sc Threat intelligence AbuseCH AlienVault OTX Anomali Collective Intelligence Framework Custom Threat Intelligence Cybersixgill EclecticIQ Maltiverse Mandiant Advantage MISP OpenCTI Recorded Future ThreatQuotient 
ThreatConnect Threat Map Thycotic Secret Server Tines Traefik Trellix Trellix EDR Cloud Trellix ePO Cloud Trend Micro Trend Micro Vision One TYCHON Agentless UDP Logs (custom) Universal Profiling Universal Profiling Agent Universal Profiling Collector Universal Profiling Symbolizer Varonis integration Vectra Detect Vectra RUX VMware Carbon Black Cloud Carbon Black EDR vSphere WatchGuard Firebox WebSphere Application Server Windows Windows Custom Windows ETW logs Windows Event Logs (custom) Wiz Zeek ZeroFox Zero Networks ZooKeeper Metrics Zoom Zscaler Zscaler Internet Access Zscaler Private Access Supported Serverless project types Level of support Kibana Kibana accessibility statement Configuration Elastic Cloud Kibana settings General settings AI Assistant settings Alerting and action settings APM settings in Kibana Banners settings Cases settings Fleet settings i18n settings Logging settings Logs settings Map settings Metrics settings Monitoring settings Reporting settings Search sessions settings Security settings Spaces settings Task Manager settings Telemetry settings URL drilldown settings Advanced settings Kibana audit events Connectors Amazon Bedrock Cases CrowdStrike D3 Security Elastic Managed LLM Email Google Gemini IBM Resilient Index Jira Microsoft Defender for Endpoint Microsoft Teams Observability AI Assistant OpenAI Opsgenie PagerDuty SentinelOne Server log ServiceNow ITSM ServiceNow SecOps ServiceNow ITOM Swimlane Slack TheHive Tines Torq Webhook Webhook - Case Management xMatters Preconfigured connectors Kibana plugins Command line tools kibana-encryption-keys kibana-verification-code Osquery exported fields Osquery Manager prebuilt packs Elasticsearch plugins Plugin management Installing plugins Custom URL or file system Installing multiple plugins Mandatory plugins Listing, removing and updating installed plugins Other command line parameters Plugins directory Manage plugins using a configuration file Upload custom plugins and bundles Managing plugins and extensions through the API API extension plugins Analysis plugins ICU analysis plugin ICU analyzer ICU normalization character filter ICU tokenizer ICU normalization token filter ICU folding token filter ICU collation token filter ICU collation keyword field ICU transform token filter Japanese (kuromoji) analysis plugin kuromoji analyzer kuromoji_iteration_mark character filter kuromoji_tokenizer kuromoji_baseform token filter kuromoji_part_of_speech token filter kuromoji_readingform token filter kuromoji_stemmer token filter ja_stop token filter kuromoji_number token filter hiragana_uppercase token filter katakana_uppercase token filter kuromoji_completion token filter Korean (nori) analysis plugin nori analyzer nori_tokenizer nori_part_of_speech token filter nori_readingform token filter nori_number token filter Phonetic analysis plugin phonetic token filter Smart Chinese analysis plugin Reimplementing and extending the analyzers smartcn_stop token filter Stempel Polish analysis plugin Reimplementing and extending the analyzers polish_stop token filter Ukrainian analysis plugin Discovery plugins EC2 Discovery plugin Using the EC2 discovery plugin Best Practices in AWS Azure Classic discovery plugin Azure Virtual Machine discovery Setup process for Azure Discovery Scaling out GCE Discovery plugin GCE Virtual Machine discovery GCE Network Host Setting up GCE Discovery Cloning your existing machine Using GCE zones Filtering by tags Changing default transport port GCE Tips Testing GCE Mapper plugins Mapper size plugin 
Using the _size field Mapper murmur3 plugin Using the murmur3 field Mapper annotated text plugin Using the annotated-text field Data modelling tips Using the annotated highlighter Limitations Snapshot/restore repository plugins Hadoop HDFS repository plugin Getting started with HDFS Configuration properties Hadoop security Store plugins Store SMB plugin Working around a bug in Windows SMB and Java on windows Integrations Query languages QueryDSL Query and filter context Compound queries Boolean Boosting Constant score Disjunction max Function score Full text queries Intervals Match Match boolean prefix Match phrase Match phrase prefix Combined fields Multi-match Query string Simple query string Geo queries Geo-bounding box Geo-distance Geo-grid Geo-polygon Geoshape Shape queries Shape Joining queries Nested Has child Has parent Parent ID Match all Span queries Span containing Span field masking Span first Span multi-term Span near Span not Span or Span term Span within Vector queries Knn Sparse vector Semantic Text expansion Weighted tokens Specialized queries Distance feature more_like_this Percolate Rank feature Script Script score Wrapper Pinned query Rule Term-level queries Exists Fuzzy IDs Prefix Range Regexp Term Terms Terms set Wildcard minimum_should_match parameter rewrite parameter Regular expression syntax ES|QL Syntax reference Basic syntax Commands Source commands Processing commands Functions and operators Aggregation functions Grouping functions Conditional functions and expressions Date-time functions IP functions Math functions Search functions Spatial functions String functions Type conversion functions Multivalue functions Operators Advanced workflows Extract data with DISSECT and GROK Combine data with ENRICH Join data with LOOKUP JOIN Types and fields Implicit casting Time spans Metadata fields Multivalued fields Limitations Examples SQL SQL language Lexical structure SQL commands DESCRIBE TABLE SELECT SHOW CATALOGS SHOW COLUMNS SHOW FUNCTIONS SHOW TABLES Data types Index patterns Frozen indices Functions and operators Comparison operators Logical operators Math operators Cast operators LIKE and RLIKE operators Aggregate functions Grouping functions Date/time and interval functions and operators Full-text search functions Mathematical functions String functions Type conversion functions Geo functions Conditional functions and expressions System functions Reserved keywords SQL limitations EQL Syntax reference Function reference Pipe reference Example: Detect threats with EQL Kibana Query Language Scripting languages Painless A brief painless walkthrough Use painless scripts in runtime fields Using datetime in Painless How painless dispatches function Painless debugging Painless API examples Using ingest processors in Painless Painless language specification Comments Keywords Literals Identifiers Variables Types Casting Operators Operators: General Operators: Numeric Operators: Boolean Operators: Reference Operators: Array Statements Scripts Functions Lambdas Regexes Painless contexts Context example data Runtime fields context Ingest processor context Update context Update by query context Reindex context Sort context Similarity context Weight context Score context Field context Filter context Minimum should match context Metric aggregation initialization context Metric aggregation map context Metric aggregation combine context Metric aggregation reduce context Bucket script aggregation context Bucket selector aggregation context Analysis Predicate Context Watcher 
condition context Watcher transform context ECS reference Using ECS Getting started Guidelines and best practices Conventions Implementation patterns Mapping network events Design principles Custom fields ECS field reference Base fields Agent fields Autonomous System fields Client fields Cloud fields Cloud fields usage and examples Code Signature fields Container fields Data Stream fields Destination fields Device fields DLL fields DNS fields ECS fields ELF Header fields Email fields Error fields Event fields FaaS fields File fields Geo fields Group fields Hash fields Host fields HTTP fields Interface fields Log fields Mach-O Header fields Network fields Observer fields Orchestrator fields Organization fields Operating System fields Package fields PE Header fields Process fields Registry fields Related fields Risk information fields Rule fields Server fields Service fields Service fields usage and examples Source fields Threat fields Threat fields usage and examples TLS fields Tracing fields URL fields User fields User fields usage and examples User agent fields VLAN fields Volume fields Vulnerability fields x509 Certificate fields ECS categorization fields event.kind event.category event.type event.outcome Using the categorization fields Migrating to ECS Products and solutions that support ECS Map custom data to ECS ECS & OpenTelemetry OTel Alignment Overview Field & Attributes Alignment Additional information Questions and answers Contributing to ECS Generated artifacts Release notes ECS logging libraries ECS Logging .NET Get started .NET model of ECS Usage A note on the Metadata property Extending EcsDocument Formatters Serilog formatter NLog layout log4net Data shippers Elasticsearch security ECS ingest channels Elastic.Serilog.Sinks Elastic.Extensions.Logging BenchmarkDotnet exporter Enrichers APM serilog enricher APM NLog layout ECS Logging Go (Logrus) Get started ECS Logging Go (Zap) Get started ECS Logging Go (Zerolog) Get started ECS Logging Java Get started Structured logging with log4j2 ECS Logging Node.js ECS Logging with Pino ECS Logging with Winston ECS Logging with Morgan ECS Logging PHP Get started ECS Logging Python Installation ECS Logging Ruby Get started Data analysis Supplied configurations Apache anomaly detection configurations APM anomaly detection configurations Auditbeat anomaly detection configurations Logs anomaly detection configurations Metricbeat anomaly detection configurations Metrics anomaly detection configurations Nginx anomaly detection configurations Security anomaly detection configurations Uptime anomaly detection configurations Function reference Count functions Geographic functions Information content functions Metric functions Rare functions Sum functions Time functions Metrics reference Host metrics Container metrics Kubernetes pod metrics AWS metrics Canvas function reference TinyMath functions Text analysis components Analyzer reference Fingerprint Keyword Language Pattern Simple Standard Stop Whitespace Tokenizer reference Character group Classic Edge n-gram Keyword Letter Lowercase N-gram Path hierarchy Pattern Simple pattern Simple pattern split Standard Thai UAX URL email Whitespace Token filter reference Apostrophe ASCII folding CJK bigram CJK width Classic Common grams Conditional Decimal digit Delimited payload Dictionary decompounder Edge n-gram Elision Fingerprint Flatten graph Hunspell Hyphenation decompounder Keep types Keep words Keyword marker Keyword repeat KStem Length Limit token count Lowercase MinHash Multiplexer N-gram 
Normalization Pattern capture Pattern replace Phonetic Porter stem Predicate script Remove duplicates Reverse Shingle Snowball Stemmer Stemmer override Stop Synonym Synonym graph Trim Truncate Unique Uppercase Word delimiter Word delimiter graph Character filter reference HTML strip Mapping Pattern replace Normalizers Aggregations Bucket Adjacency matrix Auto-interval date histogram Categorize text Children Composite Date histogram Date range Diversified sampler Filter Filters Frequent item sets Geo-distance Geohash grid Geohex grid Geotile grid Global Histogram IP prefix IP range Missing Multi Terms Nested Parent Random sampler Range Rare terms Reverse nested Sampler Significant terms Significant text Terms Time series Variable width histogram Subtleties of bucketing range fields Metrics Avg Boxplot Cardinality Extended stats Geo-bounds Geo-centroid Geo-line Cartesian-bounds Cartesian-centroid Matrix stats Max Median absolute deviation Min Percentile ranks Percentiles Rate Scripted metric Stats String stats Sum T-test Top hits Top metrics Value count Weighted avg Pipeline Average bucket Bucket script Bucket count K-S test Bucket correlation Bucket selector Bucket sort Change point Cumulative cardinality Cumulative sum Derivative Extended stats bucket Inference bucket Max bucket Min bucket Moving function Moving percentiles Normalize Percentiles bucket Serial differencing Stats bucket Sum bucket Search UI Ecommerce Autocomplete Product Carousels Category Page Product Detail Page Search Page Tutorials Search UI with Elasticsearch Setup Elasticsearch Setup an Index Install Connector Configure and Run Search UI Using in Production Customise Request Search UI with App Search Search UI with Workplace Search Basic usage Using search-as-you-type Adding search bar to header Debugging Advanced usage Conditional Facets Changing component behavior Analyzing performance Creating Components Building a custom connector NextJS Integration API reference Core API Configuration State Actions React API WithSearch & withSearch useSearch hook React components Results Result ResultsPerPage Facet Sorting Paging PagingInfo ErrorBoundary Connectors API Elasticsearch Connector Site Search Connector Workplace Search Connector Plugins Troubleshooting Cloud Elastic Cloud Enterprise RESTful API API calls How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider Create an API client API examples Setting up your environment A first API call: What deployments are there? 
Create your first deployment: Elasticsearch and Kibana Applying a new plan: Resize and add high availability Updating a deployment: Checking on progress Applying a new deployment configuration: Upgrade Enable more stack features: Add Enterprise Search to a deployment Dipping a toe into platform automation: Generate a roles token Customize your deployment Remove unwanted deployment templates and instance configurations Secure your settings Changes to index allocation and API Scripts elastic-cloud-enterprise.sh install elastic-cloud-enterprise.sh upgrade elastic-cloud-enterprise.sh reset-adminconsole-password elastic-cloud-enterprise.sh add-stack-version Third party dependencies Elastic Cloud Hosted Hardware GCP instance VM configurations Selecting the right configuration for you GCP default provider Regional availability AWS VM configurations Selecting the right configuration for you AWS default Regional availability Azure VM configurations Selecting the right configuration for you Azure default Regional availability Regions Available regions, deployment templates, and instance configurations RESTful API Principles Rate limiting Work with Elastic APIs Access the Elasticsearch API console How to access the API Access the API using Elastic Cloud Control Access the API from the command line Access the API using a REST application Access the API using the Elastic Cloud Terraform provider API examples Deployment CRUD operations Other deployment operations Organization operations Changes to index allocation and API Elastic Cloud on Kubernetes API Reference Third-party dependencies ECK configuration flags Elasticsearch upgrade predicates Elastic cloud control (ECCTL) Installing Configuring Authentication Example: A shared configuration file Environment variables Multiple configuration files Output format Custom formatting Usage examples List deployments Create a deployment Update a deployment Delete a deployment Command reference ecctl ecctl auth ecctl auth key ecctl auth key create ecctl auth key delete ecctl auth key list ecctl auth key show ecctl comment ecctl comment create ecctl comment delete ecctl comment list ecctl comment show ecctl comment update ecctl deployment ecctl deployment create ecctl deployment delete ecctl deployment elasticsearch ecctl deployment elasticsearch keystore ecctl deployment elasticsearch keystore show ecctl deployment elasticsearch keystore update ecctl deployment extension ecctl deployment extension create ecctl deployment extension delete ecctl deployment extension list ecctl deployment extension show ecctl deployment extension update ecctl deployment list ecctl deployment plan ecctl deployment plan cancel ecctl deployment resource ecctl deployment resource delete ecctl deployment resource restore ecctl deployment resource shutdown ecctl deployment resource start-maintenance ecctl deployment resource start ecctl deployment resource stop-maintenance ecctl deployment resource stop ecctl deployment resource upgrade ecctl deployment restore ecctl deployment resync ecctl deployment search ecctl deployment show ecctl deployment shutdown ecctl deployment template ecctl deployment template create ecctl deployment template delete ecctl deployment template list ecctl deployment template show ecctl deployment template update ecctl deployment traffic-filter ecctl deployment traffic-filter association ecctl deployment traffic-filter association create ecctl deployment traffic-filter association delete ecctl deployment traffic-filter create ecctl deployment traffic-filter 
delete ecctl deployment traffic-filter list ecctl deployment traffic-filter show ecctl deployment traffic-filter update ecctl deployment update ecctl generate ecctl generate completions ecctl generate docs ecctl init ecctl platform ecctl platform allocator ecctl platform allocator list ecctl platform allocator maintenance ecctl platform allocator metadata ecctl platform allocator metadata delete ecctl platform allocator metadata set ecctl platform allocator metadata show ecctl platform allocator search ecctl platform allocator show ecctl platform allocator vacate ecctl platform constructor ecctl platform constructor list ecctl platform constructor maintenance ecctl platform constructor resync ecctl platform constructor show ecctl platform enrollment-token ecctl platform enrollment-token create ecctl platform enrollment-token delete ecctl platform enrollment-token list ecctl platform info ecctl platform instance-configuration ecctl platform instance-configuration create ecctl platform instance-configuration delete ecctl platform instance-configuration list ecctl platform instance-configuration pull ecctl platform instance-configuration show ecctl platform instance-configuration update ecctl platform proxy ecctl platform proxy filtered-group ecctl platform proxy filtered-group create ecctl platform proxy filtered-group delete ecctl platform proxy filtered-group list ecctl platform proxy filtered-group show ecctl platform proxy filtered-group update ecctl platform proxy list ecctl platform proxy settings ecctl platform proxy settings show ecctl platform proxy settings update ecctl platform proxy show ecctl platform repository ecctl platform repository create ecctl platform repository delete ecctl platform repository list ecctl platform repository show ecctl platform role ecctl platform role create ecctl platform role delete ecctl platform role list ecctl platform role show ecctl platform role update ecctl platform runner ecctl platform runner list ecctl platform runner resync ecctl platform runner search ecctl platform runner show ecctl stack ecctl stack delete ecctl stack list ecctl stack show ecctl stack upload ecctl user ecctl user create ecctl user delete ecctl user disable ecctl user enable ecctl user key ecctl user key delete ecctl user key list ecctl user key show ecctl user list ecctl user show ecctl user update ecctl version Contributing Release notes Glossary Loading Docs / Reference / Elasticsearch and index management / Command line tools / elasticsearch-reset-password The elasticsearch-reset-password command resets the passwords of users in the native realm and built-in users. Synopsis bin/elasticsearch-reset-password [-a, --auto] [-b, --batch] [-E Configures a standard Elasticsearch or X-Pack setting. -f, --force Forces the command to run against an unhealthy cluster. -h, --help Returns all of the command parameters. -i, --interactive Prompts for the password of the specified user. Use this option to explicitly set a password. -s --silent Shows minimal output in the console. -u, --username The username of the native realm user or built-in user. --url Specifies the base URL (hostname and port of the local node) that the tool uses to submit API requests to Elasticsearch. The default value is determined from the settings in your elasticsearch.yml file. If xpack.security.http.ssl.enabled is set to true , you must specify an HTTPS URL. -v --verbose Shows verbose output in the console. 
Examples The following example resets the password of the elastic user to an auto-generated value and prints the new password in the console: bin/elasticsearch-reset-password -u elastic The following example resets the password of a native user with username user1 after prompting in the terminal for the desired password: bin/elasticsearch-reset-password --username user1 -i The following example resets the password of a native user with username user2 to an auto-generated value prints the new password in the console. The specified URL indicates where the elasticsearch-reset-password tool attempts to reach the local Elasticsearch node: bin/elasticsearch-reset-password --url \"https://172.0.0.3:9200\" --username user2 -i Previous elasticsearch-reconfigure-node Next elasticsearch-saml-metadata Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Synopsis Description Parameters Examples Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"elasticsearch-reset-password | Elastic Documentation","url":"https://www.elastic.co/docs/reference/elasticsearch/command-line-tools/reset-password","meta_description":"The elasticsearch-reset-password command resets the passwords of users in the native realm and built-in users. Use this command to reset the password..."} +{"text":"Docs Release notes Troubleshoot Reference Deploy and manage Get started Solutions and use cases Manage data Explore and analyze Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Deploy Detailed deployment comparison Elastic Cloud Compare Cloud Hosted and Serverless Sign up Subscribe from a marketplace AWS Marketplace Azure Native ISV Service Google Cloud Platform Marketplace Heroku Install the add-on Remove the add-on Access the console Work with Elasticsearch Migrate between plans Hardware Regions Elastic Cloud Serverless Create a serverless project Regions Project settings Elastic Cloud Hosted Create an Elastic Cloud Hosted deployment Available stack versions Access Kibana Plan for production Manage deployments Configure Change hardware profiles Customize deployment components Edit Elastic Stack settings Add plugins and extensions Upload custom plugins and bundles Manage plugins and extensions through the API Custom endpoint aliases Manage your Integrations server Switch from APM to Integrations Server payload Find your Cloud ID vCPU boosting and credits Change hardware Manage deployments using the Elastic Cloud API Keep track of deployment activity Restrictions and known problems Tools and APIs Elastic Cloud Enterprise Service-oriented architecture Deploy an orchestrator Prepare your environment Hardware prerequisites Software prerequisites System configuration Networking prerequisites Users and permissions prerequisites High availability Separation of roles Load balancers JVM heap size Wildcard DNS record Manage your allocators capacity Install ECE Identify the deployment scenario Configure your operating system Ubuntu RHEL SUSE Installation procedures 
Deploy a small installation Deploy a medium installation Deploy a large installation Deploy using Podman Migrate ECE to Podman hosts Migrating to Podman 5 Post-installation steps Install ECE on additional hosts Manage roles tokens Ansible playbook Statistics collected by Elastic Cloud Enterprise Air-gapped install With your private Docker registry Without a private Docker registry Available Docker images Configure ECE Assign roles to hosts System deployments configuration Default system deployment versions Manage deployment templates Tag your allocators Edit instance configurations Create instance configurations Create templates Configure default templates Configure index management Data tiers and autoscaling support Integrations server support Default instance configurations Change the ECE API URL Change endpoint URLs Enable custom endpoint aliases Configure allocator affinity Change allocator disconnect timeout Manage Elastic Stack versions Include additional Kibana plugins Log into the Cloud UI Manage deployments Deployment templates Create a deployment Access Kibana Connect to Elasticsearch Configure Customize deployment components Edit stack settings Elasticsearch user settings Kibana user settings APM user settings Enterprise search user settings Resize deployment Configure plugins and extensions Add custom bundles and plugins Custom endpoint aliases Resource overrides Advanced cluster configuration Search and filter deployments Keep track of deployment activity Manage your Integrations Server Enable Integrations Server through the API Switch from APM to Integrations Server payload Tools and APIs Elastic Cloud on Kubernetes Deploy an orchestrator Install YAML manifests Helm chart Required RBAC permissions Deploy ECK on Openshift Deploy the operator Deploy an Elasticsearch instance with a route Deploy a Kibana instance with a route Deploy Docker images with anyuid SCC Grant privileged permissions to Beats Grant host access permission to Elastic Agent Deploy ECK on GKE Autopilot FIPS compatibility Air-gapped environments Configure Apply configuration settings Configure the validating webhook Restrict cross-namespace resource associations Service meshes Istio Linkerd Webhook namespace selectors Manage deployments Deploy an Elasticsearch cluster Deploy a Kibana instance Elastic Stack Helm chart Applying updates Accessing services Configure deployments Elasticsearch configuration Nodes orchestration Storage recommendations Node configuration Volume claim templates Virtual memory Settings managed by ECK Custom configuration files and plugins Init containers for plugin downloads Update strategy Pod disruption budget Advanced Elasticsearch node scheduling Readiness probe Pod PreStop hook Security context Requests routing to Elasticsearch nodes Kibana configuration Connect to an Elasticsearch cluster Advanced configuration Install Kibana plugins Customize pods Manage compute resources Recipes Connect to external Elastic resources Elastic Stack configuration policies Orchestrate other Elastic applications APM server Use an Elasticsearch cluster managed by ECK Advanced configuration Connect to the APM Server Standalone Elastic Agent Quickstart Configuration Configuration examples Fleet-managed Elastic Agent Quickstart Configuration Configuration Examples Known limitations Elastic Maps Server Deploy Elastic Maps Server Map data Advanced configuration Elastic Maps HTTP configuration Beats Quickstart Configuration Configuration Examples Troubleshooting Logstash Quickstart Configuration Securing 
Logstash API Logstash plugins Configuration examples Update Strategy Advanced configuration Create custom images Tools and APIs Self-managed cluster Deploy an Elasticsearch cluster Local installation (quickstart) Important system configuration Configure system settings Disable swapping Increase the file descriptor limit Increase virtual memory Increase max number of threads DNS cache settings Ensure JNA temporary directory permits executables Decrease the TCP retransmission timeout Bootstrap checks Install on Linux or MacOS Install on Windows Install with Debian package Install with RPM package Install with Docker Single-node cluster Multi-node cluster Production settings Configure Configure Elasticsearch Important settings configuration Add plugins Install Kibana Linux and MacOS Windows Debian RPM Docker Configure Kibana Access Kibana Air gapped install Tools and APIs Distributed architecture Clusters, nodes, and shards Node roles Reading and writing documents Shard allocation, relocation, and recovery Shard allocation awareness Index-level shard allocation Delaying allocation when a node leaves The shard request cache Discovery and cluster formation Discovery Quorum-based decision making Voting configurations Bootstrapping a cluster Cluster state Cluster fault detection Kibana task management Production guidance Run Elasticsearch in production Design for resilience Resilience in small clusters Resilience in larger clusters Resilience in ECH and ECE Scaling considerations Performance optimizations General recommendations Tune for indexing speed Tune for search speed Tune approximate kNN search Tune for disk usage Size your shards Run Kibana in production High availability and load balancing Configure memory Manage background tasks Optimize alerting performance Reporting production considerations Reference architectures Hot/Frozen - High Availability Stack settings Backup, high availability, and resilience tools Snapshot and restore Manage snapshot repositories Self-managed Azure repository Google Cloud Storage repository S3 repository Shared file system repository Read-only URL repository Source-only repository Elastic Cloud Hosted Configure a snapshot repository using AWS S3 Configure a snapshot repository using GCS Configure a snapshot repository using Azure Blob storage Access isolation for the found-snapshots repository Repository isolation on Azure Repository isolation on AWS and GCP Elastic Cloud Enterprise AWS S3 repository Google Cloud Storage (GCS) repository Azure Storage repository Minio on-premise repository Elastic Cloud on Kubernetes Create snapshots Restore a snapshot Restore a snapshot across clusters Restore snapshot into a new deployment Restore snapshot into an existing deployment Restore snapshots containing searchable snapshots indices across clusters Searchable snapshots Cross-cluster replication Set up cross-cluster replication Prerequisites Connect to a remote cluster Configure privileges for cross-cluster replication Create a follower index to replicate a specific index Create an auto-follow pattern to replicate time series indices Manage cross-cluster replication Inspect replication statistics Pause and resume replication Recreate a follower index Terminate replication Manage auto-follow patterns Create auto-follow patterns Retrieve auto-follow patterns Pause and resume auto-follow patterns Delete auto-follow patterns Upgrading clusters Uni-directional index following Bi-directional index following Uni-directional disaster recovery Prerequisites Failover when 
clusterA is down Failback when clusterA comes back Bi-directional disaster recovery Initial setup Failover when clusterA is down Failback when clusterA comes back Perform update or delete by query Autoscaling In ECE and ECH In ECK Autoscaling deciders Trained model autoscaling Security Secure your orchestrator Elastic Cloud Enterprise Manage security certificates Allow x509 Certificates Signed with SHA-1 Configure the TLS version Migrate ECE on Podman hosts to SELinux enforce Elastic Cloud on Kubernetes Secure your cluster or deployment Self-managed security setup Automatic security setup Minimal security setup Set up transport TLS Set up HTTPS Configure security in Kibana Manage TLS encryption Self-managed Update TLS certificates With the same CA With a different CA Mutual authentication Supported SSL/TLS versions by JDK version Enabling cipher suites for stronger encryption ECK Manage HTTP certificates on ECK Manage transport certificates on ECK Traffic filtering IP traffic filtering In ECH or ECE Manage traffic filters through the API In ECK and Self Managed Private link traffic filters AWS PrivateLink traffic filters Azure Private Link traffic filters GCP Private Service Connect traffic filters Claim traffic filter link ID ownership through the API Kubernetes network policies Elastic Cloud Static IPs Kibana session management Encrypt your deployment data Use a customer-managed encryption key Secure your settings Secure settings on ECK Secure Kibana saved objects Security event audit logging Enable audit logging Configure audit logging Elasticsearch audit events ignore policies Elasticsearch logfile output Audit Elasticsearch search queries Correlate audit events FIPS 140-2 compliance Secure other Elastic Stack components Securing HTTP client applications Limitations Users and roles Cloud organization Manage users User roles and privileges Configure SAML SSO Okta Microsoft Entra ID ECE orchestrator Manage system passwords Manage users and roles Native users Active Directory LDAP SAML Configure SSO for deployments Serverless project custom roles Cluster or deployment Quickstart User authentication Authentication realms Realm chains Security domains Internal authentication Native File-based External authentication Active Directory JWT Kerberos LDAP OpenID Connect With Azure, Google, or Okta SAML With Microsoft Entra ID PKI Custom realms Built-in users Change passwords Orchestrator-managed users ECH and ECE ECK managed credentials Kibana authentication Kibana access agreement Anonymous access Token-based authentication services Service accounts Internal users Operator privileges Configure operator privileges Operator-only functionality Operator privileges for snapshot and restore User profiles Looking up users without authentication Controlling the user cache Manage authentication for multiple clusters User roles Built-in roles Defining roles Role structure For data streams and aliases Using Kibana Role restriction Elasticsearch privileges Kibana privileges Map users and groups to roles Role mapping properties Authorization delegation Authorization plugins Control access at the document and field level Submit requests on behalf of other users Spaces API keys Elasticsearch API keys Serverless project API keys Elastic Cloud API keys Elastic Cloud Enterprise API keys Connectors Remote clusters Elastic Cloud Hosted Within the same Elastic Cloud organization With a different Elastic Cloud organization With Elastic Cloud Enterprise With a self-managed cluster With Elastic Cloud on Kubernetes 
Edit or remove a trusted environment Migrate the CCS deployment template Elastic Cloud Enterprise Within the same ECE environment With a different ECE environment With Elastic Cloud With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Self-managed Elastic Stack Add remote clusters using API key authentication Add remote clusters using TLS certificate authentication Migrate from certificate to API key authentication Remote cluster settings Elastic Cloud on Kubernetes Monitoring AutoOps How to access AutoOps AutoOps events Views Overview Deployment Nodes Indices Shards Template Optimizer Notifications Settings Event Settings Dismiss Events AutoOps regions AutoOps FAQ Stack monitoring Enable on ECH and ECE Enable on ECK Self-managed: Elasticsearch Collecting monitoring data with Elastic Agent Collecting monitoring data with Metricbeat Collecting log data with Filebeat Monitoring in a production environment Legacy collection methods Collectors Exporters Local exporters HTTP exporters Pausing data collection Self-managed: Kibana Collect monitoring data with Elastic Agent Collect monitoring data with Metricbeat Legacy collection methods Access monitoring data in Kibana Visualizing monitoring data Beats metrics Elasticsearch metrics Kibana metrics Logstash metrics Troubleshooting Stack monitoring alerts Configuring monitoring data streams and indices Configuring data streams created by Elastic Agent Configuring data streams created by Metricbeat 8 Configuring indices created by Metricbeat 7 or internal collection Cloud deployment health Performance metrics on Elastic Cloud JVM memory pressure indicator Kibana task manager monitoring Monitoring orchestrators ECK operator metrics Enabling the metrics endpoint Securing the metrics endpoint Prometheus requirements ECE platform monitoring Platform monitoring deployment logs and metrics Proxy log fields Set the retention period for logging and metrics indices Logging Elasticsearch log4j configuration Update Elasticsearch logging levels Elasticsearch deprecation logs Kibana logging Set global log levels for Kibana Advanced Kibana logging settings Examples Configure Kibana reporting Manage your Cloud organization Billing Hosted billing dimensions Serverless billing dimensions Elasticsearch Elastic for Observability Elastic for Security Billing models Add your billing details View your billing history Manage your subscription Monitor and analyze usage Elastic Consumption Units Billing FAQ Operational emails Update billing and operational contacts Service status Tools and APIs Licenses and subscriptions Elastic Cloud Enterprise Elastic Cloud on Kubernetes Self-managed cluster Maintenance ECE maintenance Deployments maintenance Pause instance Maintenance activities Enable maintenance mode Scale out your installation Move nodes or instances from allocators Perform ECE hosts maintenance Delete ECE hosts Start and stop services Start and stop Elasticsearch Start and stop Kibana Restart an Elastic Cloud Hosted deployment Restart an ECE deployment Full Cluster restart and rolling restart procedures Start and stop routing requests Add and Remove Elasticsearch nodes Upgrade Plan your upgrade Upgrade your ECE or ECK orchestrator Upgrade Elastic Cloud Enterprise Re-running the ECE upgrade Upgrade Elastic Cloud on Kubernetes Prepare to upgrade Upgrade Assistant Upgrade your deployment or cluster Upgrade on Elastic Cloud Hosted Upgrade on Elastic Cloud Enterprise Upgrade on Elastic Cloud 
on Kubernetes Upgrade Elastic on a self-managed cluster Upgrade Elasticsearch Archived settings Reading indices from older Elasticsearch versions Upgrade Kibana Saved object migrations Roll back to a previous version Upgrade to Enterprise Search Upgrade your ingest components Uninstall Uninstall Elastic Cloud Enterprise Uninstall Elastic Cloud on Kubernetes Delete an orchestrated deployment Loading Docs / Deploy and manage / … / User authentication / Operator privileges / Operator-only functionality ECE ECK Elastic Cloud Hosted Indirect use only This feature is designed for indirect use by Elastic Cloud Hosted, Elastic Cloud Enterprise, and Elastic Cloud on Kubernetes. Direct use is not supported. Operator privileges provide protection for APIs and dynamic cluster settings. Any API or cluster setting that is protected by operator privileges is known as operator-only functionality . When the operator privileges feature is enabled, operator-only APIs can be executed only by operator users. Likewise, operator-only settings can be updated only by operator users. The list of operator-only APIs and dynamic cluster settings are pre-determined in the codebase. The list may evolve in future releases but it is otherwise fixed in a given Elasticsearch version. Operator-only APIs Voting configuration exclusions Delete license Update license Create or update autoscaling policy Delete autoscaling policy Create or update desired nodes Get desired nodes Delete desired nodes Get desired balance Reset desired balance Operator-only dynamic cluster settings All IP filtering settings The following dynamic machine learning settings : xpack.ml.node_concurrent_job_allocations xpack.ml.max_machine_memory_percent xpack.ml.use_auto_machine_memory_percent xpack.ml.max_lazy_ml_nodes xpack.ml.process_connect_timeout xpack.ml.nightly_maintenance_requests_per_second xpack.ml.max_ml_node_size xpack.ml.enable_config_migration xpack.ml.persist_results_max_retries The cluster.routing.allocation.disk.threshold_enabled setting The following recovery settings for managed services : node.bandwidth.recovery.operator.factor node.bandwidth.recovery.operator.factor.read node.bandwidth.recovery.operator.factor.write node.bandwidth.recovery.operator.factor.max_overcommit Previous Configure operator privileges Next Operator privileges for snapshot and restore Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Operator-only APIs Operator-only dynamic cluster settings Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"Operator-only functionality | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/operator-only-functionality","meta_description":"Operator privileges provide protection for APIs and dynamic cluster settings. 
Any API or cluster setting that is protected by operator privileges is known..."} +{"text":"Docs Release notes Troubleshoot Reference Deploy and manage Get started Solutions and use cases Manage data Explore and analyze Manage your Cloud account and preferences Troubleshoot Extend and contribute Release notes Reference Deploy Detailed deployment comparison Elastic Cloud Compare Cloud Hosted and Serverless Sign up Subscribe from a marketplace AWS Marketplace Azure Native ISV Service Google Cloud Platform Marketplace Heroku Install the add-on Remove the add-on Access the console Work with Elasticsearch Migrate between plans Hardware Regions Elastic Cloud Serverless Create a serverless project Regions Project settings Elastic Cloud Hosted Create an Elastic Cloud Hosted deployment Available stack versions Access Kibana Plan for production Manage deployments Configure Change hardware profiles Customize deployment components Edit Elastic Stack settings Add plugins and extensions Upload custom plugins and bundles Manage plugins and extensions through the API Custom endpoint aliases Manage your Integrations server Switch from APM to Integrations Server payload Find your Cloud ID vCPU boosting and credits Change hardware Manage deployments using the Elastic Cloud API Keep track of deployment activity Restrictions and known problems Tools and APIs Elastic Cloud Enterprise Service-oriented architecture Deploy an orchestrator Prepare your environment Hardware prerequisites Software prerequisites System configuration Networking prerequisites Users and permissions prerequisites High availability Separation of roles Load balancers JVM heap size Wildcard DNS record Manage your allocators capacity Install ECE Identify the deployment scenario Configure your operating system Ubuntu RHEL SUSE Installation procedures Deploy a small installation Deploy a medium installation Deploy a large installation Deploy using Podman Migrate ECE to Podman hosts Migrating to Podman 5 Post-installation steps Install ECE on additional hosts Manage roles tokens Ansible playbook Statistics collected by Elastic Cloud Enterprise Air-gapped install With your private Docker registry Without a private Docker registry Available Docker images Configure ECE Assign roles to hosts System deployments configuration Default system deployment versions Manage deployment templates Tag your allocators Edit instance configurations Create instance configurations Create templates Configure default templates Configure index management Data tiers and autoscaling support Integrations server support Default instance configurations Change the ECE API URL Change endpoint URLs Enable custom endpoint aliases Configure allocator affinity Change allocator disconnect timeout Manage Elastic Stack versions Include additional Kibana plugins Log into the Cloud UI Manage deployments Deployment templates Create a deployment Access Kibana Connect to Elasticsearch Configure Customize deployment components Edit stack settings Elasticsearch user settings Kibana user settings APM user settings Enterprise search user settings Resize deployment Configure plugins and extensions Add custom bundles and plugins Custom endpoint aliases Resource overrides Advanced cluster configuration Search and filter deployments Keep track of deployment activity Manage your Integrations Server Enable Integrations Server through the API Switch from APM to Integrations Server payload Tools and APIs Elastic Cloud on Kubernetes Deploy an orchestrator Install YAML manifests Helm chart Required 
RBAC permissions Deploy ECK on Openshift Deploy the operator Deploy an Elasticsearch instance with a route Deploy a Kibana instance with a route Deploy Docker images with anyuid SCC Grant privileged permissions to Beats Grant host access permission to Elastic Agent Deploy ECK on GKE Autopilot FIPS compatibility Air-gapped environments Configure Apply configuration settings Configure the validating webhook Restrict cross-namespace resource associations Service meshes Istio Linkerd Webhook namespace selectors Manage deployments Deploy an Elasticsearch cluster Deploy a Kibana instance Elastic Stack Helm chart Applying updates Accessing services Configure deployments Elasticsearch configuration Nodes orchestration Storage recommendations Node configuration Volume claim templates Virtual memory Settings managed by ECK Custom configuration files and plugins Init containers for plugin downloads Update strategy Pod disruption budget Advanced Elasticsearch node scheduling Readiness probe Pod PreStop hook Security context Requests routing to Elasticsearch nodes Kibana configuration Connect to an Elasticsearch cluster Advanced configuration Install Kibana plugins Customize pods Manage compute resources Recipes Connect to external Elastic resources Elastic Stack configuration policies Orchestrate other Elastic applications APM server Use an Elasticsearch cluster managed by ECK Advanced configuration Connect to the APM Server Standalone Elastic Agent Quickstart Configuration Configuration examples Fleet-managed Elastic Agent Quickstart Configuration Configuration Examples Known limitations Elastic Maps Server Deploy Elastic Maps Server Map data Advanced configuration Elastic Maps HTTP configuration Beats Quickstart Configuration Configuration Examples Troubleshooting Logstash Quickstart Configuration Securing Logstash API Logstash plugins Configuration examples Update Strategy Advanced configuration Create custom images Tools and APIs Self-managed cluster Deploy an Elasticsearch cluster Local installation (quickstart) Important system configuration Configure system settings Disable swapping Increase the file descriptor limit Increase virtual memory Increase max number of threads DNS cache settings Ensure JNA temporary directory permits executables Decrease the TCP retransmission timeout Bootstrap checks Install on Linux or MacOS Install on Windows Install with Debian package Install with RPM package Install with Docker Single-node cluster Multi-node cluster Production settings Configure Configure Elasticsearch Important settings configuration Add plugins Install Kibana Linux and MacOS Windows Debian RPM Docker Configure Kibana Access Kibana Air gapped install Tools and APIs Distributed architecture Clusters, nodes, and shards Node roles Reading and writing documents Shard allocation, relocation, and recovery Shard allocation awareness Index-level shard allocation Delaying allocation when a node leaves The shard request cache Discovery and cluster formation Discovery Quorum-based decision making Voting configurations Bootstrapping a cluster Cluster state Cluster fault detection Kibana task management Production guidance Run Elasticsearch in production Design for resilience Resilience in small clusters Resilience in larger clusters Resilience in ECH and ECE Scaling considerations Performance optimizations General recommendations Tune for indexing speed Tune for search speed Tune approximate kNN search Tune for disk usage Size your shards Run Kibana in production High availability and load balancing 
Configure memory Manage background tasks Optimize alerting performance Reporting production considerations Reference architectures Hot/Frozen - High Availability Stack settings Backup, high availability, and resilience tools Snapshot and restore Manage snapshot repositories Self-managed Azure repository Google Cloud Storage repository S3 repository Shared file system repository Read-only URL repository Source-only repository Elastic Cloud Hosted Configure a snapshot repository using AWS S3 Configure a snapshot repository using GCS Configure a snapshot repository using Azure Blob storage Access isolation for the found-snapshots repository Repository isolation on Azure Repository isolation on AWS and GCP Elastic Cloud Enterprise AWS S3 repository Google Cloud Storage (GCS) repository Azure Storage repository Minio on-premise repository Elastic Cloud on Kubernetes Create snapshots Restore a snapshot Restore a snapshot across clusters Restore snapshot into a new deployment Restore snapshot into an existing deployment Restore snapshots containing searchable snapshots indices across clusters Searchable snapshots Cross-cluster replication Set up cross-cluster replication Prerequisites Connect to a remote cluster Configure privileges for cross-cluster replication Create a follower index to replicate a specific index Create an auto-follow pattern to replicate time series indices Manage cross-cluster replication Inspect replication statistics Pause and resume replication Recreate a follower index Terminate replication Manage auto-follow patterns Create auto-follow patterns Retrieve auto-follow patterns Pause and resume auto-follow patterns Delete auto-follow patterns Upgrading clusters Uni-directional index following Bi-directional index following Uni-directional disaster recovery Prerequisites Failover when clusterA is down Failback when clusterA comes back Bi-directional disaster recovery Initial setup Failover when clusterA is down Failback when clusterA comes back Perform update or delete by query Autoscaling In ECE and ECH In ECK Autoscaling deciders Trained model autoscaling Security Secure your orchestrator Elastic Cloud Enterprise Manage security certificates Allow x509 Certificates Signed with SHA-1 Configure the TLS version Migrate ECE on Podman hosts to SELinux enforce Elastic Cloud on Kubernetes Secure your cluster or deployment Self-managed security setup Automatic security setup Minimal security setup Set up transport TLS Set up HTTPS Configure security in Kibana Manage TLS encryption Self-managed Update TLS certificates With the same CA With a different CA Mutual authentication Supported SSL/TLS versions by JDK version Enabling cipher suites for stronger encryption ECK Manage HTTP certificates on ECK Manage transport certificates on ECK Traffic filtering IP traffic filtering In ECH or ECE Manage traffic filters through the API In ECK and Self Managed Private link traffic filters AWS PrivateLink traffic filters Azure Private Link traffic filters GCP Private Service Connect traffic filters Claim traffic filter link ID ownership through the API Kubernetes network policies Elastic Cloud Static IPs Kibana session management Encrypt your deployment data Use a customer-managed encryption key Secure your settings Secure settings on ECK Secure Kibana saved objects Security event audit logging Enable audit logging Configure audit logging Elasticsearch audit events ignore policies Elasticsearch logfile output Audit Elasticsearch search queries Correlate audit events FIPS 140-2 compliance Secure 
other Elastic Stack components Securing HTTP client applications Limitations Users and roles Cloud organization Manage users User roles and privileges Configure SAML SSO Okta Microsoft Entra ID ECE orchestrator Manage system passwords Manage users and roles Native users Active Directory LDAP SAML Configure SSO for deployments Serverless project custom roles Cluster or deployment Quickstart User authentication Authentication realms Realm chains Security domains Internal authentication Native File-based External authentication Active Directory JWT Kerberos LDAP OpenID Connect With Azure, Google, or Okta SAML With Microsoft Entra ID PKI Custom realms Built-in users Change passwords Orchestrator-managed users ECH and ECE ECK managed credentials Kibana authentication Kibana access agreement Anonymous access Token-based authentication services Service accounts Internal users Operator privileges Configure operator privileges Operator-only functionality Operator privileges for snapshot and restore User profiles Looking up users without authentication Controlling the user cache Manage authentication for multiple clusters User roles Built-in roles Defining roles Role structure For data streams and aliases Using Kibana Role restriction Elasticsearch privileges Kibana privileges Map users and groups to roles Role mapping properties Authorization delegation Authorization plugins Control access at the document and field level Submit requests on behalf of other users Spaces API keys Elasticsearch API keys Serverless project API keys Elastic Cloud API keys Elastic Cloud Enterprise API keys Connectors Remote clusters Elastic Cloud Hosted Within the same Elastic Cloud organization With a different Elastic Cloud organization With Elastic Cloud Enterprise With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Elastic Cloud Enterprise Within the same ECE environment With a different ECE environment With Elastic Cloud With a self-managed cluster With Elastic Cloud on Kubernetes Edit or remove a trusted environment Migrate the CCS deployment template Self-managed Elastic Stack Add remote clusters using API key authentication Add remote clusters using TLS certificate authentication Migrate from certificate to API key authentication Remote cluster settings Elastic Cloud on Kubernetes Monitoring AutoOps How to access AutoOps AutoOps events Views Overview Deployment Nodes Indices Shards Template Optimizer Notifications Settings Event Settings Dismiss Events AutoOps regions AutoOps FAQ Stack monitoring Enable on ECH and ECE Enable on ECK Self-managed: Elasticsearch Collecting monitoring data with Elastic Agent Collecting monitoring data with Metricbeat Collecting log data with Filebeat Monitoring in a production environment Legacy collection methods Collectors Exporters Local exporters HTTP exporters Pausing data collection Self-managed: Kibana Collect monitoring data with Elastic Agent Collect monitoring data with Metricbeat Legacy collection methods Access monitoring data in Kibana Visualizing monitoring data Beats metrics Elasticsearch metrics Kibana metrics Logstash metrics Troubleshooting Stack monitoring alerts Configuring monitoring data streams and indices Configuring data streams created by Elastic Agent Configuring data streams created by Metricbeat 8 Configuring indices created by Metricbeat 7 or internal collection Cloud deployment health Performance metrics on Elastic Cloud JVM memory pressure indicator Kibana task manager 
monitoring Monitoring orchestrators ECK operator metrics Enabling the metrics endpoint Securing the metrics endpoint Prometheus requirements ECE platform monitoring Platform monitoring deployment logs and metrics Proxy log fields Set the retention period for logging and metrics indices Logging Elasticsearch log4j configuration Update Elasticsearch logging levels Elasticsearch deprecation logs Kibana logging Set global log levels for Kibana Advanced Kibana logging settings Examples Configure Kibana reporting Manage your Cloud organization Billing Hosted billing dimensions Serverless billing dimensions Elasticsearch Elastic for Observability Elastic for Security Billing models Add your billing details View your billing history Manage your subscription Monitor and analyze usage Elastic Consumption Units Billing FAQ Operational emails Update billing and operational contacts Service status Tools and APIs Licenses and subscriptions Elastic Cloud Enterprise Elastic Cloud on Kubernetes Self-managed cluster Maintenance ECE maintenance Deployments maintenance Pause instance Maintenance activities Enable maintenance mode Scale out your installation Move nodes or instances from allocators Perform ECE hosts maintenance Delete ECE hosts Start and stop services Start and stop Elasticsearch Start and stop Kibana Restart an Elastic Cloud Hosted deployment Restart an ECE deployment Full Cluster restart and rolling restart procedures Start and stop routing requests Add and Remove Elasticsearch nodes Upgrade Plan your upgrade Upgrade your ECE or ECK orchestrator Upgrade Elastic Cloud Enterprise Re-running the ECE upgrade Upgrade Elastic Cloud on Kubernetes Prepare to upgrade Upgrade Assistant Upgrade your deployment or cluster Upgrade on Elastic Cloud Hosted Upgrade on Elastic Cloud Enterprise Upgrade on Elastic Cloud on Kubernetes Upgrade Elastic on a self-managed cluster Upgrade Elasticsearch Archived settings Reading indices from older Elasticsearch versions Upgrade Kibana Saved object migrations Roll back to a previous version Upgrade to Enterprise Search Upgrade your ingest components Uninstall Uninstall Elastic Cloud Enterprise Uninstall Elastic Cloud on Kubernetes Delete an orchestrated deployment Loading Docs / Deploy and manage / … / Cluster or deployment / User authentication / Controlling the user cache ECE ECK Elastic Cloud Hosted Self Managed User credentials are cached in memory on each node to avoid connecting to a remote authentication service or hitting the disk for every incoming request. You can configure characteristics of the user cache with the cache.ttl , cache.max_users , and cache.hash_algo realm settings. Note JWT realms use jwt.cache.ttl and jwt.cache.size realm settings. Note PKI and JWT realms do not cache user credentials, but do cache the resolved user object to avoid unnecessarily needing to perform role mapping on each request. The cached user credentials are hashed in memory. By default, the Elasticsearch security features use a salted sha-256 hash algorithm. You can use a different hashing algorithm by setting the cache.hash_algo realm settings. See User cache and password hash algorithms . Evicting users from the cache You can use the clear cache API to force the eviction of cached users . 
For example, the following request evicts all users from the ad1 realm: $ curl -XPOST 'http://localhost:9200/_security/realm/ad1/_clear_cache' To clear the cache for multiple realms, specify the realms as a comma-separated list: $ curl -XPOST 'http://localhost:9200/_security/realm/ad1,ad2/_clear_cache' You can also evict specific users: $ curl -XPOST 'http://localhost:9200/_security/realm/ad1/_clear_cache?usernames=rdeniro,alpacino' Previous Looking up users without authentication Next Manage authentication for multiple clusters Current version Current version ✓ Previous version (8.18) Edit this page Report an issue On this page Evicting users from the cache Trademarks Terms of Use Privacy Sitemap © 2025 Elasticsearch B.V. All Rights Reserved. Elasticsearch is a trademark of Elasticsearch B.V., registered in the U.S. and in other countries. Apache, Apache Lucene, Apache Hadoop, Hadoop, HDFS and the yellow elephant logo are trademarks of the Apache Software Foundation in the United States and/or other countries. Welcome to the docs for the latest Elastic product versions , including Elastic Stack 9.0 and Elastic Cloud Serverless. To view previous versions, go to elastic.co/guide .","title":"Controlling the user cache | Elastic Docs","url":"https://www.elastic.co/docs/deploy-manage/users-roles/cluster-or-deployment-auth/controlling-user-cache","meta_description":"User credentials are cached in memory on each node to avoid connecting to a remote authentication service or hitting the disk for every incoming request..."} diff --git a/supporting-blog-content/why-rag-still-matters/why_rag_still_matters.ipynb b/supporting-blog-content/why-rag-still-matters/why_rag_still_matters.ipynb new file mode 100644 index 00000000..49412e24 --- /dev/null +++ b/supporting-blog-content/why-rag-still-matters/why_rag_still_matters.ipynb @@ -0,0 +1,1064 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "l2HOo2IKWnQU" + }, + "source": [ + "\n", + "# Longer ≠ Better: Why RAG Still Matters\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3I1-6I6vW_0N" + }, + "source": [ + "Retrieval-Augmented Generation (RAG) emerged as a solution to early large language models' context window limitations, allowing selective information retrieval when token constraints prevented processing entire datasets. 
Now, as models like Gemini 1.5 have the ability to handle [millions of tokens](https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/), we can test whether RAG is still a necessary tool for providing context in the era of LLMs with million-token context windows.\n", + "\n", + "**Background**\n", + "\n", + "\n", + "* RAG was developed as a workaround for token constraints in LLMs\n", + "* RAG allowed selective information retrieval to avoid context window limitations\n", + "* New models like Gemini 1.5 can handle millions of tokens\n", + "* As token limits increase, the need for selective retrieval diminishes\n", + "* Future applications may process massive datasets without external databases\n", + "* RAG may become obsolete as models handle more information directly\n", + "\n", + "\n", + "Let's test how well models with large context windows perform compared to RAG.\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "SC4W1ixwxBiv" + }, + "source": [ + "## Architecture" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "GNPyUtjKxKwy" + }, + "source": [ + "* **RAG**: We're using Elasticsearch with semantic text search enabled, and the retrieved results are supplied as context to the LLM, in this case Gemini.\n", + "\n", + "* **LLM**: We're providing the context directly to the LLM, in this case Gemini, which supports a context of up to 1M tokens." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2LPQNuVbxy7Z" + }, + "source": [ + "## Methodology" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "tAyhEu8Tx2Rr" + }, + "source": [ + "To compare performance between RAG and full LLM context, we're going to work with a mix of technical articles and documentation. For the full-context approach, the articles and documentation will be provided to the LLM in their entirety.\n", + "\n", + "To determine whether an answer is correct, we're going to ask both systems ***What is the title of the article?***. For this we're going to run 2 sets of tests:\n", + "\n", + "1. Run a **textual** query in order to find an extract of a document and identify where it belongs. Compare RAG and LLM performance\n", + "2. Run a **semantic** query in order to find a semantically equivalent sentence from a document. 
Compare RAG and LLM performance\n", + "\n", + "To compare both technologies we're going to measure:\n", + "- Accuracy\n", + "- Time\n", + "- Cost" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "b-hGvtbVuZL_" + }, + "source": [ + "# Setup" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lqHKpw4DZ_lH" + }, + "source": [ + "We set up the Python libraries we're going to use:\n", + "* **Elasticsearch** - To run queries to Elasticsearch\n", + "* **LangChain** - Interface to the LLM\n", + "\n", + "\n", + "We also collect the API keys needed to start working with both components" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "collapsed": true, + "id": "bKfjs9tadCg_", + "outputId": "7288687e-615f-4116-a4a4-344ee60f1898" + }, + "outputs": [], + "source": [ + "%pip install elasticsearch langchain langchain-core langchain-groq langchain-community matplotlib langchain-google-genai -q" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "jF-sKsNqC7eq" + }, + "source": [ + "### Import libraries, define the Elasticsearch client and LLM, and set API keys" + ] + }, + { + "cell_type": "code", + "execution_count": 71, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "7hDx-LTWa_Vb", + "outputId": "e8bf7847-6553-405d-c835-4345a88444ee" + }, + "outputs": [], + "source": [ + "import os\n", + "import json\n", + "import time\n", + "import pandas as pd\n", + "import matplotlib.pyplot as plt\n", + "from datetime import datetime\n", + "from getpass import getpass\n", + "\n", + "from elasticsearch import Elasticsearch, helpers\n", + "from langchain.callbacks import get_openai_callback\n", + "from langchain_core.prompts import ChatPromptTemplate\n", + "from langchain_core.output_parsers import StrOutputParser\n", + "from langchain_google_genai import ChatGoogleGenerativeAI\n", + "\n", + "\n", + "os.environ[\"GOOGLE_API_KEY\"] = getpass(\"Enter your Google AI API key: \")\n", + "os.environ[\"ES_API_KEY\"] = getpass(\"Elasticsearch API Key: \")\n", + "os.environ[\"ES_URL\"] = getpass(\"Elasticsearch URL: \")\n", + "\n", + "\n", + "index_name = \"technical-articles\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Elasticsearch client" + ] + }, + { + "cell_type": "code", + "execution_count": 46, + "metadata": {}, + "outputs": [], + "source": [ + "es_client = Elasticsearch(\n", + " os.environ[\"ES_URL\"],\n", + " api_key=os.environ[\"ES_API_KEY\"],\n", + " request_timeout=120,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Defining LLM" + ] + }, + { + "cell_type": "code", + "execution_count": 72, + "metadata": {}, + "outputs": [], + "source": [ + "llm = ChatGoogleGenerativeAI(\n", + " model=\"gemini-2.0-flash\",\n", + " temperature=0,\n", + " max_tokens=None,\n", + " timeout=None,\n", + " max_retries=2,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Function to calculate the cost of LLM calls" + ] + }, + { + "cell_type": "code", + "execution_count": 49, + "metadata": {}, + "outputs": [], + "source": [ + "# Function to calculate cost of LLM with input and output cost per million tokens\n", + "def calculate_cost(\n", + " input_price=0.10, output_price=0.40, input_tokens=0, output_tokens=0\n", + "):\n", + " input_total_cost = (input_tokens / 1_000_000) * input_price\n", + " output_total_cost = (output_tokens / 1_000_000) * output_price\n", + "\n", + " return input_total_cost + 
output_total_cost" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X8b-LGV3opm1" + }, + "source": [ + "# 1. Index working files" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Eo2mBCjuw5d9" + }, + "source": [ + "For this test, we're going to index a mix of 303 documents with technical articles and documentation. These documents will be the source of information for both tests." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "UVNJYmYm5Zx1" + }, + "source": [ + "## Create and populate index" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "poauJkegegza" + }, + "source": [ + " To implement RAG, we're including a semantic_text field in the mappings, along with the regular text field, so we can run semantic queries in Elasticsearch.\n", + "\n", + " We're also pushing the documents to the \"technical-articles\" index." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Creating index\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "collapsed": true, + "id": "wh3sIujjUOxM", + "outputId": "5a98d167-f7e5-409a-86db-d218424b6796" + }, + "outputs": [], + "source": [ + "if not es_client.indices.exists(index=index_name):\n", + " # Define a simple mapping for text documents\n", + " mappings = {\n", + " \"mappings\": {\n", + " \"properties\": {\n", + " \"text\": {\"type\": \"text\", \"copy_to\": \"semantic_text\"},\n", + " \"meta_description\": {\"type\": \"keyword\", \"copy_to\": \"semantic_text\"},\n", + " \"title\": {\"type\": \"keyword\", \"copy_to\": \"semantic_text\"},\n", + " \"imported_at\": {\"type\": \"date\"},\n", + " \"url\": {\"type\": \"keyword\"},\n", + " \"semantic_text\": {\n", + " \"type\": \"semantic_text\",\n", + " },\n", + " }\n", + " }\n", + " }\n", + "\n", + " es_client.indices.create(index=index_name, body=mappings)\n", + "\n", + " print(f\"Created index '{index_name}'\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Populating index" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Indexing documents into Elasticsearch using the Bulk API" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "file_path = \"dataset.json\"\n", + "\n", + "actions = []\n", + "\n", + "with open(file_path, \"r\") as f:\n", + " documents = json.load(f)\n", + " for doc in documents:\n", + " document = {\n", + " \"_index\": index_name,\n", + " \"_source\": {\n", + " \"text\": doc[\"text\"],\n", + " \"url\": doc[\"url\"],\n", + " \"title\": doc[\"title\"],\n", + " \"meta_description\": doc[\"meta_description\"],\n", + " \"imported_at\": datetime.now(),\n", + " },\n", + " }\n", + "\n", + " actions.append(document)\n", + "\n", + "\n", + "res = helpers.bulk(es_client, actions)\n", + "\n", + "print(\"documents indexed\", res)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# 2. 
Run Comparisons\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "_Vt6RM-oHX4b" + }, + "source": [ + "## Test 1: Textual Query" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Query to retrieve search results from Elasticsearch" + ] + }, + { + "cell_type": "code", + "execution_count": 75, + "metadata": {}, + "outputs": [], + "source": [ + "query_str = \"\"\"\n", + "Let’s now create a test.js file and install our mock client: Now, add a mock for semantic search: We can now create a test for our code, making sure that the Elasticsearch part will always return the same results: Let’s run the tests.\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cZLCBBJVXd5v" + }, + "source": [ + "We extract a paragraph from the **Elasticsearch in JavaScript the proper way, part II** article and use it as input to retrieve the results from Elasticsearch.\n", + "\n", + "Results will be stored in the hits variable." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "\n", + "### RAG strategy (Textual)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Executing Match Phrase Search\n", + "\n", + "This is the query we're going to use to retrieve the results from Elasticsearch using match phrase search capabilities. We will pass the query_str as input to the match phrase search." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "collapsed": true, + "id": "A6QyaYnfcB7C" + }, + "outputs": [], + "source": [ + "textual_rag_summary = {} # Variable to store results\n", + "\n", + "start_time = time.time()\n", + "\n", + "es_query = {\n", + " \"query\": {\"match_phrase\": {\"text\": {\"query\": query_str}}},\n", + " \"_source\": [\"title\"],\n", + " \"highlight\": {\n", + " \"pre_tags\": [\"\"],\n", + " \"post_tags\": [\"\"],\n", + " \"fields\": {\"title\": {}, \"text\": {}},\n", + " },\n", + " \"size\": 10,\n", + "}\n", + "\n", + "response = es_client.search(index=index_name, body=es_query)\n", + "hits = response[\"hits\"][\"hits\"]\n", + "\n", + "textual_rag_summary[\"time\"] = (\n", + " time.time() - start_time\n", + ") # save time taken to run the query\n", + "textual_rag_summary[\"es_results\"] = hits # save hits\n", + "\n", + "print(\"ELASTICSEARCH RESULTS: \\n\", json.dumps(hits, indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "This template gives the LLM the instructions to answer the question and the context to do so. At the end of the prompt we're asking for the title of the article.\n", + "\n", + "The prompt template will be the same for all tests. 
" ] + }, + { + "cell_type": "code", + "execution_count": 78, + "metadata": {}, + "outputs": [], + "source": [ + "# LLM prompt template\n", + "template = \"\"\"\n", + " Instructions:\n", + "\n", + " - You are an assistant for question-answering tasks.\n", + " - Answer questions truthfully and factually using only the context presented.\n", + " - If you don't know the answer, just say that you don't know, don't make up an answer.\n", + " - Use markdown format for code examples.\n", + " - You are correct, factual, precise, and reliable.\n", + " - Answer\n", + "\n", + " Context:\n", + " {context}\n", + "\n", + " Question:\n", + " {question}.\n", + "\n", + " What is the title article?\n", + "\"\"\"" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p1NXsZqCeP29" + }, + "source": [ + "#### Run results through LLM" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "yJB7t532RUsx" + }, + "source": [ + "Results from Elasticsearch will be provided as context to the LLM for us to get the result we need." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "3tfmCLxNtFmB" + }, + "outputs": [], + "source": [ + "start_time = time.time()\n", + "\n", + "prompt = ChatPromptTemplate.from_template(template)\n", + "\n", + "context = \"\"\n", + "\n", + "for hit in hits:\n", + " # For semantic_text matches, we need to extract the text from the highlighted field\n", + " if \"highlight\" in hit:\n", + " highlighted_texts = []\n", + "\n", + " for values in hit[\"highlight\"].values():\n", + " highlighted_texts.extend(values)\n", + "\n", + " context += f\"{hit['_source']['title']}\\n\"\n", + " context += \"\\n --- \\n\".join(highlighted_texts)\n", + "\n", + "# Use LangChain for the LLM part\n", + "chain = prompt | llm | StrOutputParser()\n", + "\n", + "printable_prompt = prompt.format(context=context, question=query_str)\n", + "print(\"PROMPT WITH CONTEXT AND QUESTION:\\n \", printable_prompt) # Print prompt\n", + "\n", + "with get_openai_callback() as cb:\n", + " response = chain.invoke({\"context\": context, \"question\": query_str})\n", + "\n", + "# Save results\n", + "textual_rag_summary[\"answer\"] = response\n", + "textual_rag_summary[\"total_time\"] = (time.time() - start_time) + textual_rag_summary[\n", + " \"time\"\n", + "] # Sum of time taken to run the search and the LLM\n", + "textual_rag_summary[\"tokens_sent\"] = cb.prompt_tokens\n", + "textual_rag_summary[\"cost\"] = calculate_cost(\n", + " input_tokens=cb.prompt_tokens, output_tokens=cb.completion_tokens\n", + ")\n", + "\n", + "print(\"LLM Response:\\n \", response)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "r4iHxJcqbtMK" + }, + "source": [ + "### LLM strategy (Textual)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Match all query\n", + "\n", + "To provide context to the LLM, we're going to get it from the indexed documents in Elasticsearch. Since the maximum number of tokens is 1 million, all 303 documents fit in a single prompt."
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "hD727_xVTUdi" + }, + "outputs": [], + "source": [ + "textual_llm_summary = {} # Variable to store results\n", + "\n", + "start_time = time.time()\n", + "\n", + "es_query = {\"query\": {\"match_all\": {}}, \"sort\": [{\"title\": \"asc\"}], \"size\": 303}\n", + "\n", + "es_results = es_client.search(index=index_name, body=es_query)\n", + "hits = es_results[\"hits\"][\"hits\"]\n", + "\n", + "# Save results\n", + "textual_llm_summary[\"es_results\"] = hits\n", + "textual_llm_summary[\"time\"] = time.time() - start_time\n", + "\n", + "print(\"ELASTICSEARCH RESULTS: \\n\", json.dumps(hits, indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Run results through LLM\n", + "\n", + "As in the previous step, we're going to provide the context to the LLM and ask for the answer." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "yu-CT2I4b8xl", + "outputId": "9e7731f0-7ff0-4124-f6d5-5cb58eb55d4a" + }, + "outputs": [], + "source": [ + "start_time = time.time()\n", + "\n", + "prompt = ChatPromptTemplate.from_template(template)\n", + "# Use LangChain for the LLM part\n", + "chain = prompt | llm | StrOutputParser()\n", + "\n", + "printable_prompt = prompt.format(context=hits, question=query_str)\n", + "print(\"PROMPT:\\n \", printable_prompt) # Print prompt\n", + "\n", + "with get_openai_callback() as cb:\n", + " response = chain.invoke({\"context\": hits, \"question\": query_str})\n", + "\n", + "# Save results\n", + "textual_llm_summary[\"answer\"] = response\n", + "textual_llm_summary[\"total_time\"] = (time.time() - start_time) + textual_llm_summary[\n", + " \"time\"\n", + "] # Sum of time taken to run the match_all query and the LLM\n", + "textual_llm_summary[\"tokens_sent\"] = cb.prompt_tokens\n", + "textual_llm_summary[\"cost\"] = calculate_cost(\n", + " input_tokens=cb.prompt_tokens, output_tokens=cb.completion_tokens\n", + ")\n", + "\n", + "print(\"LLM Response:\\n \", response) # Print LLM response" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Test 2: Semantic Query" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### RAG strategy (Non-textual)\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 84, + "metadata": {}, + "outputs": [], + "source": [ + "query_str = \"This article explains how to improve code reliability. It includes techniques for error handling, and running applications without managing servers.\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "For the second test, we're going to use a semantic query to retrieve the results from Elasticsearch. For that, we built a short synopsis of the **Elasticsearch in JavaScript the proper way, part II** article as query_str and provided it as input to RAG." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Executing semantic search\n", + "\n", + "This is the query we're going to use to retrieve the results from Elasticsearch using semantic search capabilities. We will pass the query_str as input to the semantic search."
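, + "\n", + "As a reference point, the semantic part on its own looks like the minimal sketch below; the full cell that follows combines it with a lexical query through an RRF retriever:\n", + "\n", + "```python\n", + "# Minimal sketch: a standalone semantic query against the semantic_text field\n", + "es_query = {\n", + "    'query': {\n", + "        'semantic': {\n", + "            'field': 'semantic_text',\n", + "            'query': query_str,\n", + "        }\n", + "    },\n", + "    'size': 10,\n", + "}\n", + "```"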
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "semantic_rag_summary = {} # Variable to store results\n", + "\n", + "start_time = time.time()\n", + "\n", + "es_query = {\n", + " \"retriever\": {\n", + " \"rrf\": {\n", + " \"retrievers\": [\n", + " {\n", + " \"standard\": {\n", + " \"query\": {\n", + " \"bool\": {\n", + " \"should\": [\n", + " {\n", + " \"multi_match\": {\n", + " \"query\": query_str,\n", + " \"fields\": [\"text\", \"title\"],\n", + " }\n", + " },\n", + " {\"match_phrase\": {\"text\": {\"query\": query_str}}},\n", + " ]\n", + " }\n", + " }\n", + " }\n", + " },\n", + " {\n", + " \"standard\": {\n", + " \"query\": {\n", + " \"semantic\": {\n", + " \"field\": \"semantic_text\",\n", + " \"query\": query_str,\n", + " }\n", + " }\n", + " }\n", + " },\n", + " ],\n", + " \"rank_window_size\": 50,\n", + " \"rank_constant\": 20,\n", + " }\n", + " },\n", + " \"_source\": [\"title\"],\n", + " \"highlight\": {\n", + " \"pre_tags\": [\"\"],\n", + " \"post_tags\": [\"\"],\n", + " \"fields\": {\"title\": {}, \"text\": {}},\n", + " },\n", + " \"size\": 10,\n", + "}\n", + "\n", + "\n", + "response = es_client.search(index=index_name, body=es_query)\n", + "hits = response[\"hits\"][\"hits\"]\n", + "\n", + "semantic_rag_summary[\"time\"] = (\n", + " time.time() - start_time\n", + ") # save time taken to run the query\n", + "semantic_rag_summary[\"es_results\"] = hits # save hits\n", + "\n", + "print(\"ELASTICSEARCH RESULTS: \\n\", json.dumps(hits, indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Run results through LLM\n", + "Now results from Elasticsearch will be provided as context to the LLM for us to get the result we need." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "start_time = time.time()\n", + "\n", + "prompt = ChatPromptTemplate.from_template(template)\n", + "\n", + "context = \"\"\n", + "\n", + "for hit in hits:\n", + " # For semantic_text matches, we need to extract the text from the highlighted field\n", + " if \"highlight\" in hit:\n", + " highlighted_texts = []\n", + "\n", + " for values in hit[\"highlight\"].values():\n", + " highlighted_texts.extend(values)\n", + "\n", + " context += f\"{hit['_source']['title']}\\n\"\n", + " context += \"\\n --- \\n\".join(highlighted_texts)\n", + "\n", + "# Use LangChain for the LLM part\n", + "chain = prompt | llm | StrOutputParser()\n", + "\n", + "printable_prompt = prompt.format(context=context, question=query_str)\n", + "print(\"PROMPT:\\n \", printable_prompt) # Print prompt\n", + "\n", + "with get_openai_callback() as cb:\n", + " response = chain.invoke({\"context\": context, \"question\": query_str})\n", + "\n", + "# Save results\n", + "semantic_rag_summary[\"answer\"] = response\n", + "semantic_rag_summary[\"total_time\"] = (time.time() - start_time) + semantic_rag_summary[\n", + " \"time\"\n", + "] # Sum of time taken to run the semantic search and the LLM\n", + "semantic_rag_summary[\"tokens_sent\"] = cb.prompt_tokens\n", + "semantic_rag_summary[\"cost\"] = calculate_cost(\n", + " input_tokens=cb.prompt_tokens, output_tokens=cb.completion_tokens\n", + ")\n", + "\n", + "print(\"LLM Response:\\n \", response)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### LLM strategy (Non-textual)\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Match all query\n", + "\n", + "To provide context to 
the LLM, we're going to get it from the indexed documents in Elasticsearch. Since the maximum number of tokens is 1 million, all 303 documents fit in a single prompt." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "semantic_llm_summary = {} # Variable to store results\n", + "\n", + "start_time = time.time()\n", + "\n", + "es_query = {\"query\": {\"match_all\": {}}, \"sort\": [{\"title\": \"asc\"}], \"size\": 303}\n", + "es_llm_context = es_client.search(index=index_name, body=es_query)\n", + "\n", + "hits = es_llm_context[\"hits\"][\"hits\"]\n", + "\n", + "print(\"ELASTICSEARCH RESULTS: \\n\", json.dumps(hits, indent=4))\n", + "\n", + "# Save results\n", + "semantic_llm_summary[\"es_results\"] = hits\n", + "semantic_llm_summary[\"time\"] = time.time() - start_time" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "#### Run results through LLM\n", + "\n", + "As in the previous step, we're going to provide the context to the LLM and ask for the answer." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "start_time = time.time()\n", + "\n", + "prompt = ChatPromptTemplate.from_template(template)\n", + "# Use LangChain for the LLM part\n", + "chain = prompt | llm | StrOutputParser()\n", + "\n", + "printable_prompt = prompt.format(context=hits, question=query_str)\n", + "print(\"PROMPT:\\n \", printable_prompt) # Print prompt\n", + "\n", + "with get_openai_callback() as cb:\n", + " response = chain.invoke({\"context\": hits, \"question\": query_str})\n", + "\n", + "print(response)\n", + "\n", + "# Save results\n", + "semantic_llm_summary[\"answer\"] = response\n", + "semantic_llm_summary[\"total_time\"] = (time.time() - start_time) + semantic_llm_summary[\n", + " \"time\"\n", + "] # Sum of time taken to run the match_all query and the LLM\n", + "semantic_llm_summary[\"tokens_sent\"] = cb.prompt_tokens\n", + "semantic_llm_summary[\"cost\"] = calculate_cost(\n", + " input_tokens=cb.prompt_tokens, output_tokens=cb.completion_tokens\n", + ")\n", + "\n", + "\n", + "print(\"LLM Response:\\n \", response) # Print LLM response" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2WktohArnaNV" + }, + "source": [ + "# 3. Printing results" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Printing results\n", + "\n", + "Now we're going to print the results of both tests in a dataframe."
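, + "\n", + "Each summary dictionary built above shares the same keys, so the rows follow the shape sketched by the hypothetical `summary_to_row` helper below (it is not used in the next cell, which writes the rows out literally):\n", + "\n", + "```python\n", + "# Hypothetical helper showing the row shape used in the dataframes below\n", + "def summary_to_row(strategy, summary):\n", + "    return {\n", + "        'Strategy': strategy,\n", + "        'Answer': summary['answer'],\n", + "        'Tokens Sent': summary['tokens_sent'],\n", + "        'Time': summary['total_time'],\n", + "        'LLM Cost': summary['cost'],\n", + "    }\n", + "```"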
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 99 + }, + "id": "g8gNaX4bnhFA", + "outputId": "7f984943-b27c-451f-a568-aae142222cb7" + }, + "outputs": [], + "source": [ + "df1 = pd.DataFrame(\n", + " [\n", + " {\n", + " \"Strategy\": \"Textual RAG\",\n", + " \"Answer\": textual_rag_summary[\"answer\"],\n", + " \"Tokens Sent\": textual_rag_summary[\"tokens_sent\"],\n", + " \"Time\": textual_rag_summary[\"total_time\"],\n", + " \"LLM Cost\": textual_rag_summary[\"cost\"],\n", + " },\n", + " {\n", + " \"Strategy\": \"Textual LLM\",\n", + " \"Answer\": textual_llm_summary[\"answer\"],\n", + " \"Tokens Sent\": textual_llm_summary[\"tokens_sent\"],\n", + " \"Time\": textual_llm_summary[\"total_time\"],\n", + " \"LLM Cost\": textual_llm_summary[\"cost\"],\n", + " },\n", + " ]\n", + ")\n", + "\n", + "\n", + "df2 = pd.DataFrame(\n", + " [\n", + " {\n", + " \"Strategy\": \"Semantic RAG\",\n", + " \"Answer\": semantic_rag_summary[\"answer\"],\n", + " \"Tokens Sent\": semantic_rag_summary[\"tokens_sent\"],\n", + " \"Time\": semantic_rag_summary[\"total_time\"],\n", + " \"LLM Cost\": semantic_rag_summary[\"cost\"],\n", + " },\n", + " {\n", + " \"Strategy\": \"Semantic LLM\",\n", + " \"Answer\": semantic_llm_summary[\"answer\"],\n", + " \"Tokens Sent\": semantic_llm_summary[\"tokens_sent\"],\n", + " \"Time\": semantic_llm_summary[\"total_time\"],\n", + " \"LLM Cost\": semantic_llm_summary[\"cost\"],\n", + " },\n", + " ]\n", + ")\n", + "\n", + "print(\"Textual Query DF\")\n", + "display(df1)\n", + "\n", + "print(\"Semantic Query DF\")\n", + "display(df2)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Printing charts\n", + "\n", + "And for better visualization of the results, we're going to print a bar chart with the number of tokens sent and the response time by strategy." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "df_combined = pd.concat([df1, df2])\n", + "\n", + "plt.figure(figsize=(10, 5))\n", + "df_combined.plot(kind=\"bar\", x=\"Strategy\", y=\"Tokens Sent\", legend=False, ax=plt.gca())\n", + "plt.title(\"Tokens Sent by Strategy\")\n", + "plt.yscale(\"log\")\n", + "\n", + "plt.figure(figsize=(10, 5))\n", + "df_combined.plot(kind=\"bar\", x=\"Strategy\", y=\"Time\", legend=False, ax=plt.gca())\n", + "plt.title(\"Response Time by Strategy\")\n", + "plt.ylabel(\"Time (seconds)\")\n", + "\n", + "\n", + "plt.figure(figsize=(10, 5))\n", + "df_combined.plot(kind=\"bar\", x=\"Strategy\", y=\"LLM Cost\", legend=False, ax=plt.gca())\n", + "plt.title(\"Cost by Strategy\")\n", + "plt.yscale(\"log\")\n", + "plt.ylabel(\"Cost\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Clean resources\n", + "\n", + "As an optional step, we're going to delete the index from Elasticsearch." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "es_client.indices.delete(index=index_name, ignore=[400, 404])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2MQBgFjmq5Yc" + }, + "source": [ + "## Comments on Textual query\n", + "\n", + "\n", + "### On RAG\n", + "1. RAG was able to find the correct result\n", + "2. The time to run a full context was similar to LLM with partial context\n", + "\n", + "\n", + "### On LLM\n", + "1. LLM was unable to find the correct result\n", + "2. 
Time to provide a result was much longer than RAG\n", + "3. Pricing is much higher than RAG\n", + "4. If we are using a self-managed LLM, the hardware must be more powerful than with a RAG approach." + ] + } + ], + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.12.2" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}